Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix windows path (#660) #673

Merged
merged 10 commits into from
Jul 3, 2024
Merged

fix windows path (#660) #673

merged 10 commits into from
Jul 3, 2024

Conversation

parthban-db
Copy link
Contributor

@parthban-db parthban-db commented Jun 11, 2024

Changes

Changed pathlib.Path with the pathlib.PurePosixPath in /databricks/sdk/mixins/files.py which always use linux path separators regardless of the OS that it is running on. Fixes (#660)

Tests

  • make test run locally
  • make fmt applied
  • relevant integration tests applied

Copy link

github-actions bot commented Jun 11, 2024

This PR breaks backwards compatibility for databrickslabs/ucx downstream. See build logs for more details.

Running from downstreams #189

@codecov-commenter
Copy link

codecov-commenter commented Jun 12, 2024

Codecov Report

Attention: Patch coverage is 80.00000% with 2 lines in your changes missing coverage. Please review.

Project coverage is 57.70%. Comparing base (455a14c) to head (bc276ae).
Report is 1 commits behind head on main.

Files Patch % Lines
databricks/sdk/mixins/files.py 80.00% 2 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main     #673      +/-   ##
==========================================
+ Coverage   57.69%   57.70%   +0.01%     
==========================================
  Files          48       48              
  Lines       33079    33085       +6     
==========================================
+ Hits        19084    19091       +7     
+ Misses      13995    13994       -1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@parthban-db parthban-db changed the title fix windows path (#661) fix windows path (#660) Jun 12, 2024
@parthban-db parthban-db force-pushed the fix-windows-path branch 2 times, most recently from 3833e11 to 16dc404 Compare June 12, 2024 12:57
@@ -267,7 +267,8 @@ def __repr__(self) -> str:
class _Path(ABC):

def __init__(self, path: str):
self._path = pathlib.Path(str(path).replace('dbfs:', '').replace('file:', ''))
self._path = pathlib.PurePosixPath(str(path).replace('dbfs:', '').replace('file:', ''))
self._system_path = pathlib.Path(self._path)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It'd be good to add a comment explaining why we need self._path to be PurePosixPath and why we need self._system_path

@edwardfeng-db
Copy link
Contributor

Can you link the github issue that this PR solves in the PR description?

Copy link
Contributor

@edwardfeng-db edwardfeng-db left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, let's get @mgyucht to take a look before merging

Copy link
Contributor

@mgyucht mgyucht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution @parthban-db! A couple notes about this PR:

The more I think about this, the more I think we're putting too much implementation detail in one place. I would do the following:

  1. Change _Path.__init__ to be an @abstractmethod and remove its implementation.
  2. Add __init__ implementations in each of _DbfsPath, _LocalPath and _VolumesPath. In _DbfsPath and _VolumesPath, use PurePosixPath, and in _LocalPath use Path.
    This way, we don't duplicate the path state, and each implementation can choose the specific path type needed for itself.

Second: We should also make sure we add a test case that covers the actual issue reported by the user: when making a directory in DBFS, regardless of the OS, the path separator used in the API call must be /, not \.

@edwardfeng-db could you help @parthban-db modify the unit test runners to support testing in Windows as well?

Copy link
Contributor

@mgyucht mgyucht left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM once you add the volumes test

assert w.dbfs.exists(f'file:{tmp_path}/test_dir')


def test_dbfs_exists(config, mocker):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we do this test for volumes as well?

@parthban-db parthban-db requested a review from mgyucht July 2, 2024 15:16
@parthban-db parthban-db added this pull request to the merge queue Jul 3, 2024
Merged via the queue into main with commit f0ac1da Jul 3, 2024
9 checks passed
@parthban-db parthban-db deleted the fix-windows-path branch July 3, 2024 08:31
github-merge-queue bot pushed a commit that referenced this pull request Jul 9, 2024
## Changes
Improve Changelog by grouping changes and enforce tag in PRs

## Tests
- [X] `make test` run locally
- [X] `make fmt` applied
- [ ] relevant integration tests applied
- [X] Recreate old changelog

```

## 0.30.0

### Other Changes

 * Add Windows WorkFlow ([#692](#692)).
 * Check trailing slash in host url ([#681](#681)).
 * Fix auth tests for windows. ([#697](#697)).
 * Remove duplicate ubuntu tests ([#693](#693)).
 * Support partners in SDK ([#648](#648)).
 * fix windows path ([#660](#660)) ([#673](#673)).


### API Changes:

 * Added [w.serving_endpoints_data_plane](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/serving_endpoints_data_plane.html) workspace-level service.
 * Added `deploy()` and `start()` methods for [w.apps](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/apps.html) workspace-level service.
 * Added `batch_get()` method for [w.consumer_listings](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/consumer_listings.html) workspace-level service.
 * Added `batch_get()` method for [w.consumer_providers](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/consumer_providers.html) workspace-level service.
 * Added `create_schedule()`, `create_subscription()`, `delete_schedule()`, `delete_subscription()`, `get_schedule()`, `get_subscription()`, `list()`, `list_schedules()`, `list_subscriptions()` and `update_schedule()` methods for [w.lakeview](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/lakeview.html) workspace-level service.
 * Added `query_next_page()` method for [w.vector_search_indexes](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/vector_search_indexes.html) workspace-level service.
 * Added `databricks.sdk.service.serving.AppDeploymentMode`, `databricks.sdk.service.serving.ModelDataPlaneInfo` and `databricks.sdk.service.serving.StartAppRequest` dataclasses.
 * Added `databricks.sdk.service.catalog.CatalogIsolationMode` and `databricks.sdk.service.catalog.ListAccountStorageCredentialsResponse` dataclasses.
 * Added `databricks.sdk.service.dashboards.CreateScheduleRequest`, `databricks.sdk.service.dashboards.CreateSubscriptionRequest`, `databricks.sdk.service.dashboards.CronSchedule`, `databricks.sdk.service.dashboards.DashboardView`, `databricks.sdk.service.dashboards.DeleteScheduleRequest`, `any`, `databricks.sdk.service.dashboards.DeleteSubscriptionRequest`, `any`, `databricks.sdk.service.dashboards.GetScheduleRequest`, `databricks.sdk.service.dashboards.GetSubscriptionRequest`, `databricks.sdk.service.dashboards.ListDashboardsRequest`, `databricks.sdk.service.dashboards.ListDashboardsResponse`, `databricks.sdk.service.dashboards.ListSchedulesRequest`, `databricks.sdk.service.dashboards.ListSchedulesResponse`, `databricks.sdk.service.dashboards.ListSubscriptionsRequest`, `databricks.sdk.service.dashboards.ListSubscriptionsResponse`, `databricks.sdk.service.dashboards.Schedule`, `databricks.sdk.service.dashboards.SchedulePauseStatus`, `databricks.sdk.service.dashboards.Subscriber`, `databricks.sdk.service.dashboards.Subscription`, `databricks.sdk.service.dashboards.SubscriptionSubscriberDestination`, `databricks.sdk.service.dashboards.SubscriptionSubscriberUser` and `databricks.sdk.service.dashboards.UpdateScheduleRequest` dataclasses.
 * Added `databricks.sdk.service.jobs.PeriodicTriggerConfiguration` and `databricks.sdk.service.jobs.PeriodicTriggerConfigurationTimeUnit` dataclasses.
 * Added `databricks.sdk.service.marketplace.BatchGetListingsRequest`, `databricks.sdk.service.marketplace.BatchGetListingsResponse`, `databricks.sdk.service.marketplace.BatchGetProvidersRequest`, `databricks.sdk.service.marketplace.BatchGetProvidersResponse`, `databricks.sdk.service.marketplace.ProviderIconFile`, `databricks.sdk.service.marketplace.ProviderIconType` and `databricks.sdk.service.marketplace.ProviderListingSummaryInfo` dataclasses.
 * Added `databricks.sdk.service.oauth2.DataPlaneInfo` dataclass.
 * Added `databricks.sdk.service.vectorsearch.QueryVectorIndexNextPageRequest` dataclass.
 * Added `isolation_mode` field for `databricks.sdk.service.catalog.ExternalLocationInfo`.
 * Added `max_results` and `page_token` fields for `databricks.sdk.service.catalog.ListCatalogsRequest`.
 * Added `next_page_token` field for `databricks.sdk.service.catalog.ListCatalogsResponse`.
 * Added `table_serving_url` field for `databricks.sdk.service.catalog.OnlineTable`.
 * Added `isolation_mode` field for `databricks.sdk.service.catalog.StorageCredentialInfo`.
 * Added `isolation_mode` field for `databricks.sdk.service.catalog.UpdateExternalLocation`.
 * Added `isolation_mode` field for `databricks.sdk.service.catalog.UpdateStorageCredential`.
 * Added `termination_category` field for `databricks.sdk.service.jobs.ForEachTaskErrorMessageStats`.
 * Added `on_streaming_backlog_exceeded` field for `databricks.sdk.service.jobs.JobEmailNotifications`.
 * Added `environment_key` field for `databricks.sdk.service.jobs.RunTask`.
 * Added `environments` field for `databricks.sdk.service.jobs.SubmitRun`.
 * Added `dbt_task` and `environment_key` fields for `databricks.sdk.service.jobs.SubmitTask`.
 * Added `on_streaming_backlog_exceeded` field for `databricks.sdk.service.jobs.TaskEmailNotifications`.
 * Added `periodic` field for `databricks.sdk.service.jobs.TriggerSettings`.
 * Added `on_streaming_backlog_exceeded` field for `databricks.sdk.service.jobs.WebhookNotifications`.
 * Added `provider_summary` field for `databricks.sdk.service.marketplace.Listing`.
 * Added `service_principal_id` and `service_principal_name` fields for `databricks.sdk.service.serving.App`.
 * Added `mode` field for `databricks.sdk.service.serving.AppDeployment`.
 * Added `mode` field for `databricks.sdk.service.serving.CreateAppDeploymentRequest`.
 * Added `data_plane_info` field for `databricks.sdk.service.serving.ServingEndpointDetailed`.
 * Added `query_type` field for `databricks.sdk.service.vectorsearch.QueryVectorIndexRequest`.
 * Added `next_page_token` field for `databricks.sdk.service.vectorsearch.QueryVectorIndexResponse`.
 * Changed `list()` method for [a.account_storage_credentials](https://databricks-sdk-py.readthedocs.io/en/latest/account/account_storage_credentials.html) account-level service to return `databricks.sdk.service.catalog.ListAccountStorageCredentialsResponse` dataclass.
 * Changed `isolation_mode` field for `databricks.sdk.service.catalog.CatalogInfo` to `databricks.sdk.service.catalog.CatalogIsolationMode` dataclass.
 * Changed `isolation_mode` field for `databricks.sdk.service.catalog.UpdateCatalog` to `databricks.sdk.service.catalog.CatalogIsolationMode` dataclass.
 * Removed `create_deployment()` method for [w.apps](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/apps.html) workspace-level service.
 * Removed `condition_task`, `dbt_task`, `notebook_task`, `pipeline_task`, `python_wheel_task`, `run_job_task`, `spark_jar_task`, `spark_python_task`, `spark_submit_task` and `sql_task` fields for `databricks.sdk.service.jobs.SubmitRun`.

OpenAPI SHA: 7437dabb9dadee402c1fc060df4c1ce8cc5369f0, Date: 2024-06-24
```
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants