-
Notifications
You must be signed in to change notification settings - Fork 107
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix windows path (#660) #673
Conversation
This PR breaks backwards compatibility for databrickslabs/ucx downstream. See build logs for more details. Running from downstreams #189 |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #673 +/- ##
==========================================
+ Coverage 57.69% 57.70% +0.01%
==========================================
Files 48 48
Lines 33079 33085 +6
==========================================
+ Hits 19084 19091 +7
+ Misses 13995 13994 -1 ☔ View full report in Codecov by Sentry. |
3833e11
to
16dc404
Compare
databricks/sdk/mixins/files.py
Outdated
@@ -267,7 +267,8 @@ def __repr__(self) -> str: | |||
class _Path(ABC): | |||
|
|||
def __init__(self, path: str): | |||
self._path = pathlib.Path(str(path).replace('dbfs:', '').replace('file:', '')) | |||
self._path = pathlib.PurePosixPath(str(path).replace('dbfs:', '').replace('file:', '')) | |||
self._system_path = pathlib.Path(self._path) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It'd be good to add a comment explaining why we need self._path
to be PurePosixPath
and why we need self._system_path
Can you link the github issue that this PR solves in the PR description? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, let's get @mgyucht to take a look before merging
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for your contribution @parthban-db! A couple notes about this PR:
The more I think about this, the more I think we're putting too much implementation detail in one place. I would do the following:
- Change
_Path.__init__
to be an@abstractmethod
and remove its implementation. - Add
__init__
implementations in each of_DbfsPath
,_LocalPath
and_VolumesPath
. In_DbfsPath
and_VolumesPath
, usePurePosixPath
, and in_LocalPath
usePath
.
This way, we don't duplicate thepath
state, and each implementation can choose the specific path type needed for itself.
Second: We should also make sure we add a test case that covers the actual issue reported by the user: when making a directory in DBFS, regardless of the OS, the path separator used in the API call must be /
, not \
.
@edwardfeng-db could you help @parthban-db modify the unit test runners to support testing in Windows as well?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM once you add the volumes test
assert w.dbfs.exists(f'file:{tmp_path}/test_dir') | ||
|
||
|
||
def test_dbfs_exists(config, mocker): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can we do this test for volumes as well?
## Changes Improve Changelog by grouping changes and enforce tag in PRs ## Tests - [X] `make test` run locally - [X] `make fmt` applied - [ ] relevant integration tests applied - [X] Recreate old changelog ``` ## 0.30.0 ### Other Changes * Add Windows WorkFlow ([#692](#692)). * Check trailing slash in host url ([#681](#681)). * Fix auth tests for windows. ([#697](#697)). * Remove duplicate ubuntu tests ([#693](#693)). * Support partners in SDK ([#648](#648)). * fix windows path ([#660](#660)) ([#673](#673)). ### API Changes: * Added [w.serving_endpoints_data_plane](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/serving_endpoints_data_plane.html) workspace-level service. * Added `deploy()` and `start()` methods for [w.apps](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/apps.html) workspace-level service. * Added `batch_get()` method for [w.consumer_listings](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/consumer_listings.html) workspace-level service. * Added `batch_get()` method for [w.consumer_providers](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/consumer_providers.html) workspace-level service. * Added `create_schedule()`, `create_subscription()`, `delete_schedule()`, `delete_subscription()`, `get_schedule()`, `get_subscription()`, `list()`, `list_schedules()`, `list_subscriptions()` and `update_schedule()` methods for [w.lakeview](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/lakeview.html) workspace-level service. * Added `query_next_page()` method for [w.vector_search_indexes](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/vector_search_indexes.html) workspace-level service. * Added `databricks.sdk.service.serving.AppDeploymentMode`, `databricks.sdk.service.serving.ModelDataPlaneInfo` and `databricks.sdk.service.serving.StartAppRequest` dataclasses. * Added `databricks.sdk.service.catalog.CatalogIsolationMode` and `databricks.sdk.service.catalog.ListAccountStorageCredentialsResponse` dataclasses. * Added `databricks.sdk.service.dashboards.CreateScheduleRequest`, `databricks.sdk.service.dashboards.CreateSubscriptionRequest`, `databricks.sdk.service.dashboards.CronSchedule`, `databricks.sdk.service.dashboards.DashboardView`, `databricks.sdk.service.dashboards.DeleteScheduleRequest`, `any`, `databricks.sdk.service.dashboards.DeleteSubscriptionRequest`, `any`, `databricks.sdk.service.dashboards.GetScheduleRequest`, `databricks.sdk.service.dashboards.GetSubscriptionRequest`, `databricks.sdk.service.dashboards.ListDashboardsRequest`, `databricks.sdk.service.dashboards.ListDashboardsResponse`, `databricks.sdk.service.dashboards.ListSchedulesRequest`, `databricks.sdk.service.dashboards.ListSchedulesResponse`, `databricks.sdk.service.dashboards.ListSubscriptionsRequest`, `databricks.sdk.service.dashboards.ListSubscriptionsResponse`, `databricks.sdk.service.dashboards.Schedule`, `databricks.sdk.service.dashboards.SchedulePauseStatus`, `databricks.sdk.service.dashboards.Subscriber`, `databricks.sdk.service.dashboards.Subscription`, `databricks.sdk.service.dashboards.SubscriptionSubscriberDestination`, `databricks.sdk.service.dashboards.SubscriptionSubscriberUser` and `databricks.sdk.service.dashboards.UpdateScheduleRequest` dataclasses. * Added `databricks.sdk.service.jobs.PeriodicTriggerConfiguration` and `databricks.sdk.service.jobs.PeriodicTriggerConfigurationTimeUnit` dataclasses. * Added `databricks.sdk.service.marketplace.BatchGetListingsRequest`, `databricks.sdk.service.marketplace.BatchGetListingsResponse`, `databricks.sdk.service.marketplace.BatchGetProvidersRequest`, `databricks.sdk.service.marketplace.BatchGetProvidersResponse`, `databricks.sdk.service.marketplace.ProviderIconFile`, `databricks.sdk.service.marketplace.ProviderIconType` and `databricks.sdk.service.marketplace.ProviderListingSummaryInfo` dataclasses. * Added `databricks.sdk.service.oauth2.DataPlaneInfo` dataclass. * Added `databricks.sdk.service.vectorsearch.QueryVectorIndexNextPageRequest` dataclass. * Added `isolation_mode` field for `databricks.sdk.service.catalog.ExternalLocationInfo`. * Added `max_results` and `page_token` fields for `databricks.sdk.service.catalog.ListCatalogsRequest`. * Added `next_page_token` field for `databricks.sdk.service.catalog.ListCatalogsResponse`. * Added `table_serving_url` field for `databricks.sdk.service.catalog.OnlineTable`. * Added `isolation_mode` field for `databricks.sdk.service.catalog.StorageCredentialInfo`. * Added `isolation_mode` field for `databricks.sdk.service.catalog.UpdateExternalLocation`. * Added `isolation_mode` field for `databricks.sdk.service.catalog.UpdateStorageCredential`. * Added `termination_category` field for `databricks.sdk.service.jobs.ForEachTaskErrorMessageStats`. * Added `on_streaming_backlog_exceeded` field for `databricks.sdk.service.jobs.JobEmailNotifications`. * Added `environment_key` field for `databricks.sdk.service.jobs.RunTask`. * Added `environments` field for `databricks.sdk.service.jobs.SubmitRun`. * Added `dbt_task` and `environment_key` fields for `databricks.sdk.service.jobs.SubmitTask`. * Added `on_streaming_backlog_exceeded` field for `databricks.sdk.service.jobs.TaskEmailNotifications`. * Added `periodic` field for `databricks.sdk.service.jobs.TriggerSettings`. * Added `on_streaming_backlog_exceeded` field for `databricks.sdk.service.jobs.WebhookNotifications`. * Added `provider_summary` field for `databricks.sdk.service.marketplace.Listing`. * Added `service_principal_id` and `service_principal_name` fields for `databricks.sdk.service.serving.App`. * Added `mode` field for `databricks.sdk.service.serving.AppDeployment`. * Added `mode` field for `databricks.sdk.service.serving.CreateAppDeploymentRequest`. * Added `data_plane_info` field for `databricks.sdk.service.serving.ServingEndpointDetailed`. * Added `query_type` field for `databricks.sdk.service.vectorsearch.QueryVectorIndexRequest`. * Added `next_page_token` field for `databricks.sdk.service.vectorsearch.QueryVectorIndexResponse`. * Changed `list()` method for [a.account_storage_credentials](https://databricks-sdk-py.readthedocs.io/en/latest/account/account_storage_credentials.html) account-level service to return `databricks.sdk.service.catalog.ListAccountStorageCredentialsResponse` dataclass. * Changed `isolation_mode` field for `databricks.sdk.service.catalog.CatalogInfo` to `databricks.sdk.service.catalog.CatalogIsolationMode` dataclass. * Changed `isolation_mode` field for `databricks.sdk.service.catalog.UpdateCatalog` to `databricks.sdk.service.catalog.CatalogIsolationMode` dataclass. * Removed `create_deployment()` method for [w.apps](https://databricks-sdk-py.readthedocs.io/en/latest/workspace/apps.html) workspace-level service. * Removed `condition_task`, `dbt_task`, `notebook_task`, `pipeline_task`, `python_wheel_task`, `run_job_task`, `spark_jar_task`, `spark_python_task`, `spark_submit_task` and `sql_task` fields for `databricks.sdk.service.jobs.SubmitRun`. OpenAPI SHA: 7437dabb9dadee402c1fc060df4c1ce8cc5369f0, Date: 2024-06-24 ```
Changes
Changed
pathlib.Path
with thepathlib.PurePosixPath
in/databricks/sdk/mixins/files.py
which always use linux path separators regardless of the OS that it is running on. Fixes (#660)Tests
make test
run locallymake fmt
applied