Description
If I check whether a DBFS folder exists on a Databricks instance with code like client.dbfs.exists('/mnt'), I get the error "InvalidParameterValue: Path must be absolute: \mnt".
The issue happens on Windows only.
Reproduction
On Windows, start IPython and run the following code:

    import logging
    import argparse
    from databricks.sdk import WorkspaceClient
    from databricks.sdk.service.compute import DataSecurityMode, RuntimeEngine, Library
    from datetime import timedelta
    import os

    client = WorkspaceClient()
    client.dbfs.exists('/mnt')

Error message: InvalidParameterValue: Path must be absolute: \mnt
Expected behavior
"true" or "false", depending on whether the path exists on Databricks.
Is it a regression?
I didn't try.
Debug Logs
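    Cell In[11], line 1
    ----> 1 client.dbfs.exists('/mnt')

    File C:\ProgramData\anaconda3\envs\databricks2\lib\site-packages\databricks\sdk\mixins\files.py:572, in DbfsExt.exists(self, path)
        570 """If file exists on DBFS"""
        571 p = self._path(path)
    --> 572 return p.exists()

    File C:\ProgramData\anaconda3\envs\databricks2\lib\site-packages\databricks\sdk\mixins\files.py:490, in _DbfsPath.exists(self)
        488 def exists(self) -> bool:
        489     try:
    --> 490         self._api.get_status(self.as_string)
        491         return True
        492     except NotFound:

    File C:\ProgramData\anaconda3\envs\databricks2\lib\site-packages\databricks\sdk\service\files.py:624, in DbfsAPI.get_status(self, path)
        621 if path is not None: query['path'] = path
        622 headers = {'Accept': 'application/json', }
    --> 624 res = self._api.do('GET', '/api/2.0/dbfs/get-status', query=query, headers=headers)
        625 return FileInfo.from_dict(res)

    File C:\ProgramData\anaconda3\envs\databricks2\lib\site-packages\databricks\sdk\core.py:132, in ApiClient.do(self, method, path, query, headers, body, raw, files, data, response_headers)
        128 headers['User-Agent'] = self._user_agent_base
        129 retryable = retried(timeout=timedelta(seconds=self._retry_timeout_seconds),
        130                     is_retryable=self._is_retryable,
        131                     clock=self._cfg.clock)
    --> 132 response = retryable(self._perform)(method,
        133                                     path,
        134                                     query=query,
        135                                     headers=headers,
        136                                     body=body,
        137                                     raw=raw,
        138                                     files=files,
        139                                     data=data)
        141 resp = dict()
        142 for header in response_headers if response_headers else []:

    File C:\ProgramData\anaconda3\envs\databricks2\lib\site-packages\databricks\sdk\retries.py:54, in retried.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
         50     retry_reason = f'{type(err).__name__} is allowed to retry'
         52 if retry_reason is None:
         53     # raise if exception is not retryable
    ---> 54     raise err
         56 logger.debug(f'Retrying: {retry_reason} (sleeping ~{sleep}s)')
         57 clock.sleep(sleep + random())

    File C:\ProgramData\anaconda3\envs\databricks2\lib\site-packages\databricks\sdk\retries.py:33, in retried.<locals>.decorator.<locals>.wrapper(*args, **kwargs)
         31 while clock.time() < deadline:
         32     try:
    ---> 33         return func(*args, **kwargs)
         34     except Exception as err:
         35         last_err = err

    File C:\ProgramData\anaconda3\envs\databricks2\lib\site-packages\databricks\sdk\core.py:243, in ApiClient._perform(self, method, path, query, headers, body, raw, files, data)
        239 if not response.ok:  # internally calls response.raise_for_status()
        240     # TODO: experiment with traceback pruning for better readability
        241     # See https://stackoverflow.com/a/58821552/277035
        242     payload = response.json()
    --> 243     raise self._make_nicer_error(response=response, **payload) from None
        244 # Private link failures happen via a redirect to the login page. From a requests-perspective, the request
        245 # is successful, but the response is not what we expect. We need to handle this case separately.
        246 if _is_private_link_redirect(response):

    InvalidParameterValue: Path must be absolute: \mnt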
Other Information
OS: Windows
Version: 10 Enterprise
databricks-sdk 0.28.0
databricks-api 0.9.0
Additional context
If I manually add the following lines at the top of the do method in Lib\site-packages\databricks\sdk\core.py, the code returns the correct 'true' value:

    def do(self,
           method: str,
           path: str,
           query: dict = None,
           headers: dict = None,
           body: dict = None,
           raw: bool = False,
           files=None,
           data=None,
           response_headers: List[str] = None) -> Union[dict, BinaryIO]:
        # Added: log the outgoing request parameters for debugging.
        logger.warning(f"AAA path:{path}; method:{method}; query:{query}; headers:{headers}; body:{body}; raw:{raw}; files:{files}; data:{data}; response_headers:{response_headers}")
        # Added: replace Windows separators in the 'path' query parameter.
        # The truthiness check guards against query being None.
        if query and 'path' in query:
            query['path'] = query['path'].replace('\\', '/')
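The same normalization can be tried outside the SDK, without editing files in site-packages. The sketch below is only an illustration of the workaround: `normalize_dbfs_query` is a hypothetical helper name, not part of databricks-sdk, and it assumes the query is a plain dict shaped like the one passed to `do`.

```python
from typing import Optional


def normalize_dbfs_query(query: Optional[dict]) -> Optional[dict]:
    """Hypothetical helper: return a copy of `query` whose 'path' value
    uses forward slashes. Other keys are left untouched."""
    if not query or "path" not in query:
        # Nothing to normalize (covers query=None and queries without 'path').
        return query
    fixed = dict(query)  # shallow copy so the caller's dict is not mutated
    fixed["path"] = str(fixed["path"]).replace("\\", "/")
    return fixed


# The value Windows produced in the report above, and an already-valid one.
print(normalize_dbfs_query({"path": "\\mnt"}))
print(normalize_dbfs_query({"path": "/mnt"}))
```

This only masks the symptom at the request layer; the separator is introduced earlier, when the path string is built.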
## Changes
Replaced `pathlib.Path` with `pathlib.PurePosixPath` in
`/databricks/sdk/mixins/files.py`. `PurePosixPath` always uses POSIX
(forward-slash) path separators, regardless of the OS the code is running
on. Fixes (#660)
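The difference between the two path flavours can be checked on any OS, since the pure path classes do no filesystem access. This is a minimal sketch using only the standard library; `PureWindowsPath` stands in here for what `pathlib.Path` behaves like on Windows, and the variable names are illustrative.

```python
from pathlib import PurePosixPath, PureWindowsPath

dbfs_path = "/mnt"

# Windows flavour: the leading "/" is treated as a root and rendered as "\".
windows_rendering = str(PureWindowsPath(dbfs_path))

# POSIX flavour: separators stay as "/" on every platform.
posix_rendering = str(PurePosixPath(dbfs_path))

# Joining child segments also keeps forward slashes with PurePosixPath.
joined = str(PurePosixPath(dbfs_path) / "data" / "file.csv")

print(repr(windows_rendering))  # '\\mnt'
print(repr(posix_rendering))    # '/mnt'
print(repr(joined))             # '/mnt/data/file.csv'
```

The Windows rendering matches the `\mnt` value in the reported error message.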
## Tests
- [x] `make test` run locally
- [x] `make fmt` applied
- [ ] relevant integration tests applied