jaidisido opened a new issue, #41365: URL: https://github.com/apache/arrow/issues/41365
### Describe the bug, including details regarding any error messages, version, and platform.

`pyarrow.fs` incorrectly resolves valid S3 URIs that contain whitespace as local paths:

```python
from pyarrow.fs import _resolve_filesystem_and_path, FileSystem

uri = "s3://bucket/prefix with space/a=a"
resolved_filesystem, resolved_path = _resolve_filesystem_and_path(uri, None)
resolved_filesystem
# <pyarrow._fs.LocalFileSystem at 0x10316ff30>
```

This causes subsequent calls, such as getting the file info, to fail:

```python
path_info = resolved_filesystem.get_file_info(resolved_path)
# pyarrow.lib.ArrowInvalid: Expected a local filesystem path, got a URI...
```

A quick look into the [method](https://github.com/apache/arrow/blob/main/python/pyarrow/fs.py#L165) indicates that a `LocalFileSystem` is chosen by default and returned if no alternative filesystem is detected, which seems like a dubious strategy...

I assume this is [where](https://github.com/apache/arrow/blob/main/python/pyarrow/fs.py#L179) the S3 filesystem should be detected, but a URI containing whitespace seems to throw an exception even though it is valid:

```python
filesystem, path = FileSystem.from_uri(uri)
# Cannot parse URI: 's3://bucket/prefix with space/a=a/'
```

### Component(s)

Python
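
A possible workaround for the `Cannot parse URI` error described above is to percent-encode the path portion of the URI before handing it to pyarrow. The following is a minimal sketch using only the standard library. The helper name `percent_encode_uri_path` is hypothetical and not part of pyarrow, and the assumption that `FileSystem.from_uri` accepts the encoded form and decodes it back to the original key has not been verified here.

```python
from urllib.parse import quote, urlsplit, urlunsplit


def percent_encode_uri_path(uri: str) -> str:
    """Percent-encode unsafe characters (e.g. spaces) in the path of a URI.

    Hypothetical helper for illustration; not part of pyarrow.
    Note: a path that is already percent-encoded would be encoded twice,
    because '%' itself is escaped to '%25'.
    """
    parts = urlsplit(uri)
    # Keep '/' (path separator) and '=' (common in partitioned keys) unescaped.
    encoded_path = quote(parts.path, safe="/=")
    return urlunsplit(parts._replace(path=encoded_path))


uri = "s3://bucket/prefix with space/a=a"
encoded = percent_encode_uri_path(uri)
print(encoded)
```

The encoded string could then be tried with `FileSystem.from_uri(encoded)`; whether the returned path matches the original object key may depend on the pyarrow version in use.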
