jiakai-li commented on code in PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1903135543
########## pyiceberg/io/pyarrow.py: ########## @@ -352,7 +352,7 @@ def parse_location(location: str) -> Tuple[str, str, str]: def _initialize_fs(self, scheme: str, netloc: Optional[str] = None) -> FileSystem: if scheme in {"s3", "s3a", "s3n", "oss"}: - from pyarrow.fs import S3FileSystem + from pyarrow.fs import S3FileSystem, resolve_s3_region Review Comment: > I dont think its supported, the underlying call is looking for `x-amz-bucket-region` which i dont think aliyun will set Thank you for checking that, I should have looked at it :-) > since we're using both scheme and bucket to cache FS, this should be fine right? For the case of `oss://bucket` and `s3://bucket` Yes, there is no issue after the change now. What I was thinking is for the `oss://bucket` scenario (ignore the caching behavior). If the bucket used by `oss` also exists in AWS, then the previous version (before your comment) would try to resolve the bucket and use it to overwrite the defaul setting. This will be wrong though, because `oss` bucket region cannot be resolved using pyarrow. I updated the test case to take this into account and also added an integration test for multiple filesystem read. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org