kevinjqliu commented on code in PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1902287419
########## pyiceberg/io/pyarrow.py: ########## @@ -352,7 +352,7 @@ def parse_location(location: str) -> Tuple[str, str, str]: def _initialize_fs(self, scheme: str, netloc: Optional[str] = None) -> FileSystem: if scheme in {"s3", "s3a", "s3n", "oss"}: - from pyarrow.fs import S3FileSystem + from pyarrow.fs import S3FileSystem, resolve_s3_region Review Comment: > I didn't find too much information regarding support of oss regions by pyarrow.fs.resolve_s3_region. But I tried it on my end and it doens't seem to work as it throws me an error complaining the bucket cannot be found. i dont think its supported, the underlying call is looking for `x-amz-bucket-region` which i dont think aliyun will set https://github.com/apache/arrow/blob/48d5151b87f1b8f977344c7ac20cb0810e46f733/cpp/src/arrow/filesystem/s3fs.cc#L660 > This could be a problem though, especially if the same bucket name is used by both Aliyun and AWS. In which case the user-provided bucket region for Aliyun could be wrongly overwritten (by the resolved AWS one). since we're using both scheme and bucket to cache FS, this should be fine right? For the case of `oss://bucket` and `s3://bucket`. > I separate the oss path from s3 for now as I'm not sure if we want to tackle on the oss now (and I feel we probably want to treat the two protocol differently?). I also break the _initialize_fs code chunk into smaller blocks to make it a bit easier for future modification. yea lets just deal with s3 for now. Btw [fsspec](https://github.com/apache/iceberg-python/blob/acd6f5a8a19db709e835e2686b87d4db3dca254f/pyiceberg/io/fsspec.py#L129-L242) splits construct per fs, i think it looks pretty clean. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For queries about this service, please contact Infrastructure at: us...@infra.apache.org --------------------------------------------------------------------- To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org For additional commands, e-mail: issues-h...@iceberg.apache.org