jiakai-li commented on code in PR #1453:
URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1903135543


##########
pyiceberg/io/pyarrow.py:
##########
@@ -352,7 +352,7 @@ def parse_location(location: str) -> Tuple[str, str, str]:
 
     def _initialize_fs(self, scheme: str, netloc: Optional[str] = None) -> 
FileSystem:
         if scheme in {"s3", "s3a", "s3n", "oss"}:
-            from pyarrow.fs import S3FileSystem
+            from pyarrow.fs import S3FileSystem, resolve_s3_region

Review Comment:
   > I dont think its supported, the underlying call is looking for 
`x-amz-bucket-region` which i dont think aliyun will set
   
   Thank you for checking that, I should have looked at it :-)
   
   > since we're using both scheme and bucket to cache FS, this should be fine 
right? For the case of `oss://bucket` and `s3://bucket`
   
   Yes, there is no issue after the change now. What I was thinking is for the 
`oss://bucket` scenario (ignore the caching behavior). If the bucket used by 
`oss` also exists in AWS, then the previous version (before your comment) would 
try to resolve the bucket and use it to overwrite the defaul setting. This will 
be wrong though, because `oss` bucket region cannot be resolved using pyarrow.
   
   I updated the test case to take this into account and also added an 
integration test for multiple filesystem read.



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to