kevinjqliu commented on code in PR #1453:
URL: https://github.com/apache/iceberg-python/pull/1453#discussion_r1902287419


##########
pyiceberg/io/pyarrow.py:
##########
@@ -352,7 +352,7 @@ def parse_location(location: str) -> Tuple[str, str, str]:
 
     def _initialize_fs(self, scheme: str, netloc: Optional[str] = None) -> 
FileSystem:
         if scheme in {"s3", "s3a", "s3n", "oss"}:
-            from pyarrow.fs import S3FileSystem
+            from pyarrow.fs import S3FileSystem, resolve_s3_region

Review Comment:
   > I didn't find too much information regarding support of oss regions by 
pyarrow.fs.resolve_s3_region. But I tried it on my end and it doens't seem to 
work as it throws me an error complaining the bucket cannot be found.
   
   i dont think its supported, the underlying call is looking for 
`x-amz-bucket-region` which i dont think aliyun will set
   
https://github.com/apache/arrow/blob/48d5151b87f1b8f977344c7ac20cb0810e46f733/cpp/src/arrow/filesystem/s3fs.cc#L660
   
   > This could be a problem though, especially if the same bucket name is used 
by both Aliyun and AWS. In which case the user-provided bucket region for 
Aliyun could be wrongly overwritten (by the resolved AWS one).
   
   since we're using both scheme and bucket to cache FS, this should be fine 
right? For the case of `oss://bucket` and `s3://bucket`.
   
   > I separate the oss path from s3 for now as I'm not sure if we want to 
tackle on the oss now (and I feel we probably want to treat the two protocol 
differently?). I also break the _initialize_fs code chunk into smaller blocks 
to make it a bit easier for future modification.
   
   yea lets just deal with s3 for now. Btw 
[fsspec](https://github.com/apache/iceberg-python/blob/acd6f5a8a19db709e835e2686b87d4db3dca254f/pyiceberg/io/fsspec.py#L129-L242)
 splits construct per fs, i think it looks pretty clean. 



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


---------------------------------------------------------------------
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org

Reply via email to