jiakai-li commented on PR #1453: URL: https://github.com/apache/iceberg-python/pull/1453#issuecomment-2557425441
Thank you @kevinjqliu, I'm just trying to clear my head a little bit.

> I think a potential solution might be to omit the "region" property and allow the S3FileSystem to determine the proper region using resolve_s3_region. This is recommended in the [S3FileSystem docs](https://arrow.apache.org/docs/python/generated/pyarrow.fs.S3FileSystem.html) for region.

Is the change I made in accordance with this option? What I've done, essentially, is use the `netloc` to determine the bucket region. Only when the region cannot be determined for some reason do we fall back to the `properties` configuration.

> Another potential issue is the way we cache fs, it assumes that there's only one fs per scheme. With the region approach above, we break this assumption.

Please correct me if I'm missing something about how the fs cache works, but here is my understanding: we use `lru_cache`, so it should cache one fs per bucket, since each bucket has a different `netloc` and therefore a different cache key. Previously, it looks like we effectively had only one cached fs. That seems related to the `netloc` not being used: it was not connected to the `client_kwargs["region"]` configuration, so even if two cache keys pointed to two fs instances, both instances still used the same region (the one configured in `properties`). I think solving the `netloc` issue will also resolve the cache issue, because the `lru_cache` key is now linked to the region and will return the correct instance.

--
This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment.
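To illustrate the caching argument in the comment above, here is a minimal, offline sketch of an `lru_cache` keyed on both scheme and `netloc`, with a fallback to a configured region. It is not the PR's actual code: `BUCKET_REGIONS`, `PROPERTIES`, `resolve_region`, `FakeS3FileSystem`, `fs_by_scheme`, and `fs_for` are all hypothetical names, and the dictionary lookup merely stands in for a real region lookup such as `pyarrow.fs.resolve_s3_region`.

```python
from functools import lru_cache
from typing import Optional
from urllib.parse import urlparse

# Hypothetical stand-in for a real bucket-region lookup (for example
# pyarrow.fs.resolve_s3_region); a dict keeps this sketch runnable offline.
BUCKET_REGIONS = {"bucket-east": "us-east-1", "bucket-west": "us-west-2"}

# Hypothetical stand-in for the `properties` configuration.
PROPERTIES = {"s3.region": "eu-central-1"}


def resolve_region(bucket: str) -> Optional[str]:
    """Return the bucket's region, or None if it cannot be determined."""
    return BUCKET_REGIONS.get(bucket)


class FakeS3FileSystem:
    """Placeholder for a filesystem object bound to a single region."""

    def __init__(self, region: str) -> None:
        self.region = region


@lru_cache
def fs_by_scheme(scheme: str, netloc: Optional[str]) -> FakeS3FileSystem:
    # netloc is part of the cache key, so each bucket gets its own entry.
    resolved = resolve_region(netloc) if netloc else None
    # Fall back to the configured region only when resolution fails.
    return FakeS3FileSystem(resolved or PROPERTIES["s3.region"])


def fs_for(location: str) -> FakeS3FileSystem:
    """Parse a location such as s3://bucket/key and return its cached fs."""
    uri = urlparse(location)
    return fs_by_scheme(uri.scheme, uri.netloc)
```

In this sketch, two locations in the same bucket share one cached instance, while locations in different buckets get separate instances whose regions match their buckets. A bucket that cannot be resolved receives the configured fallback region.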
To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at: us...@infra.apache.org
For additional commands, e-mail: issues-h...@iceberg.apache.org