timoha commented on issue #11997: URL: https://github.com/apache/iceberg/issues/11997#issuecomment-2641847579
Running into the same problem when trying to access an Iceberg table in a `us-east-1` bucket from a `us-east-2` account. We are using Glue for our catalog, which also lives in `us-east-2`. The same `The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.` error is thrown on the `pyspark.sql.SparkSession.read.format("iceberg").load(<table name in glue>)` call.

We've tried the following:

1. Updating the Glue catalog table's `metadata_location` to an `https://` URL that includes the region. Got `TABLE_OR_VIEW_NOT_FOUND`.
2. Setting `"spark.hadoop.fs.s3a.endpoint.region"` and `"fs.s3a.endpoint.region"` to `"us-east-1"`. Same error; the option doesn't seem to be taken into account.
3. Setting `"spark.hadoop.fs.s3a.endpoint"` to an `https://` endpoint that includes the region. Still the same error. (A rough sketch of these Spark configs is at the end of this comment.)
4. Setting the `AWS_REGION=us-east-1` env variable. This probably would have worked if our Glue catalog were provisioned in `us-east-1`, which is not an ideal workaround.

From other tooling that we tried:

- ClickHouse works against `https://fgrovep-snowflake-externalvol-us-east-1-prod.s3.us-east-1.amazonaws.com/`, so no problem there.
- AWS Athena resolves the region automatically.
- PyIceberg hits this case as well, but it can be mitigated by setting the `s3.region` and `region_name` properties passed to the `load_file_io()` call (see the second sketch below).
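
For reference, here is a minimal sketch of the Spark session wiring from attempts 2 and 3 above. The catalog name, warehouse path, database, and table names are placeholders, and the `https://s3.us-east-1.amazonaws.com` endpoint is my assumption of a regional S3 endpoint, not an exact copy of what we ran:

```python
# Sketch only: placeholder catalog/warehouse/table names, assumed regional endpoint.
from pyspark.sql import SparkSession

spark = (
    SparkSession.builder.appName("iceberg-cross-region")
    # Iceberg Glue catalog wiring (catalog lives in us-east-2).
    .config("spark.sql.catalog.glue_catalog", "org.apache.iceberg.spark.SparkCatalog")
    .config("spark.sql.catalog.glue_catalog.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
    .config("spark.sql.catalog.glue_catalog.io-impl", "org.apache.iceberg.aws.s3.S3FileIO")
    .config("spark.sql.catalog.glue_catalog.warehouse", "s3://my-bucket/warehouse")  # placeholder
    # Attempt 2: point s3a at the bucket's region (had no visible effect for us).
    .config("spark.hadoop.fs.s3a.endpoint.region", "us-east-1")
    # Attempt 3: set the regional endpoint explicitly (same error).
    .config("spark.hadoop.fs.s3a.endpoint", "https://s3.us-east-1.amazonaws.com")
    .getOrCreate()
)

# This is where the endpoint error is thrown.
df = spark.read.format("iceberg").load("glue_catalog.my_db.my_table")
```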
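
And a sketch of the PyIceberg mitigation from the last bullet, assuming a Glue catalog: passing `s3.region` as a catalog property is how we get the bucket's region forwarded down to the `load_file_io()` call. Catalog name and table identifier are placeholders:

```python
# Sketch only: placeholder catalog name and table identifier.
from pyiceberg.catalog import load_catalog

catalog = load_catalog(
    "glue",
    **{
        "type": "glue",
        # Region the table's data bucket actually lives in, overriding the
        # region PyIceberg would otherwise pick up from the Glue catalog side.
        "s3.region": "us-east-1",
    },
)

# Loads fine once the FileIO is pinned to the bucket's region.
table = catalog.load_table("my_db.my_table")
```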