timoha commented on issue #11997:
URL: https://github.com/apache/iceberg/issues/11997#issuecomment-2641847579

   Running into the same problem when trying to access an Iceberg table whose data lives in a `us-east-1` bucket from a `us-east-2` account. We are using Glue for our catalog, which also lives in `us-east-2`. The same `The bucket you are attempting to access must be addressed using the specified endpoint. Please send all future requests to this endpoint.` error is thrown on the `pyspark.sql.SparkSession.read.format("iceberg").load(<table name in glue>)` call.
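
   For context, a rough sketch of the setup and the failing read (catalog name, warehouse bucket, and table name are placeholders, not our real values):

   ```python
   from pyspark.sql import SparkSession

   # Sketch only: Spark and the Glue catalog run in us-east-2, while the table's
   # data bucket is in us-east-1.
   spark = (
       SparkSession.builder
       .config("spark.sql.catalog.glue", "org.apache.iceberg.spark.SparkCatalog")
       .config("spark.sql.catalog.glue.catalog-impl", "org.apache.iceberg.aws.glue.GlueCatalog")
       .config("spark.sql.catalog.glue.io-impl", "org.apache.iceberg.aws.s3.S3FileIO")
       .config("spark.sql.catalog.glue.warehouse", "s3://my-us-east-1-bucket/warehouse")
       .getOrCreate()
   )

   # Throws: "The bucket you are attempting to access must be addressed using
   # the specified endpoint. Please send all future requests to this endpoint."
   df = spark.read.format("iceberg").load("glue.my_db.my_table")
   ```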
   
   We've tried the following (a rough sketch of attempts 2-4 follows the list):
   1. Updating the Glue catalog table's `metadata_location` to an `https://` URL that includes the region. Got `TABLE_OR_VIEW_NOT_FOUND`.
   2. Setting `"spark.hadoop.fs.s3a.endpoint.region"` and `"fs.s3a.endpoint.region"` to `"us-east-1"`. Same error; the option doesn't seem to be taken into account.
   3. Setting `"spark.hadoop.fs.s3a.endpoint"` to an `https://` endpoint URL that includes the region. Still the same error.
   4. Setting the `AWS_REGION=us-east-1` environment variable. This probably would have worked if we had our Glue catalog provisioned in `us-east-1` (not an ideal workaround).
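
   Roughly how attempts 2-4 looked on our side (the exact endpoint string is just illustrative):

   ```python
   import os

   from pyspark.sql import SparkSession

   builder = SparkSession.builder

   # Attempt 2: region-only override for S3A; the option didn't seem to be picked up.
   builder = builder.config("spark.hadoop.fs.s3a.endpoint.region", "us-east-1")
   builder = builder.config("fs.s3a.endpoint.region", "us-east-1")

   # Attempt 3: full S3 endpoint URL including the region; still the same error.
   builder = builder.config("spark.hadoop.fs.s3a.endpoint", "https://s3.us-east-1.amazonaws.com")

   # Attempt 4: force the default SDK region via the environment; this only seems
   # viable if the Glue catalog itself is provisioned in us-east-1.
   os.environ["AWS_REGION"] = "us-east-1"

   spark = builder.getOrCreate()
   ```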
   
   From other tooling that we tried:
   - clickhouse works against `https://fgrovep-snowflake-externalvol-us-east-1-prod.s3.us-east-1.amazonaws.com/`, so no problem there
   - AWS Athena resolves the region automatically
   - pyiceberg hits this case as well, but it is mitigated by setting the `s3.region` and `region_name` properties on the `load_file_io()` call (see the sketch below)
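
   The pyiceberg mitigation looks roughly like this (the metadata location is a placeholder; `region_name` is simply passed through as an extra property):

   ```python
   from pyiceberg.io import load_file_io

   # Pin the FileIO to the data bucket's region instead of the catalog's region.
   io = load_file_io(
       properties={
           "s3.region": "us-east-1",
           "region_name": "us-east-1",
       },
       location="s3://my-us-east-1-bucket/warehouse/my_db/my_table/metadata/v1.metadata.json",
   )
   ```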
   

