arookieds opened a new issue, #515: URL: https://github.com/apache/iceberg-python/issues/515
### Question

**PyIceberg version**: 0.6.0
**Python version**: 3.11.1

Comments:
- Iceberg tables are saved in an AWS Glue catalog
- The catalog, the list of namespaces, and the list of tables are retrievable through the catalog API

Hi,

I am facing issues loading Iceberg tables from AWS Glue.

The code I am using is as follows:
```
from opensea.resources.resources import *
import pyiceberg.catalog

profile_name = "saml2aws_profile_name"
catalog_name = "catalog name"
table_name = "table name"
aws_region = "aws region"

catalog = pyiceberg.catalog.load_catalog(
    catalog_name, **{"type": "glue", "profile_name": profile_name}
)
print(catalog.list_namespaces())
table = catalog.load_table((catalog_name, table_name))
```

The code allows me to:
- list namespaces
- list tables

But **load_table** throws the following error:
```
Traceback (most recent call last):
  File "/path/to/the/project/testing.py", line 15, in <module>
    table = catalog.load_table((catalog_name, table_name))
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/the/project/venv/lib/python3.11/site-packages/pyiceberg/catalog/glue.py", line 473, in load_table
    return self._convert_glue_to_iceberg(self._get_glue_table(database_name=database_name, table_name=table_name))
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/the/project/venv/lib/python3.11/site-packages/pyiceberg/catalog/glue.py", line 296, in _convert_glue_to_iceberg
    metadata = FromInputFile.table_metadata(file)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/path/to/the/project/venv/lib/python3.11/site-packages/pyiceberg/serializers.py", line 112, in table_metadata
    with input_file.open() as input_stream:
         ^^^^^^^^^^^^^^^^^
  File "/path/to/the/project/venv/lib/python3.11/site-packages/pyiceberg/io/pyarrow.py", line 263, in open
    input_file = self._filesystem.open_input_file(self._path)
                 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "pyarrow/_fs.pyx", line 780, in pyarrow._fs.FileSystem.open_input_file
  File "pyarrow/error.pxi", line 154, in pyarrow.lib.pyarrow_internal_check_status
  File "pyarrow/error.pxi", line 91, in pyarrow.lib.check_status
OSError: When reading information for key 'path/to/s3/table/location/metadata/100000-458c8ffc-de06-4eb5-bc4a-b94c3034a548.metadata.json' in bucket 's3_bucket_name': AWS Error UNKNOWN (HTTP status 400) during HeadObject operation: No response body.
```

I have checked that I have the proper access rights, so that is not the issue.
I have tried a few other things, but all were unsuccessful:
- using _load_glue_ instead of _load_catalog_
- providing access_key and secret_key directly in the load_catalog call

The table was created via Trino with the following definition:
```
create table catalog_name.table_name (
    "timestamp" timestamp,
    "type" varchar(20),
    distribution int,
    service int,
    code varchar(20),
    base_id bigint,
    counter_id bigint,
    "category" varchar(50),
    volume double)
with (
    format = 'PARQUET',
    partitioning = ARRAY['day(timestamp)'],
    location = 's3://s3_bucket/path/to/table/folder/'
)
```
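For reference, the second attempt listed earlier (providing credentials directly in the `load_catalog` call) could look roughly like the sketch below. This is not the exact code that was run. The helper names, the `database_name` argument, and the example values are hypothetical, and the property keys (`region_name`, `aws_access_key_id`, `aws_secret_access_key`, `aws_session_token` for the Glue client; `s3.region`, `s3.access-key-id`, `s3.secret-access-key`, `s3.session-token` for the S3 FileIO) are assumptions about PyIceberg 0.6.0 that should be checked against its configuration documentation. Whether this avoids the HTTP 400 is not established here.

```python
from typing import Dict, Optional


def build_glue_properties(
    region: str,
    access_key_id: str,
    secret_access_key: str,
    session_token: Optional[str] = None,
) -> Dict[str, str]:
    """Build catalog properties carrying the same credentials for Glue and S3.

    All property keys are assumed names for PyIceberg 0.6.0 (see lead-in).
    """
    props = {
        "type": "glue",
        # Assumed keys for the boto3 session behind the Glue client.
        "region_name": region,
        "aws_access_key_id": access_key_id,
        "aws_secret_access_key": secret_access_key,
        # Assumed keys for the FileIO that reads the metadata JSON from S3.
        "s3.region": region,
        "s3.access-key-id": access_key_id,
        "s3.secret-access-key": secret_access_key,
    }
    if session_token is not None:
        # Temporary credentials (e.g. from an assumed role) also need a token.
        props["aws_session_token"] = session_token
        props["s3.session-token"] = session_token
    return props


def load_table_with_explicit_credentials(
    catalog_name: str,
    database_name: str,  # hypothetical placeholder for the Glue database
    table_name: str,
    props: Dict[str, str],
):
    """Load a table using the explicit properties (requires network access)."""
    # Imported lazily so the property builder can be used without pyiceberg.
    from pyiceberg.catalog import load_catalog

    catalog = load_catalog(catalog_name, **props)
    return catalog.load_table((database_name, table_name))
```

Passing the region and credentials under both sets of keys is meant to rule out a mismatch between what the Glue client and the S3 reader use, since the traceback shows the failure in the S3 `HeadObject` call rather than in the Glue lookup.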