marty-sullivan opened a new issue, #7566:
URL: https://github.com/apache/iceberg/issues/7566
### Apache Iceberg version
None
### Query engine
None
### Please describe the bug 🐞
I'm trying out PyIceberg v0.3.0, installed via conda-forge, by running some
queries against an AWS Glue schema. The following runs fine
and returns the expected catalog items from Glue:
```python
from pyiceberg.catalog import load_catalog
catalog = load_catalog("glue")
print(catalog.list_tables('my_database'))
```
However, when I run:
```python
channels = catalog.load_table('my_database.iceberg_test')
```
I get the following error:
```
Traceback (most recent call last):
  File "/mnt/persistent/composite-generators/test/iceberg/intro.py", line 8, in <module>
    channels = catalog.load_table('my_database.iceberg_test')
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/micromamba/lib/python3.11/site-packages/pyiceberg/catalog/glue.py", line 278, in load_table
    return self._convert_glue_to_iceberg(load_table_response.get(PROP_GLUE_TABLE, {}))
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/micromamba/lib/python3.11/site-packages/pyiceberg/io/fsspec.py", line 243, in new_input
    fs = self.get_fs(uri.scheme)
    ^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/micromamba/lib/python3.11/site-packages/pyiceberg/io/fsspec.py", line 280, in _get_fs
    return self._scheme_to_fs[scheme](self.properties)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/micromamba/lib/python3.11/site-packages/pyiceberg/io/fsspec.py", line 104, in _s3
    fs = S3FileSystem(client_kwargs=client_kwargs, config_kwargs=config_kwargs)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/micromamba/lib/python3.11/site-packages/fsspec/spec.py", line 76, in __call__
    obj = super().__call__(*args, **kwargs)
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ec2-user/micromamba/lib/python3.11/site-packages/s3fs/core.py", line 187, in __init__
    self.s3 = self.connect()
    ^^^^^^^^^^^^^^
  File "/home/ec2-user/micromamba/lib/python3.11/site-packages/s3fs/core.py", line 292, in connect
    self.s3 = self.session.create_client('s3', aws_access_key_id=self.key, aws_secret_access_key=self.secret, aws_session_token=self.token, config=conf, use_ssl=ssl,
    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
TypeError: botocore.session.Session.create_client() got multiple values for keyword argument 'aws_access_key_id'
```
It seems odd that I would get a credential error here, given that the
earlier snippet successfully queries my Glue data catalog.
At first, I thought it was an issue with my AWS profile, since it uses
the newer `aws configure sso` profile through AWS Identity Center, which
not every tool supports yet.
However, I also tried explicitly setting `AWS_ACCESS_KEY_ID`,
`AWS_SECRET_ACCESS_KEY`, and `AWS_SESSION_TOKEN` in my environment,
but I get exactly the same error.
I'm not sure what the problem is, but it's confusing that my AWS credential
profiles and environment variables would work for some calls but not others.
To be clear, I have no AWS credential problems anywhere else at the moment,
only in this test with PyIceberg.
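One observation on the error itself: Python raises "got multiple values for keyword argument" when the same keyword reaches a call twice, for example once explicitly and once through `**kwargs` unpacking. That may mean the credential is being passed down two paths rather than being rejected, though I haven't confirmed that this is what happens inside PyIceberg or s3fs. A minimal sketch of that failure mode, using a hypothetical stand-in function (not botocore's real signature) and assuming the unpacked dict also carries `aws_access_key_id`:

```python
def create_client(service_name, aws_access_key_id=None, **kwargs):
    """Hypothetical stand-in for a client factory; not botocore's real API."""
    return service_name, aws_access_key_id


# Assumption for illustration: the extra kwargs already contain the key.
client_kwargs = {"aws_access_key_id": "from-client-kwargs"}

try:
    # The same keyword arrives twice: explicitly and via ** unpacking.
    create_client("s3", aws_access_key_id="explicit-key", **client_kwargs)
except TypeError as exc:
    error_message = str(exc)
    print(error_message)
```

If that is the cause here, it would explain why the error is the same regardless of which credential source I use.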