muniatl opened a new issue, #10709: URL: https://github.com/apache/iceberg/issues/10709
### Query engine

_No response_

### Question

I have a piece of code that works with an S3 endpoint and a SqlCatalog backed by SQLite. For testing, however, I want to be able to run it against a MinIO deployment hosted on localhost. I have tried various options with no luck. What parameters do I need to pass to SqlCatalog and create_table? My code is shown below, after my notes on what I have tried so far:

- We moved the environment from Singapore to east-us and set up the VPC so that traffic between EC2 and S3 goes over the private network. There was no performance difference, even for a single write (still hovering around 0.8 seconds).
- Tried configuring S3 Express One Zone but couldn't get it to work: PyIceberg uses PyArrow, and PyArrow currently seems to have a compatibility issue. Posted on the Iceberg, PyIceberg, and PyArrow forums.
- Tried a direct large-file write to S3 using boto3: got a response time of about 0.2 seconds (in the range that Yatin heard from AWS folks).
- Tried a direct write of many small files to S3 using boto3: it was in the same range, about 0.2 to 0.3 seconds.
- When accessing S3 from within an EC2 instance, according to Rahul and some external documents there is no need to pass the access key, secret, and session token explicitly, but PyIceberg doesn't seem to pick them up from the environment when I omit them. This could be a config problem.
- Oddly, access errors occur intermittently when the session key was fetched a while ago. The error goes away when I replace it with a new key.
- Need to research IAM settings and PyIceberg a little more, with Rahul's help.

Alternative catalog URI (PostgreSQL):

    postgresql+psycopg2://postgres:ph1@localhost:5433/template1

MinIO environment:

    MINIO_ROOT_USER=minio-user
    MINIO_ROOT_PASSWORD=minio-user
    MINIO_VOLUMES="/mnt/minio"

Code:

    catalog = SqlCatalog(
        "default",
        **{
            "uri": f"sqlite:///{warehouse_path}/pyiceberg_catalog.db",
            #"uri" : f"postgresql+psycopg2://postgres:ph1@localhost:5433/template1",
            "warehouse": "s3://127.0.0.1:9000/iceberg",  # have tried "s3://iceberg" "s3://127.0.0.1/iceberg" and completely commenting out warehouse
            "s3.endpoint" : "s3://127.0.0.1:9000",
            #"minio-root-user": "admin",
            #"minio-root-password": "password",
            #"minio-domain" : "minio",
            #"s3.access-key-id": "admin",
            #"s3.secret-access-key": "password",
        },
    )

    table = catalog.create_table(
        "default1.taxi_dataset",
        schema=df.schema,
    )

Error:

    OSError: When getting information for key 'iceberg/default1.db/taxi_dataset/metadata/00000-671ce9cf-73ff-49a2-a22e-408d8758625b.metadata.json' in bucket '127.0.0.1:9000': AWS Error NETWORK_CONNECTION during HeadObject operation: curlCode: 6, Couldn't resolve host name.

I am able to access the MinIO server, log in, and even upload files. Any pointers on the valid properties to pass for MinIO would be much appreciated.
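One possible reading of the error: the bucket it reports is '127.0.0.1:9000', which is exactly the authority part of the warehouse location `s3://127.0.0.1:9000/iceberg`. That suggests the host and port were taken as the bucket name. The sketch below is a guess at a working property set, not a confirmed fix. It assumes that `s3.endpoint` should be a plain `http://` URL, that the warehouse should name only the bucket, that a bucket called `iceberg` already exists in MinIO, and that the root user and password from the MinIO environment in this post double as the S3 access key and secret. The helper names (`bucket_of`, `minio_catalog_properties`, `open_catalog`) are hypothetical and only for illustration.

```python
from urllib.parse import urlparse


def bucket_of(location: str) -> str:
    """Return the authority part of an s3:// location, i.e. what ends up as the bucket."""
    return urlparse(location).netloc


def minio_catalog_properties(
    warehouse_path: str,
    bucket: str = "iceberg",                  # assumption: bucket already created in MinIO
    endpoint: str = "http://127.0.0.1:9000",  # assumption: MinIO API reachable over plain HTTP
    access_key: str = "minio-user",           # MINIO_ROOT_USER from the post
    secret_key: str = "minio-user",           # MINIO_ROOT_PASSWORD from the post
) -> dict:
    """Build a candidate property dict for SqlCatalog against a local MinIO."""
    scheme = urlparse(endpoint).scheme
    if scheme not in ("http", "https"):
        # An s3:// endpoint is not an address an HTTP client can connect to.
        raise ValueError(f"endpoint should be an http(s) URL, got {endpoint!r}")
    return {
        "uri": f"sqlite:///{warehouse_path}/pyiceberg_catalog.db",
        "warehouse": f"s3://{bucket}",  # bucket name only, no host or port
        "s3.endpoint": endpoint,
        "s3.access-key-id": access_key,
        "s3.secret-access-key": secret_key,
    }


def open_catalog(properties: dict):
    """Create the catalog; needs pyiceberg installed and MinIO running."""
    from pyiceberg.catalog.sql import SqlCatalog  # imported lazily on purpose

    return SqlCatalog("default", **properties)


if __name__ == "__main__":
    # The location from the post puts host:port where the bucket is expected.
    print(bucket_of("s3://127.0.0.1:9000/iceberg"))
    print(minio_catalog_properties("/tmp/warehouse"))
```

With these properties, `catalog = open_catalog(minio_catalog_properties(warehouse_path))` followed by the same `create_table` call would be the thing to try. If it still fails, the bucket name in the new error message should show whether the warehouse location is now being parsed as intended.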