gmweaver commented on issue #19: URL: https://github.com/apache/iceberg-python/issues/19#issuecomment-2643719088
I ran into the same issue as @cee-shubham. My initial guess is that PyIceberg is not authenticating to the S3 bucket with the keys configured on the Nessie server and is instead falling back to local S3 credentials. I confirmed that the keys configured on my Nessie server do have access to the bucket by running a similar append operation via Spark.

Example code with output:

```python
>>> import numpy as np
>>> import pyarrow as pa
>>> from pyiceberg.catalog import load_catalog
>>> from pyiceberg.schema import Schema
>>> from pyiceberg.types import IntegerType, NestedField, StringType
>>>
>>> catalog = load_catalog(
...     "nessie",
...     **{
...         "uri": "http://nessie:19120/iceberg/main",
...     },
... )
>>>
>>> print(catalog.list_namespaces())
[('demo',)]
>>> schema = Schema(
...     NestedField(1, "id", IntegerType(), required=True),
...     NestedField(2, "name", StringType(), required=False),
... )
>>>
>>> catalog.create_table("demo.test_pyiceberg_table", schema)
test_pyiceberg_table(
  1: id: required int,
  2: name: optional string
),
partition by: [],
sort order: [],
snapshot: null
>>> table = catalog.load_table("demo.test_pyiceberg_table")
>>> table.scan().to_pandas()
Empty DataFrame
Columns: [id, name]
Index: []
>>> data = pa.Table.from_pydict(
...     {
...         "id": np.array([1, 2, 3], dtype="int32"),
...         "name": ["Alice", "Bob", "Charlie"],
...     },
...     schema=schema.as_arrow(),
... )
>>>
>>> table.append(data)
Traceback (most recent call last):
  File "/Users/garrett.weaver/Library/Caches/pypoetry/virtualenvs/testing-py3.12/lib/python3.12/site-packages/s3fs/core.py", line 114, in _error_wrapper
    return await func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/Users/garrett.weaver/Library/Caches/pypoetry/virtualenvs/testing-py3.12/lib/python3.12/site-packages/aiobotocore/client.py", line 412, in _make_api_call
    raise error_class(parsed_response, operation_name)
botocore.exceptions.ClientError: An error occurred (403) when calling the PutObject operation: Forbidden
```

Similar code on Spark works:

```python
from pyspark.sql import SparkSession

SPARK_PACKAGES = [
    "org.apache.iceberg:iceberg-spark-runtime-3.4_2.12:1.7.1",
    "org.projectnessie.nessie-integrations:nessie-spark-extensions-3.4_2.12:0.99.0",
    "software.amazon.awssdk:bundle:2.20.126",
    "software.amazon.awssdk:url-connection-client:2.20.126",
]

SPARK_SQL_EXTENSIONS = [
    "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
    "org.projectnessie.spark.extensions.NessieSparkSessionExtensions",
]

SPARK_CONFIG = {
    "spark.jars.packages": ",".join(SPARK_PACKAGES),
    "spark.sql.extensions": ",".join(SPARK_SQL_EXTENSIONS),
    "spark.sql.catalog.nessie": "org.apache.iceberg.spark.SparkCatalog",
    "spark.sql.catalog.nessie.uri": "http://nessie:19120/iceberg/main",
    "spark.sql.catalog.nessie.type": "rest",
}

spark = SparkSession.builder.config(map=SPARK_CONFIG).getOrCreate()

>>> spark.sql(
...     """
...     CREATE TABLE IF NOT EXISTS nessie.demo.test_spark_table (
...         id INTEGER,
...         name STRING
...     ) USING iceberg
...     """
... )
DataFrame[]
>>>
>>> spark.sql(
...     """
...     INSERT INTO nessie.demo.test_spark_table VALUES
...     (1, 'Alice'),
...     (2, 'Bob')
...     """
... ).show()
++
||
++
++
>>> spark.read.format("iceberg").load("nessie.demo.test_spark_table").show()
+---+-----+
| id| name|
+---+-----+
|  1|Alice|
|  2|  Bob|
+---+-----+
```
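For anyone hitting the same 403, a minimal sketch of two things worth trying, assuming the diagnosis above is right. Option 1 passes the bucket credentials explicitly through PyIceberg's standard S3 FileIO properties (`s3.endpoint`, `s3.access-key-id`, etc.) so the client stops falling back to whatever s3fs/botocore finds locally; the endpoint and key values below are placeholders, not values from this setup. Option 2 sets the `X-Iceberg-Access-Delegation` header from the Iceberg REST spec to request vended credentials from the catalog; whether the Nessie server honors that depends on how its S3 integration is configured.

```python
from pyiceberg.catalog import load_catalog

# Option 1: supply S3 credentials directly via PyIceberg's FileIO properties,
# bypassing local credential discovery. Endpoint/keys are hypothetical placeholders.
catalog = load_catalog(
    "nessie",
    **{
        "uri": "http://nessie:19120/iceberg/main",
        "s3.endpoint": "http://minio:9000",            # placeholder endpoint
        "s3.access-key-id": "EXAMPLE_ACCESS_KEY",      # placeholder key
        "s3.secret-access-key": "EXAMPLE_SECRET_KEY",  # placeholder secret
        "s3.region": "us-east-1",                      # placeholder region
    },
)

# Option 2: ask the REST catalog to vend per-table credentials. The header is
# defined by the Iceberg REST spec; Nessie must be configured to support it.
catalog = load_catalog(
    "nessie",
    **{
        "uri": "http://nessie:19120/iceberg/main",
        "header.X-Iceberg-Access-Delegation": "vended-credentials",
    },
)
```

If Option 1 succeeds while the bare `load_catalog` call fails, that would confirm the write path is using local credentials rather than anything handed out by the server.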