Vitalii0-o commented on issue #606: URL: https://github.com/apache/iceberg-python/issues/606#issuecomment-2058724052
```
Traceback (most recent call last):
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/dlt/destinations/sql_client.py", line 242, in _wrap_gen
    return (yield from f(self, *args, **kwargs))
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/dlt/destinations/impl/athena/athena.py", line 296, in execute_query
    cursor.execute(query_line, db_args)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/pyathena/cursor.py", line 108, in execute
    raise OperationalError(query_execution.state_change_reason)
pyathena.error.OperationalError: GENERIC_INTERNAL_ERROR: io.trino.hdfs.s3.TrinoS3FileSystem$UnrecoverableS3OperationException: com.amazonaws.services.s3.model.AmazonS3Exception: The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: 123; S3 Extended Request ID: 123=; Proxy: null), S3 Extended Request ID: 123= (Bucket: bucket, Key: facebook/123/bronze_facebook_test1/_dlt_pipeline_state/metadata/123.metadata.json)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/dlt/pipeline/pipeline.py", line 699, in sync_destination
    remote_state = self._restore_state_from_destination()
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/dlt/pipeline/pipeline.py", line 1420, in _restore_state_from_destination
    state = load_pipeline_state_from_destination(self.pipeline_name, job_client)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/dlt/pipeline/state_sync.py", line 139, in load_pipeline_state_from_destination
    state = client.get_stored_state(pipeline_name)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/dlt/destinations/job_client_impl.py", line 368, in get_stored_state
    with self.sql_client.execute_query(query, pipeline_name) as cur:
  File "/usr/local/lib/python3.11/contextlib.py", line 137, in __enter__
    return next(self.gen)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/dlt/destinations/sql_client.py", line 244, in _wrap_gen
    raise self._make_database_exception(ex)
dlt.destinations.exceptions.DatabaseTerminalException: GENERIC_INTERNAL_ERROR: io.trino.hdfs.s3.TrinoS3FileSystem$UnrecoverableS3OperationException: com.amazonaws.services.s3.model.AmazonS3Exception: The specified key does not exist. (Service: Amazon S3; Status Code: 404; Error Code: NoSuchKey; Request ID: 123; S3 Extended Request ID: 123=; Proxy: null), S3 Extended Request ID: 123= (Bucket: bucket, Key: facebook/123/bronze_facebook_test1/_dlt_pipeline_state/metadata/123.metadata.json)

The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/models/taskinstance.py", line 433, in _execute_task
    result = execute_callable(context=context, **execute_callable_kwargs)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/operators/python.py", line 199, in execute
    return_value = self.execute_callable()
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/airflow/operators/python.py", line 216, in execute_callable
    return self.python_callable(*self.op_args, **self.op_kwargs)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/dlt/helpers/airflow_helper.py", line 273, in _run
    for attempt in self.retry_policy.copy(
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 347, in __iter__
    do = self.iter(retry_state=retry_state)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/tenacity/__init__.py", line 314, in iter
    return fut.result()
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
  File "/usr/local/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/dlt/helpers/airflow_helper.py", line 283, in _run
    load_info = task_pipeline.run(
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/dlt/pipeline/pipeline.py", line 219, in _wrap
    step_info = f(self, *args, **kwargs)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/dlt/pipeline/pipeline.py", line 264, in _wrap
    return f(self, *args, **kwargs)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/dlt/pipeline/pipeline.py", line 640, in run
    self.sync_destination(destination, staging, dataset_name)
  File "/usr/local/airflow/.local/lib/python3.11/site-packages/dlt/pipeline/pipeline.py", line 173, in _wrap
    rv = f(self, *args, **kwargs)
```

The app is trying to find Bucket: bucket, Key: facebook/123/bronze_facebook_test1/_dlt_pipeline_state/metadata/123.metadata.json, but what I actually have is Bucket: bucket, Key: facebook/123/bronze_facebook_test1/_dlt_pipeline_state/metadata/123.metadata.json.

I am using dlt, which uses PyAthena, which uses PyIceberg. PyIceberg creates the folder in S3 with an extra /. I found exactly the same error in Iceberg: https://github.com/apache/iceberg/issues/4582
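For what it's worth, here is a minimal sketch of how the extra `/` could arise; this is my own reconstruction, not dlt or PyIceberg internals. If a configured base location ends with a trailing slash and path segments are joined naively, the writer produces a key containing `//`. S3 keys are opaque strings, so `a//b` and `a/b` are two different objects, which would explain the NoSuchKey above when a reader normalizes the path:

```python
# Hypothetical illustration of the suspected double-slash bug; the paths
# mirror the ones in the traceback but are placeholders, not real values.
base_location = "s3://bucket/facebook/123/bronze_facebook_test1/_dlt_pipeline_state/"

# Naive join: if the base already ends with "/", the resulting key
# contains "//", while a reader that normalizes paths looks up the
# single-slash variant -> NoSuchKey.
naive_key = base_location + "/" + "metadata/123.metadata.json"
print(naive_key)
# s3://bucket/facebook/123/bronze_facebook_test1/_dlt_pipeline_state//metadata/123.metadata.json

# A slash-normalizing join (hypothetical helper, not a dlt/PyIceberg API)
# yields the same key no matter how the base location is spelled:
def join_location(base: str, *parts: str) -> str:
    return "/".join([base.rstrip("/")] + [p.strip("/") for p in parts])

print(join_location(base_location, "metadata", "123.metadata.json"))
# s3://bucket/facebook/123/bronze_facebook_test1/_dlt_pipeline_state/metadata/123.metadata.json
```

To check which variant actually exists in the bucket, one can list the real keys under the prefix; a sketch using boto3, where `Bucket` and `Prefix` are the placeholder values from the traceback:

```python
import boto3

s3 = boto3.client("s3")
resp = s3.list_objects_v2(
    Bucket="bucket",
    Prefix="facebook/123/bronze_facebook_test1/_dlt_pipeline_state/",
)
for obj in resp.get("Contents", []):
    # A key containing "_dlt_pipeline_state//metadata/" would confirm the extra "/".
    print(obj["Key"])
```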