abmo-x opened a new pull request, #9998:
URL: https://github.com/apache/iceberg/pull/9998

   This is a fix for https://github.com/apache/iceberg/issues/9997
   
   ### Root cause
   
   The S3A `putObject` call was interrupted by a Flink pipeline failure. 
Because S3A swallows the interrupt instead of throwing an exception, the 
metadata writer assumes the write succeeded, which leaves the table pointing 
to a metadata JSON file that doesn't exist.
   
   #### Hadoop S3A code
   
https://github.com/apache/hadoop/blob/0f51d2a4ec17bad754beb17048409811a151be53/hadoop-tools/hadoop-aws/src/main/java/org/apache/hadoop/fs/s3a/S3ABlockOutputStream.java#L636
   
   ```java
   try {
     putObjectResult.get();
     return size;
   } catch (InterruptedException ie) {
     LOG.warn("Interrupted object upload", ie);
     Thread.currentThread().interrupt();
     return 0;
   } catch (ExecutionException ee) {
     throw extractException("regular upload", key, ee);
   }
   ```
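
The failure mode in the S3A snippet above can be reproduced in isolation. The following is a minimal sketch, not the change made in this PR: `InterruptedUploadSketch`, `putSwallowingInterrupt`, and `writeAndVerify` are hypothetical names, and the guard shown is only one possible way for a caller to notice the swallowed interrupt (by checking the thread's restored interrupt flag before treating the write as successful).

```java
import java.io.IOException;
import java.io.InterruptedIOException;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.Future;

public class InterruptedUploadSketch {

  // Mirrors the quoted S3A pattern: on interrupt, the flag is restored and 0 is
  // returned, so no exception ever reaches the caller.
  public static long putSwallowingInterrupt(Future<?> upload, long size) throws IOException {
    try {
      upload.get();
      return size;
    } catch (InterruptedException ie) {
      Thread.currentThread().interrupt();
      return 0;
    } catch (ExecutionException ee) {
      throw new IOException("upload failed", ee.getCause());
    }
  }

  // Hypothetical caller-side guard: since the callee restores the interrupt flag,
  // the caller can turn it back into an exception before committing anything.
  public static void writeAndVerify(Future<?> upload, long size) throws IOException {
    putSwallowingInterrupt(upload, size);
    if (Thread.currentThread().isInterrupted()) {
      throw new InterruptedIOException("metadata write was interrupted; not committing");
    }
  }

  public static void main(String[] args) throws Exception {
    Future<?> pendingUpload = new CompletableFuture<>(); // never completes
    Thread.currentThread().interrupt();                  // simulate pipeline cancellation

    // Without a guard, the call returns normally and looks like a success.
    long reported = putSwallowingInterrupt(pendingUpload, 1024L);
    System.out.println("returned without exception, bytes reported: " + reported);

    // With the guard, the interrupted write surfaces as an exception.
    try {
      writeAndVerify(pendingUpload, 1024L);
      System.out.println("write treated as successful");
    } catch (InterruptedIOException e) {
      System.out.println("write rejected: " + e.getMessage());
    } finally {
      Thread.interrupted(); // clear the flag so the demo exits cleanly
    }
  }
}
```

The first call shows why the metadata writer cannot tell that anything went wrong: the method returns normally. Checking the interrupt flag (or comparing the returned byte count with the expected size) is what lets the caller fail the commit instead.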
   
   
   #### Iceberg commit succeeds despite the interrupted upload
   
   `2024-03-06T23:06:36.160+00:00 INFO org.apache.iceberg.hive.HiveTableOperations  Committed to table iceberg.default.some_table with the new metadata location s3a://bucket/prefix/default.db/some_table/metadata/52949-1e28478a-bf5e-4ec0-8d2c-......metadata.json`
   
   But the upload request had already been interrupted:
   `2024-03-06T23:06:36.024+00:00 WARN org.apache.hadoop.fs.s3a.S3ABlockOutputStream  Interrupted object upload`

