osy497 opened a new issue, #10750:
URL: https://github.com/apache/iceberg/issues/10750

   ### Query engine
   
   JAVA API
   
   ### Question
   
   We are trying to store our data in an Iceberg table using Iceberg version `1.5.2`.
   
   We are using the `REST catalog` with `Parquet` as the data format, and the code that flushes the writer follows this logic:
   
   ```java
   AppendFiles appendFiles = table.newAppend();
   DataFile[] dataFiles = writer.dataFiles();
   for (var dataFile : dataFiles) {
        appendFiles.appendFile(dataFile);
   }
   appendFiles.commit();
   ```
   
   The flush code above works fine in most cases, but the `dataFiles()` call sometimes fails with an exception, for example because of a timeout.
   
   When this happens, we currently write all of the data to the writer again and flush it again, which I think is a huge overhead.
   
   To avoid this, we would like to add retry logic around the `dataFiles()` call, provided the method is safe to retry.
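   
   A minimal sketch of such a retry wrapper, for illustration only. `RetrySketch`, `withRetry`, and `maxAttempts` are hypothetical names, not Iceberg APIs, and the sketch assumes that calling `dataFiles()` again after a failure is safe, which is exactly what this question asks.
   
   ```java
   import java.util.concurrent.Callable;

   // Hypothetical helper, not part of Iceberg.
   class RetrySketch {
       /**
        * Runs the action up to maxAttempts times and returns the first
        * successful result. If every attempt fails, the last exception
        * is rethrown to the caller.
        */
       static <T> T withRetry(Callable<T> action, int maxAttempts) throws Exception {
           if (maxAttempts < 1) {
               throw new IllegalArgumentException("maxAttempts must be >= 1");
           }
           Exception last = null;
           for (int attempt = 1; attempt <= maxAttempts; attempt++) {
               try {
                   return action.call();
               } catch (Exception e) {
                   // Remember the failure and try again (no backoff in this sketch).
                   last = e;
               }
           }
           throw last;
       }
   }
   ```
   
   With a wrapper like this, the call in the snippet above would become `DataFile[] dataFiles = RetrySketch.withRetry(writer::dataFiles, 3);`, leaving the append and commit steps unchanged.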
   
   For example, if, inside `dataFiles()`, part of the data in the writer buffer is written successfully and part fails, would retrying cause a problem?
   
   Your answer would be appreciated.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

