fpetersen-gl opened a new pull request, #13565:
URL: https://github.com/apache/iceberg/pull/13565

   Closes #13508
   
   This PR aims to prevent data loss when writing Parquet files over an 
unstable network.
   
   In the ParquetWriter, the internal flag `closed` is set before the close 
logic has completed successfully. If that logic fails and a wrapped writer 
later calls `close()` again, this writer does nothing, because it already 
considers itself closed. The result is a missing Parquet file that is 
nevertheless reported as successfully written.
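
   The effect of the flag ordering can be illustrated with a minimal, 
self-contained sketch. It is not the actual ParquetWriter code: `FlakySink`, 
`EagerFlagWriter` and `LateFlagWriter` are hypothetical names, and setting the 
flag only after the work has succeeded is assumed here as one way to make a 
retried `close()` effective.

```java
import java.io.IOException;

// Hypothetical sketch, not Iceberg's ParquetWriter: it only shows why the
// position of the `closed` assignment matters when close() can fail.
class CloseFlagSketch {

  /** Stand-in for the underlying write; fails a configurable number of times. */
  static class FlakySink {
    private int failuresLeft;
    boolean written = false;

    FlakySink(int failures) {
      this.failuresLeft = failures;
    }

    void flush() throws IOException {
      if (failuresLeft > 0) {
        failuresLeft--;
        throw new IOException("simulated network failure");
      }
      written = true;
    }
  }

  /** Sets the flag first: after a failed close(), a retry is a silent no-op. */
  static class EagerFlagWriter implements AutoCloseable {
    private final FlakySink sink;
    private boolean closed = false;

    EagerFlagWriter(FlakySink sink) {
      this.sink = sink;
    }

    @Override
    public void close() throws IOException {
      if (!closed) {
        closed = true; // flag flipped before the work is done
        sink.flush();  // if this throws, the writer still counts as closed
      }
    }
  }

  /** Sets the flag last: a failed close() leaves the writer open for a retry. */
  static class LateFlagWriter implements AutoCloseable {
    private final FlakySink sink;
    private boolean closed = false;

    LateFlagWriter(FlakySink sink) {
      this.sink = sink;
    }

    @Override
    public void close() throws IOException {
      if (!closed) {
        sink.flush();  // may throw; `closed` then stays false
        closed = true; // only reached once the data has been written
      }
    }
  }

  /** Calls close() twice, swallowing failures, like a caller that retries. */
  static void closeTwice(AutoCloseable writer) {
    for (int attempt = 0; attempt < 2; attempt++) {
      try {
        writer.close();
      } catch (Exception e) {
        // the first attempt is expected to fail in this sketch
      }
    }
  }

  public static void main(String[] args) {
    FlakySink eagerSink = new FlakySink(1);
    closeTwice(new EagerFlagWriter(eagerSink));
    System.out.println("eager flag, data written: " + eagerSink.written); // false

    FlakySink lateSink = new FlakySink(1);
    closeTwice(new LateFlagWriter(lateSink));
    System.out.println("late flag, data written: " + lateSink.written); // true
  }
}
```

   With the eager variant, the second `close()` returns without an exception 
even though nothing was written, which matches the failure mode described 
above.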
   
   The accompanying test mimics an unstable network, as might be seen by e.g. 
an S3 client: the output first keeps all data in memory and only writes it to 
a file when `close()` is called on it. Depending on a flag, that `close()` 
either fails or succeeds. The test's building blocks were copied from existing 
test code, because the originals are too strictly encapsulated to be reused.
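
   The kind of output described above could look roughly like the following 
sketch. It is not the test code from this PR: `BufferUntilCloseStream` and its 
`failOnClose` flag are hypothetical names, and a plain `java.io.OutputStream` 
is assumed instead of Iceberg's own output abstractions.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStream;
import java.nio.file.Files;
import java.nio.file.Path;

// Hypothetical sketch of an output that buffers in memory and only persists
// on close(), with a switch to simulate a failing network.
class BufferingOutputSketch {

  static class BufferUntilCloseStream extends OutputStream {
    private final ByteArrayOutputStream buffer = new ByteArrayOutputStream();
    private final Path target;
    private boolean failOnClose;

    BufferUntilCloseStream(Path target, boolean failOnClose) {
      this.target = target;
      this.failOnClose = failOnClose;
    }

    void setFailOnClose(boolean failOnClose) {
      this.failOnClose = failOnClose;
    }

    @Override
    public void write(int b) {
      buffer.write(b); // nothing reaches the file yet
    }

    @Override
    public void write(byte[] b, int off, int len) {
      buffer.write(b, off, len);
    }

    @Override
    public void close() throws IOException {
      if (failOnClose) {
        // simulated upload failure: the buffered data is kept, no file is created
        throw new IOException("simulated upload failure");
      }
      Files.write(target, buffer.toByteArray());
    }
  }

  public static void main(String[] args) throws IOException {
    Path target = Files.createTempFile("buffering-sketch", ".bin");
    Files.delete(target); // start without a file, as before any upload

    BufferUntilCloseStream out = new BufferUntilCloseStream(target, true);
    out.write(new byte[] {1, 2, 3});

    try {
      out.close();
    } catch (IOException e) {
      System.out.println("first close failed: " + e.getMessage());
    }
    System.out.println("file exists after failed close: " + Files.exists(target));

    out.setFailOnClose(false);
    out.close(); // the retry persists the buffered bytes
    System.out.println("bytes written after retry: " + Files.size(target));
    Files.delete(target);
  }
}
```

   A writer layered on top of such a stream only produces a file if it passes 
a second `close()` call through to the stream after the first one failed.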


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

