javrasya commented on issue #9444:
URL: https://github.com/apache/iceberg/issues/9444#issuecomment-2005133969

   @shanzi, the code I shared earlier caused some data loss. It did not close 
the currently open stream before opening a new one, so please be careful with 
it. I am sorry to see that my buggy code above is spreading. I now have a safe 
version, which I based on (copied and pasted from) the PR @amogh-jahagirdar 
mentioned.
   
   I have tested it with data at scale: the data loss went away completely, 
and I no longer get the socket closed exception. Here is the custom 
S3InputFile I have been using in my Flink and Spark projects:
   https://gist.github.com/javrasya/76ad0267399e379f5801a6d75c09882a
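
For readers who only need the general idea, the fix described above (close the currently open stream before reopening it on a retry) can be sketched roughly as follows. This is not the code from the gist or from the Iceberg PR; it is a minimal illustration that uses only `java.io`, and the names `StreamOpener` and `ReopeningInputStream` are hypothetical. It assumes the underlying source can be reopened at an arbitrary byte offset.

```java
import java.io.IOException;
import java.io.InputStream;

/** Hypothetical factory (not an Iceberg or AWS API): opens the source at a byte offset. */
interface StreamOpener {
    InputStream openAt(long position) throws IOException;
}

/**
 * Hypothetical wrapper: when a read fails, it closes the stream that is
 * currently open and only then reopens at the last successfully read position.
 */
class ReopeningInputStream extends InputStream {
    private final StreamOpener opener;
    private final int maxRetries;
    private InputStream current;
    private long position = 0; // number of bytes successfully returned so far

    ReopeningInputStream(StreamOpener opener, int maxRetries) throws IOException {
        if (maxRetries < 0) {
            throw new IllegalArgumentException("maxRetries must be >= 0");
        }
        this.opener = opener;
        this.maxRetries = maxRetries;
        this.current = opener.openAt(0);
    }

    @Override
    public int read() throws IOException {
        IOException last = null;
        for (int attempt = 0; attempt <= maxRetries; attempt++) {
            try {
                if (current == null) {
                    // Reopen exactly where the last good read left off.
                    current = opener.openAt(position);
                }
                int b = current.read();
                if (b >= 0) {
                    position++;
                }
                return b;
            } catch (IOException e) {
                last = e;
                // The important step: release the failed stream before retrying.
                closeCurrentQuietly();
            }
        }
        throw last;
    }

    private void closeCurrentQuietly() {
        if (current != null) {
            try {
                current.close();
            } catch (IOException ignored) {
                // The stream is already broken; nothing more to do here.
            }
            current = null;
        }
    }

    @Override
    public void close() throws IOException {
        if (current != null) {
            current.close();
            current = null;
        }
    }
}
```

The sketch tracks the read position itself, so a reopened stream resumes at the first byte that was not yet returned, and it never holds more than one open stream at a time.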

-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org