javrasya commented on issue #9444: URL: https://github.com/apache/iceberg/issues/9444#issuecomment-1894309851
No idea, to be honest. It is very hard to address. It does not even happen when I run the app on a standalone Flink cluster on my local machine; it only happens when the app runs on the AWS Managed Flink service in production. Maybe that is due to some network limitation or something else funky going on. I tried many things: tweaking Apache HTTP client settings such as socket timeout, TCP keep-alive, and the maximum number of connections, and also using the URLConnection HTTP client instead of the Apache HTTP client. None of it really helped. I saw other people facing the same issue outside of Flink and Iceberg, and none of the remedies they came up with helped me except retrying. For that, though, I needed to modify the relevant part of the iceberg-flink library. Regardless, retrying a limited number of times when this happens (in my code I hardcoded it to 3, for example) seems to do no harm in the Iceberg Flink source code. It even sounds like a generic feature, because anyone using Iceberg with Flink can hit an intermittent network issue, and failing the entire stream for that sounds a bit harsh.
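
For readers who want to picture the bounded retry described above, here is a minimal sketch. It is not the actual change made to iceberg-flink: the class `RetryingRead`, the helper `withRetries`, and the backoff parameter are hypothetical names chosen for illustration, and it assumes the intermittent failure surfaces as an `IOException`, which may not match the real exception type.

```java
import java.io.IOException;
import java.util.concurrent.Callable;

public class RetryingRead {

    // Hypothetical helper (not an Iceberg or Flink API): runs the action and
    // retries it when it fails with an IOException, up to maxAttempts in total.
    public static <T> T withRetries(Callable<T> action, int maxAttempts, long backoffMillis)
            throws Exception {
        if (maxAttempts < 1) {
            throw new IllegalArgumentException("maxAttempts must be at least 1");
        }
        IOException last = null;
        for (int attempt = 1; attempt <= maxAttempts; attempt++) {
            try {
                return action.call();
            } catch (IOException e) {
                // Assumption: only IOException is treated as transient.
                // Any other exception propagates immediately.
                last = e;
                if (attempt < maxAttempts) {
                    // Simple linear backoff before the next attempt.
                    Thread.sleep(backoffMillis * attempt);
                }
            }
        }
        // All attempts failed: rethrow the last failure so the job still fails.
        throw last;
    }

    public static void main(String[] args) throws Exception {
        int[] calls = {0};
        // Simulated flaky read: fails twice, then succeeds on the third call.
        String result = withRetries(() -> {
            calls[0]++;
            if (calls[0] < 3) {
                throw new IOException("simulated connection reset");
            }
            return "ok";
        }, 3, 10L);
        System.out.println(result + " after " + calls[0] + " attempts");
    }
}
```

The attempt count is a parameter here rather than a hardcoded 3, which is roughly what a generic, configurable version of the feature would need. Once the attempts are exhausted the last exception is rethrown, so a persistent failure still fails the stream instead of being swallowed.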