danielcweeks commented on code in PR #11052: URL: https://github.com/apache/iceberg/pull/11052#discussion_r1777363011
##########
aws/src/main/java/org/apache/iceberg/aws/s3/S3FileIOProperties.java:
##########

@@ -393,6 +403,21 @@ public class S3FileIOProperties implements Serializable {
    */
   private static final String S3_FILE_IO_USER_AGENT = "s3fileio/" + EnvironmentContext.get();

+  /** Number of times to retry S3 operations. */
+  public static final String S3_RETRY_NUM_RETRIES = "s3.retry.num-retries";
+
+  public static final int S3_RETRY_NUM_RETRIES_DEFAULT = 32;

Review Comment:
   I would agree with the concern here. If we can narrow this retry behavior to just the 503 error code, I would be more amenable (though 32 is still very high), but as a default for all retries this causes really bad behavior for other errors.

   The other issue is that there are retries in the surrounding execution path (task retries and stage retries in Spark, for example). High retry values at each layer compound, multiplying the total work for the overall job.

   I would also say that trying to hide this is problematic for those diagnosing slowness with their workloads, so we should log messages when slowdowns are occurring.

--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org
For queries about this service, please contact Infrastructure at:
us...@infra.apache.org
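To make the compounding concern concrete, here is a minimal sketch of the worst-case arithmetic. The S3 retry count of 32 comes from the diff above; the Spark task and stage retry limits are hypothetical placeholder values, not Iceberg or Spark defaults.

```java
public class RetryCompounding {

    /**
     * Worst-case number of S3-level attempts for one logical operation,
     * assuming every retry layer exhausts its budget before giving up.
     */
    static long worstCaseAttempts(int s3Retries, int taskRetries, int stageRetries) {
        return (long) s3Retries * taskRetries * stageRetries;
    }

    public static void main(String[] args) {
        int s3Retries = 32;    // proposed S3FileIO default from the diff above
        int taskRetries = 4;   // hypothetical Spark task retry limit
        int stageRetries = 4;  // hypothetical Spark stage retry limit
        // 32 * 4 * 4 = 512 S3 attempts in the worst case
        System.out.println("Worst-case S3 attempts: "
            + worstCaseAttempts(s3Retries, taskRetries, stageRetries));
    }
}
```

Each layer is unaware of the others, so a persistently failing request can be re-driven hundreds of times, which is why a per-error-code retry condition and visible slowdown logging matter.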