nastra commented on code in PR #10233:
URL: https://github.com/apache/iceberg/pull/10233#discussion_r2001502807
##########
core/src/main/java/org/apache/iceberg/hadoop/HadoopFileIO.java:
##########
@@ -173,26 +203,223 @@ public void deletePrefix(String prefix) {
     }
   }
 
+  /**
+   * Initialize the wrapped IO class if configured to do so.
+   *
+   * @return true if bulk delete should be used.
+   */
+  private synchronized boolean maybeUseBulkDeleteApi() {
+    if (!bulkDeleteConfigured.compareAndSet(false, true)) {
+      // configured already, so return.
+      return useBulkDelete;
+    }
+    boolean enableBulkDelete = conf().getBoolean(BULK_DELETE_ENABLED, BULK_DELETE_ENABLED_DEFAULT);
+    if (!enableBulkDelete) {
+      LOG.debug("Bulk delete is disabled");
+      useBulkDelete = false;
+    } else {
+      // library is configured to use bulk delete, so try to load it
+      // and probe for the bulk delete methods being found.
+      // this is only satisfied on Hadoop releases with the WrappedIO class.
+      wrappedIO = new DynamicWrappedIO(getClass().getClassLoader());

Review Comment:
   I guess I don't fully follow why we need to load all of this stuff dynamically. Iceberg is on Hadoop 3.4.1, so we should be able to use the Bulk Delete API of Hadoop directly?

   Additionally, I'm not convinced that we need a `BULK_DELETE_ENABLED` property. We would use bulk deletion by default and fall back in case it's not available.
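   For context, here is a rough sketch of what the direct, non-dynamic path could look like against the bulk delete API that ships in Hadoop 3.4.1 (`FileSystem#createBulkDelete` returning `org.apache.hadoop.fs.BulkDelete`). The class and method names below are illustrative only and not part of this PR:

   ```java
   // Illustrative sketch only, not the PR's code: direct use of the bulk delete
   // API in Hadoop 3.4.1 (FileSystem#createBulkDelete / org.apache.hadoop.fs.BulkDelete),
   // with no dynamic loading and no enable flag.
   import java.io.IOException;
   import java.util.Collection;
   import java.util.List;
   import java.util.Map;

   import org.apache.hadoop.fs.BulkDelete;
   import org.apache.hadoop.fs.FileSystem;
   import org.apache.hadoop.fs.Path;

   final class BulkDeleteSketch {

     private BulkDeleteSketch() {}

     /** Delete the given paths under {@code base}, respecting the store's page size. */
     static void deleteAll(FileSystem fs, Path base, List<Path> paths) throws IOException {
       try (BulkDelete bulkDelete = fs.createBulkDelete(base)) {
         // Page size is 1 for the default FileSystem implementation and larger
         // for stores such as S3A that support true bulk deletion.
         int pageSize = bulkDelete.pageSize();
         for (int start = 0; start < paths.size(); start += pageSize) {
           Collection<Path> page =
               paths.subList(start, Math.min(start + pageSize, paths.size()));
           // Each returned entry is a (path, error message) pair for a failed delete.
           List<Map.Entry<Path, String>> failures = bulkDelete.bulkDelete(page);
           if (!failures.isEmpty()) {
             throw new IOException(
                 "Bulk delete failed for " + failures.size() + " paths, first: " + failures.get(0));
           }
         }
       }
     }
   }
   ```

   As far as I know, the base `FileSystem` ships a default page-size-1 implementation, so the call works on every file system; a runtime fallback on older Hadoop versions (e.g. catching a linkage error on the first call) could then replace the `BULK_DELETE_ENABLED` flag.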