Re: [PR] S3: Disable strong integrity checksums [iceberg]

via GitHub Thu, 13 Mar 2025 05:48:33 -0700


steveloughran commented on PR #12264:
URL: https://github.com/apache/iceberg/pull/12264#issuecomment-2721144184


   @mmgaggle I'm setting up the s3a tests to run through iceberg and parquet, 
so we can validate features and performance optimisations through our code. 
   
   Initially, https://github.com/apache/hadoop/pull/7316 has gone in for the 
bulk delete API of #10233 (please can someone review/merge this!). It will 
then act as a regression test of the s3a connector, as well as being an easy 
way to test local iceberg/parquet builds against arbitrary stores through our 
test harness. That test harness uses the hadoop IOStatistics API to make 
assertions about the actual number of remote S3 calls made. This lets you 
identify regressions in the actual amount of network IO which takes place. 
Everyone cares about this.
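   
   A minimal sketch of that style of call-count assertion, not the actual s3a 
test harness: every class, method and counter name below is hypothetical, and 
the in-memory `CountingStore` only stands in for a real connector. The idea is 
to snapshot the counters before an operation, snapshot them again afterwards, 
and assert on the difference.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical stand-in for a store client that counts the remote calls it makes. */
class CountingStore {
    private final Map<String, Long> counters = new HashMap<>();

    /** Copy of the current counters, loosely analogous to reading IOStatistics counters. */
    Map<String, Long> snapshot() {
        return new HashMap<>(counters);
    }

    /** Deletes each path with its own request: one simulated remote call per path. */
    void deleteOneByOne(List<String> paths) {
        for (String path : paths) {
            counters.merge("delete_request", 1L, Long::sum);
        }
    }

    /** Deletes paths in pages of pageSize: one simulated remote call per page. */
    void bulkDelete(List<String> paths, int pageSize) {
        if (pageSize <= 0) {
            throw new IllegalArgumentException("pageSize must be positive");
        }
        for (int i = 0; i < paths.size(); i += pageSize) {
            counters.merge("bulk_delete_request", 1L, Long::sum);
        }
    }
}

/** Hypothetical assertion helpers working on counter snapshots. */
class CallCountAssertions {
    /** Returns how much a counter grew between two snapshots. */
    static long delta(Map<String, Long> before, Map<String, Long> after, String key) {
        return after.getOrDefault(key, 0L) - before.getOrDefault(key, 0L);
    }

    /** Fails with a diagnostic message unless the counter grew by exactly expected. */
    static void assertDelta(Map<String, Long> before, Map<String, Long> after,
                            String key, long expected) {
        long actual = delta(before, after, key);
        if (actual != expected) {
            throw new AssertionError(
                "counter " + key + ": expected " + expected + " calls but saw " + actual
                + "; before=" + before + " after=" + after);
        }
    }
}
```

   A test would take `before = store.snapshot()`, run the operation, take 
`after = store.snapshot()` and call `CallCountAssertions.assertDelta(before, 
after, "bulk_delete_request", expected)`. If a code change silently falls back 
to per-file deletes, the assertion fails and the message shows both snapshots.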
   
   Even with this, you should have a test harness which
   * can be targeted at production S3 stores
   * contains a good set of operations, both low level FileIO and higher level 
API calls
   * has many of those tests abstracted up to work with all FileIO 
implementations.
   * provides really good diagnostics on test failures. 
   
   If someone starts that, I'd be happy to help. What I'm not going to do is 
say "here are the tests you need". I did try to do that with spark and the 
spark-hadoop-cloud module, but there was no interest in full integration tests. 
I'd only do it for iceberg as part of a collaborative effort with others.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

