sdd opened a new pull request, #497:
URL: https://github.com/apache/iceberg-rust/pull/497

   This PR adds some performance testing capabilities. It includes the 
following features:
   * Adds a docker-compose environment with containers for MinIO, Spark, 
HAProxy and the Iceberg REST Catalog
   * Uses HAProxy to simulate real-world latency and bandwidth constraints of 
connections to services like S3
   * Includes scripting to create an Iceberg table in the performance testing 
environment and populate it with data from the widely-used NYC Taxi dataset
   * Adds a justfile for ease of creating, initialising, starting, stopping and 
tearing down the performance testing environment
   * Adds some Criterion benchmarks that use the performance testing 
environment to test the performance of `TableScan.plan_files` in four different 
representative scenarios
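
   To illustrate why the simulated latency and bandwidth limits matter for 
these benchmarks, here is a rough back-of-the-envelope sketch. It is not code 
from this PR: the `Link` type, its fields and the numbers used are all 
hypothetical, and the "concurrent" case assumes that requests overlap perfectly 
and share the bandwidth cap.

```rust
use std::time::Duration;

/// Hypothetical link model (not part of this PR): a fixed per-request
/// latency plus a bandwidth cap, similar in spirit to what a throttling
/// proxy imposes on connections to an object store.
struct Link {
    latency: Duration,
    bytes_per_sec: f64,
}

impl Link {
    /// Estimated time to fetch a single object of `size` bytes.
    fn fetch_time(&self, size: u64) -> Duration {
        self.latency + Duration::from_secs_f64(size as f64 / self.bytes_per_sec)
    }

    /// Objects fetched one after another: every request pays the latency.
    fn sequential(&self, sizes: &[u64]) -> Duration {
        sizes.iter().map(|&s| self.fetch_time(s)).sum()
    }

    /// Objects fetched at the same time. Simplifying assumption: the
    /// requests overlap perfectly, so the latency is paid once and the
    /// bandwidth cap is shared between them.
    fn concurrent(&self, sizes: &[u64]) -> Duration {
        if sizes.is_empty() {
            return Duration::ZERO;
        }
        let total: u64 = sizes.iter().sum();
        self.latency + Duration::from_secs_f64(total as f64 / self.bytes_per_sec)
    }
}

fn main() {
    // Made-up numbers: 50 ms latency, 10 MB/s, eight 1 MB files.
    let link = Link {
        latency: Duration::from_millis(50),
        bytes_per_sec: 10_000_000.0,
    };
    let files = [1_000_000u64; 8];
    println!("sequential: {:?}", link.sequential(&files));
    println!("concurrent: {:?}", link.concurrent(&files));
}
```

   With no added latency the two estimates are identical, which is why a 
benchmark that talks to an unthrottled local container could hide differences 
between sequential and concurrent reads.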
   
   This is still a work in progress, especially the support code for working 
with docker-compose. I've been running this on macOS with OrbStack, so some 
work will probably be needed to ensure compatibility with Linux hosts, Docker 
and Podman.
   
   I see that @alexyin1 has been working on Podman support in 
https://github.com/apache/iceberg-rust/pull/489. I'll work to make sure that 
our combined efforts are aligned.
   
   Unlike the previous docker-based integration tests, the tests in here 
currently require the developer to manually run tasks from the justfile to set 
up, start and stop the docker environment. I chose this approach because of the 
longer setup times caused by needing to download and insert data. I'm open to 
suggestions on better approaches.
   
   TODO: I've not yet included the scripting to retrieve the NYC Taxi source 
data; I'll add it over the next couple of days. I wanted to get this PR in 
early to get some feedback.
   
   I'll use this suite to measure performance changes on the concurrent table 
scan PR, as well as on a couple of other read-performance-related changes that 
I'm in the process of turning into PRs.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@iceberg.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

