andygrove commented on PR #1600:
URL: https://github.com/apache/datafusion-ballista/pull/1600#issuecomment-4331681814

   Comet run for comparison (same workstation)
   
   ```
   $ ./target/release/shuffle_bench --input /mnt/bigdata/tpch/sf100/lineitem.parquet/ --partitioning hash --partitions 200 --hash-columns 0,3 --memory-limit 8589934592 --limit 10000000 --warmup 1 --iterations 1
   === Shuffle Benchmark ===
   Input:          /mnt/bigdata/tpch/sf100/lineitem.parquet/
   Schema:         16 columns (3xdate, 4xdecimal, 4xint, 5xstring)
   Total rows:     10,000,000
   Batch size:     8,192
   Partitioning:   hash
   Partitions:     200
   Codec:          Lz4Frame
   Hash columns:   [0, 3]
   Memory limit:   8.00 GiB
   Iterations:     1 (warmup: 1)
   
     [warmup 1/1] write: 5.310s  output: 660.89 MiB
     [iter 1/1] write: 5.479s  output: 660.90 MiB
   
   === Results ===
   Write:
     avg time:         5.479s
     throughput:       1,825,306 rows/s (total across 1 tasks)
     output size:      660.90 MiB
   
   Input Metrics (last iteration):
     elapsed_compute: 0.000s
     output_rows: 20,534,912
     output_bytes: 6.43 GiB
     time_elapsed_scanning_total: 10.473s
     output_batches: 2,507
     metadata_load_time: 0.002s
     page_index_eval_time: 0.000s
     bytes_scanned: 1.22 GiB
     time_elapsed_scanning_until_data: 4.651s
     time_elapsed_opening: 0.032s
     time_elapsed_processing: 3.020s
     bloom_filter_eval_time: 0.000s
     row_pushdown_eval_time: 0.000s
     statistics_eval_time: 0.000s
   
   Shuffle Metrics (last iteration):
     input batches:    1,221
     repart time:      0.112s (2.0%)
     encode time:      2.091s (38.2%)
     write time:       0.278s (5.1%)
     data size:        3.14 GiB
   
   ```
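
For anyone comparing these numbers against another run, here is a minimal sketch that re-derives the ratios from the figures printed above. It assumes the percentages under "Shuffle Metrics" are relative to the average write time, and that "data size" over "output size" can be read as a rough compression ratio; neither is stated in the output. The function names are hypothetical and not part of `shuffle_bench`.

```python
# Figures copied from the benchmark output above.
TOTAL_ROWS = 10_000_000
WRITE_SECONDS = 5.479
STAGE_SECONDS = {"repart": 0.112, "encode": 2.091, "write": 0.278}
DATA_SIZE_GIB = 3.14
OUTPUT_SIZE_MIB = 660.90


def stage_shares(stage_seconds, total_seconds):
    """Percentage of total write time per stage, rounded to one decimal.

    Assumption: the tool reports each stage relative to the average write time.
    """
    return {
        name: round(100.0 * seconds / total_seconds, 1)
        for name, seconds in stage_seconds.items()
    }


def throughput_rows_per_sec(rows, seconds):
    """Rows written per second of wall-clock write time."""
    return rows / seconds


def compression_ratio(data_gib, output_mib):
    """Ratio of in-memory data size to on-disk output size.

    Assumption: 'data size' is the uncompressed size of the shuffled batches.
    """
    return data_gib * 1024.0 / output_mib


shares = stage_shares(STAGE_SECONDS, WRITE_SECONDS)
throughput = throughput_rows_per_sec(TOTAL_ROWS, WRITE_SECONDS)
ratio = compression_ratio(DATA_SIZE_GIB, OUTPUT_SIZE_MIB)

for name, pct in shares.items():
    print(f"{name}: {pct}%")
print(f"throughput: {throughput:,.0f} rows/s")
print(f"data size / output size: {ratio:.2f}x")
```

The throughput computed this way lands slightly below the reported 1,825,306 rows/s, presumably because the displayed write time is rounded to milliseconds. The three listed stages sum to roughly 45% of the write time, so the remainder is not attributed to any stage in this output.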


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

