dweiss commented on PR #14824:
URL: https://github.com/apache/lucene/pull/14824#issuecomment-2993560243

   Here are some benchmarks from my Linux system.
   
   These runs use the patch with a varying number of workers and the default
   `-XX:ActiveProcessorCount=1` in gradle.properties. This is also the worst-case
   scenario: all files are formatted from scratch, with no incremental information.
   My system is Ubuntu on an AMD Ryzen Threadripper 3970X (32 cores).
   
   ```
   echo "With bg daemon: './gradlew clean checkGoogleJavaFormat'."
   ./gradlew -q clean 
   ./gradlew -q --stop
   for workers in 1 2 4 8 16 32; do
     echo "max-workers: $workers"
     (for i in `seq 1 3`; do time ./gradlew clean checkGoogleJavaFormat --max-workers $workers ; done ) 2>&1 | grep "real"
   done
   ```
   results:
   ```
   max-workers: 1
   real 0m51.099s
   real 0m42.128s
   real 0m41.845s
   max-workers: 2
   real 0m23.382s
   real 0m23.619s
   real 0m23.563s
   max-workers: 4
   real 0m15.160s
   real 0m15.877s
   real 0m15.169s
   max-workers: 8
   real 0m13.651s
   real 0m13.534s
   real 0m13.468s
   max-workers: 16
   real 0m18.783s
   real 0m18.273s
   real 0m18.813s
   max-workers: 32
   real 0m26.930s
   real 0m26.884s
   real 0m27.259s
   ```
   
   The CPU is mostly idle during most of these runs, which is weird. If I remove
   `-XX:ActiveProcessorCount=1` from gradle.properties, I get these results:
   ```
   max-workers: 1
   real 0m43.419s
   real 0m41.613s
   real 0m41.829s
   max-workers: 2
   real 0m22.878s
   real 0m22.745s
   real 0m22.604s
   max-workers: 4
   real 0m13.625s
   real 0m13.452s
   real 0m13.608s
   max-workers: 8
   real 0m8.392s
   real 0m8.490s
   real 0m8.319s
   max-workers: 16
   real 0m6.225s
   real 0m6.929s
   real 0m6.356s
   max-workers: 32
   real 0m6.052s
   real 0m6.863s
   real 0m6.204s
   ```
   So it is clearly a benefit if you have a higher core count. It's also close to
   the lower limit set by manually running google-java-format on all source files
   (it does have concurrent processing inside).
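
To put a number on that benefit, the `real` lines can be reduced to one median per group. The sketch below is only an illustration: `summarize_timings` is a hypothetical helper name (not part of the Lucene build), and it assumes the output format shown above, that is, a `max-workers: N` or `batch size: N` label followed by `real XmY.YYYs` lines.

```shell
#!/usr/bin/env bash
# Hypothetical helper: reads benchmark output on stdin and prints, for each
# labelled group, the median wall-clock time and the speedup relative to the
# first group seen.
summarize_timings() {
  awk '
    # Print the summary for the group collected so far, then reset it.
    function emit(   i, j, tmp, med) {
      if (n == 0) return
      # Insertion sort; each group only holds a handful of timings.
      for (i = 2; i <= n; i++) {
        tmp = t[i]
        for (j = i - 1; j >= 1 && t[j] > tmp; j--) t[j + 1] = t[j]
        t[j + 1] = tmp
      }
      med = (n % 2) ? t[(n + 1) / 2] : (t[n / 2] + t[n / 2 + 1]) / 2
      if (base == 0) base = med
      printf "%s median=%.3fs speedup=%.2fx\n", label, med, base / med
      n = 0
    }
    # A new label closes the previous group.
    /^[ \t]*(max-workers|batch size):/ {
      emit()
      sub(/^[ \t]+/, "")
      label = $0
      next
    }
    # "real 0m41.845s" -> 41.845 seconds
    /^[ \t]*real[ \t]/ {
      split($2, p, "m")
      sub(/s$/, "", p[2])
      t[++n] = p[1] * 60 + p[2]
    }
    END { emit() }
  '
}
```

Feeding it the second result set (without `-XX:ActiveProcessorCount=1`) gives a median of 41.829s for 1 worker and 6.356s for 16 workers, roughly a 6.6x speedup.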
   
   For the incremental case... it's fast enough (well, it doesn't do anything),
   even from a "cold" start, without any daemon in the background (the first call
   will show configuration time):
   ```
   echo "With bg daemon, from cold-start, incremental: './gradlew checkGoogleJavaFormat'."
   ./gradlew checkGoogleJavaFormat
   for workers in 2 4 8; do
     echo "max-workers: $workers"
     ./gradlew --stop -q
     (for i in `seq 1 3`; do time ./gradlew checkGoogleJavaFormat --max-workers $workers ; done ) 2>&1 | grep "real"
   done
   ```
   results:
   ```
   max-workers: 2
   real 0m8.953s
   real 0m2.105s
   real 0m2.002s
   max-workers: 4
   real 0m8.746s
   real 0m2.061s
   real 0m2.037s
   max-workers: 8
   real 0m8.856s
   real 0m2.054s
   real 0m1.978s
   ```
   
   This is a rather internal detail, but here are timings for different batch
   sizes of input files with a constant number of workers (the default is 5):
   ```
   for batchSize in 1 2 4 8 16 32 64; do
     echo "batch size: $batchSize"
     (for i in `seq 1 3`; do time ./gradlew clean checkGoogleJavaFormat -Plucene.gjf.batchSize=$batchSize --max-workers 8 ; done ) 2>&1 | grep "real"
   done
   ```
   results:
   ```
   batch size: 1
   real 0m8.958s
   real 0m8.760s
   real 0m8.585s
   batch size: 2
   real 0m8.542s
   real 0m8.401s
   real 0m8.481s
   batch size: 4
   real 0m8.341s
   real 0m8.378s
   real 0m8.469s
   batch size: 8
   real 0m8.671s
   real 0m8.494s
   real 0m8.458s
   batch size: 16
   real 0m8.363s
   real 0m8.408s
   real 0m8.395s
   batch size: 32
   real 0m8.384s
   real 0m8.320s
   real 0m8.423s
   batch size: 64
   real 0m8.578s
   real 0m8.622s
   real 0m8.699s
   ```
   
   Finally, the same check for the previous, spotless-based implementation (main branch):
   ```
   ./gradlew --stop
   ./gradlew clean
   for workers in 1 2 4 8 16 32; do
     echo "max-workers: $workers"
     (for i in `seq 1 3`; do time ./gradlew clean spotlessJavaCheck --max-workers $workers ; done ) 2>&1 | grep "real"
   done
   ```
   results:
   ```
   max-workers: 1
   real 0m49.843s
   real 0m47.934s
   real 0m48.256s
   max-workers: 2
   real 0m28.170s
   real 0m27.980s
   real 0m27.851s
   max-workers: 4
   real 0m21.620s
   real 0m21.475s
   real 0m21.503s
   max-workers: 8
   real 0m21.192s
   real 0m20.962s
   real 0m20.895s
   max-workers: 16
   real 0m20.984s
   real 0m20.783s
   real 0m20.773s
   max-workers: 32
   real 0m21.290s
   real 0m21.077s
   real 0m21.037s
   ```
   
   The patch is faster. Note that I didn't do anything special here: all the heavy
   lifting is done by the same implementation in google-java-format. The difference
   is in the long tail of the longest operation (formatting lucene/core), which is
   now parallel.
   
   I also toyed with removing `-XX:TieredStopAtLevel=1` from gradle.properties,
   then re-ran the benchmark:
   ```
   ./gradlew -q clean 
   ./gradlew -q --stop
   for workers in 8 16 32; do
     echo "max-workers: $workers"
     (for i in `seq 1 3`; do time ./gradlew clean checkGoogleJavaFormat --max-workers $workers ; done ) 2>&1 | grep "real"
   done
   ```
   results:
   ```
   max-workers: 8
   real 0m16.147s
   real 0m6.014s
   real 0m5.939s
   max-workers: 16
   real 0m4.830s
   real 0m4.700s
   real 0m4.716s
   max-workers: 32
   real 0m5.271s
   real 0m5.228s
   real 0m5.243s
   ```
   
   So you get that initial "hit" when HotSpot compiles all that code in the daemon,
   then it's a bit faster compared to what we currently have as the default. I don't
   know which is better.
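
The measurement loops in this comment all share one shape, so they could be folded into a small function. This is a sketch under assumptions, not what was actually run: `bench_runs` is a hypothetical name, it requires bash (it relies on the `time` keyword and the `TIMEFORMAT` variable), and it discards the timed command's own output.

```shell
#!/usr/bin/env bash
# Hypothetical helper: run a command RUNS times and print one wall-clock
# time (in seconds) per line.
bench_runs() {
  local runs=$1
  shift
  # TIMEFORMAT is read by the bash "time" keyword; %3R prints the elapsed
  # real time in seconds with three decimals instead of "0m41.845s".
  local TIMEFORMAT='%3R'
  local i
  for ((i = 0; i < runs; i++)); do
    # The output of the command itself goes to /dev/null; the timing report
    # is written to the stderr of the brace group, which is sent to stdout.
    { time "$@" >/dev/null 2>&1; } 2>&1
  done
}

# Example use, mirroring the first benchmark above:
#   for workers in 1 2 4 8 16 32; do
#     echo "max-workers: $workers"
#     bench_runs 3 ./gradlew clean checkGoogleJavaFormat --max-workers "$workers"
#   done
```

Printing plain seconds makes the results easier to post-process, at the cost of hiding build failures, so the exit status of the timed command would need separate handling in a real script.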



