Tony-X commented on issue #12358:
URL: https://github.com/apache/lucene/issues/12358#issuecomment-1585758064

   Just caught up on this thread -- the design tenet of the current benchmark 
game is to measure the time taken to do the same work in a contention-free 
environment.
   
   As of now I'm still trying to build trust in the benchmarks, so thank you 
for your evaluation and feedback, @uschindler!
   
   So far I believe they are doing the "same" work, as I have chased down a few 
tokenization issues. Right now the indexes on both sides have --
   * almost the "same" tokenization -- split on whitespace and remove tokens 
with length >=256
   * same index sort
   * same set of deleted docs (2% in total)
   * single segment 
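
To make the tokenization rule above concrete, here is a minimal sketch of it. It is not the code either engine actually runs: the function name `tokenize` is hypothetical, length is assumed to be counted in characters rather than bytes, and no lowercasing or other normalization is applied because the comment does not mention any.

```python
MAX_TOKEN_LEN = 256  # tokens with length >= 256 are removed, per the rule above


def tokenize(text: str) -> list[str]:
    """Split on whitespace and drop over-long tokens.

    Assumption: length means number of characters. The real analyzers
    may count bytes or apply further normalization.
    """
    return [tok for tok in text.split() if len(tok) < MAX_TOKEN_LEN]
```

Both indexes need to apply the same rule for term counts to match; the "almost" in the bullet above suggests small differences may still remain.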
   
   Regarding the JVM, here is what we do now:
   * warm up the JVM with 6.1k queries for each `COUNT` and `TOP_10_COUNT`. We 
could increase the warmup iterations easily 
[here](https://github.com/Tony-X/search-benchmark-game/blob/4402d42c906830e85d8d79a30ae776f204ade770/Makefile#L18).
 As I was typing, I already changed the warmup iterations to 3 and kicked off a run.
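
For readers unfamiliar with the warmup step, the loop looks roughly like the sketch below. This is an illustration only, not the benchmark game's actual driver: `warm_up` and the `execute` callback are hypothetical names, and the default of 3 iterations simply mirrors the value mentioned above.

```python
from typing import Callable, Iterable

COMMANDS = ("COUNT", "TOP_10_COUNT")  # command names taken from the comment


def warm_up(
    execute: Callable[[str, str], object],
    queries: Iterable[str],
    iterations: int = 3,
) -> int:
    """Run every query under every command `iterations` times.

    Results and timings are deliberately discarded; the only goal is to
    let the JIT compile hot paths before measurement starts.
    Returns the number of executions performed.
    """
    query_list = list(queries)
    executed = 0
    for _ in range(iterations):
        for command in COMMANDS:
            for query in query_list:
                execute(command, query)
                executed += 1
    return executed
```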
   
   Admittedly we haven't looked into playing with different JVM arguments. 
@mikemccand thanks for creating 
https://github.com/Tony-X/search-benchmark-game/issues/37 to explore the heap 
sizes :) 
   
   IMO, GC is less of an issue here since we measure the best latency (min) 
across 10 runs for each query (which slightly favors the JVM). The probability 
that all 10 runs of the same query hit a GC is tiny.
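
A rough sketch of the min-of-N idea and of why a GC pause rarely affects it. The names `best_latency` and `prob_all_runs_hit_gc` are hypothetical, and the probability estimate assumes each run is hit by a GC independently with the same probability, which is a simplification and not something measured here.

```python
import time
from typing import Callable


def best_latency(run_query: Callable[[], object], runs: int = 10) -> float:
    """Return the minimum wall-clock time (in seconds) over `runs` executions."""
    best = float("inf")
    for _ in range(runs):
        start = time.perf_counter()
        run_query()
        best = min(best, time.perf_counter() - start)
    return best


def prob_all_runs_hit_gc(p_single: float, runs: int = 10) -> float:
    """Probability that every run is hit by a GC pause.

    Assumption: runs are independent and each is hit with probability
    `p_single`.
    """
    return p_single ** runs
```

Under that independence assumption, even if a GC hit 10% of runs, the chance that all 10 runs of a query are affected would be `0.1 ** 10`, so the minimum would almost always come from a run without a pause.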
   
   It would be great if you could share your insights on optimal JVM settings 
for this case.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

