zacharymorn commented on PR #12194:
URL: https://github.com/apache/lucene/pull/12194#issuecomment-1496995523

   > Hmm, note that the actual QPS is varying quite a bit every time. In your 
luceneutil run, are you fixing the random seed so the same queries are used 
every time?
   
   Yeah indeed. I didn't fix the random seed during my luceneutil runs, and 
thus the results vary a lot as they may depend on the index and queries under 
test.
   
   > It is odd that `PKLookup` performance drops too.
   
   I ran a few more tests for this, and have some interesting findings:
   
   #### No changes (comparing baseline with baseline):
   ```
   Task: AndHighNotMonth: +its -monthPostings:apr #  freq=1160703
   
                               TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                    AndHighNotMonth       62.41      (9.4%)       62.45      (7.7%)    0.1% ( -15% -   18%) 0.979
                           PKLookup      176.62     (28.2%)      177.03     (33.2%)    0.2% ( -47% -   85%) 0.981
   ```
   ```
   Task: AndHighNotMonth: +its -monthPostings:apr #  freq=1160703
   
                               TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                           PKLookup      175.53     (25.3%)      166.38     (26.2%)   -5.2% ( -45% -   62%) 0.522
                    AndHighNotMonth       60.36     (17.1%)       62.29      (9.0%)    3.2% ( -19% -   35%) 0.459
   ```
   
   PKLookup seems to vary a lot as well, even when there are no changes.
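
As a rough sanity check on the run-to-run noise, the sketch below recomputes the `Pct diff` column from the mean QPS values in the two baseline-vs-baseline tables above and compares it against the baseline's own reported StdDev. The helper names are hypothetical, and the "within noise" test is a crude heuristic of my own, not how luceneutil derives its p-value.

```python
def pct_diff(baseline_qps: float, candidate_qps: float) -> float:
    """Relative change of candidate vs. baseline mean QPS, in percent."""
    return 100.0 * (candidate_qps - baseline_qps) / baseline_qps


def within_baseline_noise(baseline_qps: float, baseline_stddev_pct: float,
                          candidate_qps: float) -> bool:
    """Crude heuristic (an assumption, not luceneutil's statistic):
    treat a shift smaller than the baseline's own StdDev % as noise."""
    return abs(pct_diff(baseline_qps, candidate_qps)) <= baseline_stddev_pct


# (task, baseline QPS, baseline StdDev %, candidate QPS), copied from the
# two "no changes" tables above.
ROWS = [
    ("AndHighNotMonth, run 1", 62.41, 9.4, 62.45),
    ("PKLookup, run 1", 176.62, 28.2, 177.03),
    ("PKLookup, run 2", 175.53, 25.3, 166.38),
    ("AndHighNotMonth, run 2", 60.36, 17.1, 62.29),
]

if __name__ == "__main__":
    for task, base, stddev_pct, cand in ROWS:
        diff = pct_diff(base, cand)
        noisy = within_baseline_noise(base, stddev_pct, cand)
        print(f"{task}: {diff:+.1f}% (within baseline StdDev: {noisy})")
```

All four rows land inside the baseline's StdDev, which is consistent with the high p-values reported for these runs.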
   
   #### With changes (comparing modified with baseline), and also modifying the task query:
   ```
   Task: AndHighNotMonth: +its -monthPostings:apr #  freq=1160703
   
                               TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                           PKLookup      182.12     (28.8%)       84.26      (8.4%)  -53.7% ( -70% -  -23%) 0.000
                    AndHighNotMonth       64.22      (8.4%)      160.51     (59.1%)  149.9% (  75% -  237%) 0.000
   ```
   ```
   Task: AndHighNotMonth: +its -monthPostings:jan #  freq=1160703
   
                               TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                           PKLookup       81.40     (17.4%)       91.44     (45.6%)   12.3% ( -43% -   91%) 0.258
                    AndHighNotMonth      116.74      (9.2%)      160.54     (45.8%)   37.5% ( -15% -  101%) 0.000
   ```
   ```
   Task: AndHighNotMonth: +its -monthPostings:may #  freq=1160703
   
                               TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                           PKLookup       80.18      (6.3%)       74.90      (9.4%)   -6.6% ( -20% -    9%) 0.009
                    AndHighNotMonth       92.19     (12.6%)      144.56     (23.6%)   56.8% (  18% -  106%) 0.000
   ```
   ```
   No task, and only PKLookup is run
   
                               TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                           PKLookup      128.55     (27.2%)      142.59     (36.9%)   10.9% ( -41% -  103%) 0.286
   ```
   
   In addition, I noticed that adding the `-Xbatch` JVM argument actually makes the -50% slowdown go away (and also boosts PKLookup's QPS):
   
   `localconstants.py`
   ```
   if 'JAVA_EXE' not in globals():
       JAVA_EXE = 'java'
   if 'JAVAC_EXE' not in globals():
       JAVAC_EXE = 'javac'
   if 'JAVA_COMMAND' not in globals():
       JAVA_COMMAND = '%s -Xbatch' % JAVA_EXE
   ```
   ```
   Task: AndHighNotMonth: +its -monthPostings:apr #  freq=1160703
   
                               TaskQPS baseline      StdDevQPS my_modified_version      StdDev                Pct diff p-value
                           PKLookup      328.59     (10.2%)      347.16      (7.2%)    5.7% ( -10% -   25%) 0.043
                    AndHighNotMonth       60.21      (5.4%)      160.46     (41.8%)  166.5% ( 113% -  225%) 0.000
   ```
   
   I suspect it's indeed JVM compilation that's causing the difference? Below is the full JVM command line, as produced by the modified `localconstants.py` above and printed out by the benchmark, in case it's useful:
   
   ```
   java -Xbatch
     -XX:StartFlightRecording=dumponexit=true,maxsize=250M,settings=/Users/xichen/IdeaProjects/benchmarks/util/src/python/profiling.jfc,filename=/Users/xichen/IdeaProjects/benchmarks/logs/bench-search-baseline_vs_patch-my_modified_version-19.jfr
     -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints
     -classpath /Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/core/build/libs/lucene-core-10.0.0-SNAPSHOT.jar:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/sandbox/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/misc/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/facet/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/analysis/common/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/analysis/icu/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/queryparser/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/grouping/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/suggest/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/highlighter/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/codecs/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/queries/build/classes/java/main:/Users/xichen/.gradle/caches/modules-2/files-2.1/com.carrotsearch/hppc/0.9.1/4bf4c51e06aec600894d841c4c004566b20dd357/hppc-0.9.1.jar:/Users/xichen/IdeaProjects/benchmarks/util/lib/HdrHistogram.jar:/Users/xichen/IdeaProjects/benchmarks/util/build
     perf.SearchPerfTest
     -dirImpl MMapDirectory
     -indexPath /Users/xichen/IdeaProjects/benchmarks/indices/wikimedium10m.lucene_baseline.facets.taxonomy:Date.taxonomy:Month.taxonomy:DayOfYear.sortedset:Date.sortedset:Month.sortedset:DayOfYear.Lucene90.Lucene90.dvfields.sort=month:custom.nd10M
     -facets taxonomy:Date;Date -facets taxonomy:Month;Month -facets taxonomy:DayOfYear;DayOfYear
     -facets sortedset:Date;Date -facets sortedset:Month;Month -facets sortedset:DayOfYear;DayOfYear
     -analyzer StandardAnalyzer
     -taskSource /Users/xichen/IdeaProjects/benchmarks/util/tasks/wikimedium.10M.nostopwords.tasks
     -searchThreadCount 2 -taskRepeatCount 20 -field body -tasksPerCat 1
     -staticSeed -2249101 -seed -4093553 -similarity BM25Similarity -commit multi
     -hiliteImpl FastVectorHighlighter
     -log /Users/xichen/IdeaProjects/benchmarks/logs/baseline_vs_patch.my_modified_version.19
     -topN 100 -pk
   ```
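
One way to probe the JIT-compilation suspicion further might be to also log what HotSpot compiles during the run. The sketch below builds on the `localconstants.py` override shown earlier. `build_java_command` is a hypothetical helper, not part of luceneutil, and I have not checked whether the benchmark's log parsing tolerates the extra console output that `-XX:+PrintCompilation` produces.

```python
def build_java_command(java_exe="java", jit_flags=("-Xbatch",)):
    """Hypothetical helper: join the java executable with JIT-related flags."""
    return " ".join([java_exe, *jit_flags])


# -Xbatch disables background compilation, so a method is compiled in the
# foreground before it continues running instead of staying interpreted
# while a compiler thread works on it.
JAVA_COMMAND_BATCH = build_java_command()

# -XX:+PrintCompilation makes HotSpot print a line for each method it
# compiles, which could show whether the PKLookup code paths are compiled
# differently in the baseline and modified runs.
JAVA_COMMAND_TRACED = build_java_command(
    jit_flags=("-Xbatch", "-XX:+PrintCompilation"))

if __name__ == "__main__":
    print(JAVA_COMMAND_BATCH)
    print(JAVA_COMMAND_TRACED)
```

Assigning one of these strings to `JAVA_COMMAND` in `localconstants.py` would follow the same pattern as the `-Xbatch` override above.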
   
   In terms of code, PKLookup executes this [section of modified code](https://github.com/apache/lucene/pull/12194/files#diff-900619bac18cb1e2e177533efe157e9b4707d0c855180f535051f0d955828306R530-R543) when it's doing [doc enumeration](https://github.com/mikemccand/luceneutil/blob/2c8ccdf53e93622761a545c1a54377514c338caa/src/main/perf/PKLookupTask.java#L111), but reverting the changes there didn't solve the issue.
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

