zacharymorn commented on PR #12194: URL: https://github.com/apache/lucene/pull/12194#issuecomment-1496995523
> Hmm, note that the actual QPS is varying quite a bit every time. In your luceneutil run, are you fixing the random seed so the same queries are used every time?

Yeah indeed. I didn't fix the random seed during my luceneutil runs, so the results vary a lot, as they may depend on the index and queries under test.

> It is odd that `PKLookup` performance drops too.

I ran a few more tests for this and have some interesting findings:

#### No changes (comparing baseline with baseline):

```
Task: AndHighNotMonth: +its -monthPostings:apr # freq=1160703

Task             QPS baseline   StdDev    QPS my_modified_version   StdDev    Pct diff                p-value
AndHighNotMonth  62.41          (9.4%)    62.45                     (7.7%)    0.1% ( -15% - 18%)      0.979
PKLookup         176.62         (28.2%)   177.03                    (33.2%)   0.2% ( -47% - 85%)      0.981
```

```
Task: AndHighNotMonth: +its -monthPostings:apr # freq=1160703

Task             QPS baseline   StdDev    QPS my_modified_version   StdDev    Pct diff                p-value
PKLookup         175.53         (25.3%)   166.38                    (26.2%)   -5.2% ( -45% - 62%)     0.522
AndHighNotMonth  60.36          (17.1%)   62.29                     (9.0%)    3.2% ( -19% - 35%)      0.459
```

`PKLookup` seems to vary a lot as well when there are no changes.
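As a rough aid for reading the rows above, here is a minimal sketch that parses one result row, recomputes the percentage difference from the two QPS values, and checks the p-value against a threshold. The helper names (`parse_row`, `pct_diff`, `is_significant`) and the `alpha=0.05` cutoff are my own assumptions for illustration; none of this is part of luceneutil.

```python
import re

# Hypothetical helper, not part of luceneutil: matches one result row such as
#   PKLookup 176.62 (28.2%) 177.03 (33.2%) 0.2% ( -47% - 85%) 0.981
ROW = re.compile(
    r"^\s*(?P<task>\S+)"
    r"\s+(?P<base_qps>[\d.]+)\s+\((?P<base_sd>[\d.]+)%\)"
    r"\s+(?P<cand_qps>[\d.]+)\s+\((?P<cand_sd>[\d.]+)%\)"
    r"\s+(?P<pct>-?[\d.]+)%\s+\(\s*(?P<lo>-?\d+)%\s+-\s+(?P<hi>-?\d+)%\)"
    r"\s+(?P<p>[\d.]+)\s*$"
)


def parse_row(line):
    """Parse one result row into a dict of floats plus the task name."""
    m = ROW.match(line)
    if m is None:
        raise ValueError("not a result row: %r" % line)
    fields = m.groupdict()
    task = fields.pop("task")
    row = {name: float(value) for name, value in fields.items()}
    row["task"] = task
    return row


def pct_diff(row):
    """Recompute the percentage QPS change of the candidate vs. the baseline."""
    return 100.0 * (row["cand_qps"] - row["base_qps"]) / row["base_qps"]


def is_significant(row, alpha=0.05):
    """True if the reported p-value is below alpha (alpha is an assumed cutoff)."""
    return row["p"] < alpha


if __name__ == "__main__":
    # Rows copied from the baseline-vs-baseline runs above.
    for line in [
        "AndHighNotMonth 62.41 (9.4%) 62.45 (7.7%) 0.1% ( -15% - 18%) 0.979",
        "PKLookup 175.53 (25.3%) 166.38 (26.2%) -5.2% ( -45% - 62%) 0.522",
    ]:
        row = parse_row(line)
        print(row["task"], round(pct_diff(row), 1), is_significant(row))
```

With a cutoff like this, neither baseline-vs-baseline row would count as a real difference, which matches the reading that the swings there are run-to-run variance.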
#### With changes (comparing modified with baseline), also varying the task query:

```
Task: AndHighNotMonth: +its -monthPostings:apr # freq=1160703

Task             QPS baseline   StdDev    QPS my_modified_version   StdDev    Pct diff                p-value
PKLookup         182.12         (28.8%)   84.26                     (8.4%)    -53.7% ( -70% - -23%)   0.000
AndHighNotMonth  64.22          (8.4%)    160.51                    (59.1%)   149.9% ( 75% - 237%)    0.000
```

```
Task: AndHighNotMonth: +its -monthPostings:jan # freq=1160703

Task             QPS baseline   StdDev    QPS my_modified_version   StdDev    Pct diff                p-value
PKLookup         81.40          (17.4%)   91.44                     (45.6%)   12.3% ( -43% - 91%)     0.258
AndHighNotMonth  116.74         (9.2%)    160.54                    (45.8%)   37.5% ( -15% - 101%)    0.000
```

```
Task: AndHighNotMonth: +its -monthPostings:may # freq=1160703

Task             QPS baseline   StdDev    QPS my_modified_version   StdDev    Pct diff                p-value
PKLookup         80.18          (6.3%)    74.90                     (9.4%)    -6.6% ( -20% - 9%)      0.009
AndHighNotMonth  92.19          (12.6%)   144.56                    (23.6%)   56.8% ( 18% - 106%)     0.000
```

```
No task, and only PKLookup is run

Task             QPS baseline   StdDev    QPS my_modified_version   StdDev    Pct diff                p-value
PKLookup         128.55         (27.2%)   142.59                    (36.9%)   10.9% ( -41% - 103%)    0.286
```

In addition, I noticed that adding the `-Xbatch` JVM argument actually makes the -50% slowdown go away (and also boosts `PKLookup`'s QPS):

`localconstants.py`

```
if 'JAVA_EXE' not in globals():
  JAVA_EXE = 'java'
if 'JAVAC_EXE' not in globals():
  JAVAC_EXE = 'javac'
if 'JAVA_COMMAND' not in globals():
  JAVA_COMMAND = '%s -Xbatch' % JAVA_EXE
```

```
Task: AndHighNotMonth: +its -monthPostings:apr # freq=1160703

Task             QPS baseline   StdDev    QPS my_modified_version   StdDev    Pct diff                p-value
PKLookup         328.59         (10.2%)   347.16                    (7.2%)    5.7% ( -10% - 25%)      0.043
AndHighNotMonth  60.21          (5.4%)    160.46                    (41.8%)   166.5% ( 113% - 225%)   0.000
```

I suspect it's indeed JVM compilation that's causing the difference?
Below is the full JVM command line, as produced by the modified `localconstants.py` above and printed by the benchmark, in case it is useful:

```
java -Xbatch -XX:StartFlightRecording=dumponexit=true,maxsize=250M,settings=/Users/xichen/IdeaProjects/benchmarks/util/src/python/profiling.jfc,filename=/Users/xichen/IdeaProjects/benchmarks/logs/bench-search-baseline_vs_patch-my_modified_version-19.jfr -XX:+UnlockDiagnosticVMOptions -XX:+DebugNonSafepoints -classpath /Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/core/build/libs/lucene-core-10.0.0-SNAPSHOT.jar:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/sandbox/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/misc/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/facet/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/analysis/common/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/analysis/icu/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/queryparser/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/grouping/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/suggest/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/highlighter/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/codecs/build/classes/java/main:/Users/xichen/IdeaProjects/benchmarks/lucene_candidate/lucene/queries/build/classes/java/main:/Users/xichen/.gradle/caches/modules-2/files-2.1/com.carrotsearch/hppc/0.9.1/4bf4c51e06aec600894d841c4c004566b20dd357/hppc-0.9.1.jar:/Users/xichen/IdeaProjects/benchmarks/util/lib/HdrHistogram.jar:/Users/xichen/IdeaProjects/benchmarks/util/build perf.SearchPerfTest -dirImpl MMapDirectory -indexPath /Users/xichen/IdeaProjects/benchmarks/indices/wikimedium10m.lucene_baseline.facets.taxonomy:Date.taxonomy:Month.taxonomy:DayOfYear.sortedset:Date.sortedset:Month.sortedset:DayOfYear.Lucene90.Lucene90.dvfields.sort=month:custom.nd10M -facets taxonomy:Date;Date -facets taxonomy:Month;Month -facets taxonomy:DayOfYear;DayOfYear -facets sortedset:Date;Date -facets sortedset:Month;Month -facets sortedset:DayOfYear;DayOfYear -analyzer StandardAnalyzer -taskSource /Users/xichen/IdeaProjects/benchmarks/util/tasks/wikimedium.10M.nostopwords.tasks -searchThreadCount 2 -taskRepeatCount 20 -field body -tasksPerCat 1 -staticSeed -2249101 -seed -4093553 -similarity BM25Similarity -commit multi -hiliteImpl FastVectorHighlighter -log /Users/xichen/IdeaProjects/benchmarks/logs/baseline_vs_patch.my_modified_version.19 -topN 100 -pk
```

In terms of code, `PKLookup` will execute this [section of modified code](https://github.com/apache/lucene/pull/12194/files#diff-900619bac18cb1e2e177533efe157e9b4707d0c855180f535051f0d955828306R530-R543) when it's doing [doc enumeration](https://github.com/mikemccand/luceneutil/blob/2c8ccdf53e93622761a545c1a54377514c338caa/src/main/perf/PKLookupTask.java#L111), but reverting the changes there didn't solve the issue.