gf2121 opened a new pull request #574: URL: https://github.com/apache/lucene/pull/574
CPU profiles often show SingletonSortedNumericDocValues#nextDoc() using a high percentage of CPU when running luceneutil, even though nextDoc() in the dense case should be quite simple. I suspect the extra layers of abstraction are what strain the JVM. Unwrapping it to NumericDocValues gives a speedup of roughly 30% to 41% on the Browse*TaxoFacets tasks; the other tasks stay within noise.

```
Task                         QPS baseline   StdDev  QPS my_modified_version   StdDev  Pct diff                   p-value
HighTermTitleBDVSort               132.24  (20.6%)                   125.67   (9.9%)     -5.0%  ( -29% -   32%)  0.330
LowTerm                           1424.13   (3.2%)                  1381.34   (4.4%)     -3.0%  ( -10% -    4%)  0.014
OrHighNotHigh                      707.82   (3.3%)                   687.49   (6.0%)     -2.9%  ( -11% -    6%)  0.062
TermDTSort                         155.32  (10.9%)                   151.02  (10.2%)     -2.8%  ( -21% -   20%)  0.406
OrNotHighMed                       618.46   (3.7%)                   602.65   (4.4%)     -2.6%  ( -10% -    5%)  0.047
Fuzzy1                              76.22   (5.3%)                    74.71   (6.6%)     -2.0%  ( -13% -   10%)  0.293
HighTermMonthSort                  174.89  (10.4%)                   171.45  (10.6%)     -2.0%  ( -20% -   21%)  0.554
OrHighNotMed                       776.08   (4.9%)                   761.70   (7.8%)     -1.9%  ( -13% -   11%)  0.367
HighTermDayOfYearSort               56.23  (10.7%)                    55.26  (10.9%)     -1.7%  ( -21% -   22%)  0.615
MedTerm                           1449.48   (3.7%)                  1425.87   (5.1%)     -1.6%  ( -10% -    7%)  0.250
OrNotHighHigh                      687.92   (4.9%)                   677.06   (5.5%)     -1.6%  ( -11% -    9%)  0.339
OrHighNotLow                       742.99   (4.7%)                   732.23   (5.9%)     -1.4%  ( -11% -    9%)  0.390
OrNotHighLow                       789.37   (2.7%)                   778.80   (4.7%)     -1.3%  (  -8% -    6%)  0.270
HighPhrase                          75.84   (2.2%)                    75.14   (3.0%)     -0.9%  (  -6% -    4%)  0.269
HighSloppyPhrase                    20.71   (5.9%)                    20.56   (5.2%)     -0.7%  ( -11% -   11%)  0.678
IntNRQ                             106.38  (18.4%)                   105.67  (18.2%)     -0.7%  ( -31% -   44%)  0.908
OrHighMed                           45.10   (1.5%)                    44.83   (1.8%)     -0.6%  (  -3% -    2%)  0.261
MedSpanNear                        192.49   (2.5%)                   191.51   (3.5%)     -0.5%  (  -6% -    5%)  0.593
OrHighLow                          489.82   (5.5%)                   487.79   (5.7%)     -0.4%  ( -11% -   11%)  0.815
MedSloppyPhrase                     27.33   (2.9%)                    27.22   (2.3%)     -0.4%  (  -5% -    5%)  0.623
MedPhrase                          208.94   (2.9%)                   208.09   (3.7%)     -0.4%  (  -6% -    6%)  0.696
Respell                             71.84   (2.4%)                    71.55   (2.4%)     -0.4%  (  -5% -    4%)  0.600
OrHighHigh                          36.26   (1.3%)                    36.13   (1.1%)     -0.4%  (  -2% -    2%)  0.344
BrowseMonthSSDVFacets               15.95   (2.7%)                    15.90   (2.5%)     -0.4%  (  -5% -    5%)  0.672
AndHighMed                          85.83   (2.2%)                    85.53   (2.7%)     -0.3%  (  -5% -    4%)  0.658
Prefix3                            123.15   (2.6%)                   122.74   (2.5%)     -0.3%  (  -5% -    4%)  0.678
Fuzzy2                              76.41   (4.7%)                    76.23   (4.2%)     -0.2%  (  -8% -    9%)  0.867
BrowseDayOfYearSSDVFacets           14.52   (2.4%)                    14.49   (2.2%)     -0.2%  (  -4% -    4%)  0.747
MedIntervalsOrdered                 56.39   (4.2%)                    56.27   (4.1%)     -0.2%  (  -8% -    8%)  0.871
HighIntervalsOrdered                 9.29   (4.7%)                     9.27   (4.4%)     -0.2%  (  -8% -    9%)  0.896
AndHighMedDayTaxoFacets            119.76   (2.5%)                   119.53   (2.9%)     -0.2%  (  -5% -    5%)  0.831
HighSpanNear                        20.89   (2.0%)                    20.85   (2.3%)     -0.2%  (  -4% -    4%)  0.803
LowIntervalsOrdered                 45.51   (4.9%)                    45.47   (4.8%)     -0.1%  (  -9% -   10%)  0.952
LowPhrase                           64.17   (2.6%)                    64.14   (2.6%)     -0.1%  (  -5% -    5%)  0.951
LowSpanNear                        104.45   (2.2%)                   104.41   (1.9%)     -0.0%  (  -4% -    4%)  0.959
Wildcard                           103.83   (2.8%)                   103.80   (2.8%)     -0.0%  (  -5% -    5%)  0.970
AndHighHigh                         42.33   (2.6%)                    42.33   (2.4%)     -0.0%  (  -4% -    5%)  0.991
BrowseRandomLabelSSDVFacets         10.62   (2.5%)                    10.62   (1.8%)      0.0%  (  -4% -    4%)  0.981
AndHighHighDayTaxoFacets            29.75   (2.3%)                    29.76   (2.7%)      0.1%  (  -4% -    5%)  0.949
MedTermDayTaxoFacets                26.56   (3.0%)                    26.58   (2.5%)      0.1%  (  -5% -    5%)  0.945
AndHighLow                        1012.26   (4.5%)                  1013.62   (4.3%)      0.1%  (  -8% -    9%)  0.923
LowSloppyPhrase                     78.82   (6.8%)                    79.03   (6.0%)      0.3%  ( -11% -   14%)  0.897
PKLookup                           204.09   (3.0%)                   204.82   (2.9%)      0.4%  (  -5% -    6%)  0.703
OrHighMedDayTaxoFacets              14.53   (3.4%)                    14.59   (2.7%)      0.4%  (  -5% -    6%)  0.694
HighTerm                          1607.26   (5.2%)                  1623.99   (5.6%)      1.0%  (  -9% -   12%)  0.543
BrowseRandomLabelTaxoFacets         11.93   (6.9%)                    15.52   (2.5%)     30.1%  (  19% -   42%)  0.000
BrowseDateTaxoFacets                13.46   (9.0%)                    18.28   (3.6%)     35.8%  (  21% -   53%)  0.000
BrowseDayOfYearTaxoFacets           13.59   (9.1%)                    18.53   (3.6%)     36.3%  (  21% -   53%)  0.000
BrowseMonthTaxoFacets               13.93  (10.9%)                    19.70  (14.9%)     41.4%  (  14% -   75%)  0.000
```

**Baseline**

```
PERCENT  CPU SAMPLES  STACK
3.85%    12316        org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countOneSegment()
3.78%    12076        org.apache.lucene.util.packed.DirectReader$DirectPackedReader20#get()
3.72%    11905        org.apache.lucene.index.SingletonSortedNumericDocValues#nextDoc()
2.88%    9199         org.apache.lucene.queries.intervals.OrderedIntervalsSource$OrderedIntervalIterator#nextInterval()
2.31%    7380         org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#countAll()
2.27%    7270         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$20#ordValue()
2.25%    7211         org.apache.lucene.facet.taxonomy.IntTaxonomyFacets#increment()
2.23%    7139         org.apache.lucene.index.SingletonSortedNumericDocValues#nextValue()
1.88%    6006         java.nio.Buffer#checkIndex()
1.86%    5965         jdk.internal.misc.Unsafe#convEndian()
1.85%    5916         org.apache.lucene.util.packed.DirectReader$DirectPackedReader4#get()
1.72%    5491         org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$EverythingEnum#nextPosition()
1.49%    4780         java.nio.DirectByteBuffer#ix()
1.42%    4548         java.nio.Buffer#scope()
1.40%    4465         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$4#longValue()
1.39%    4434         org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$EverythingEnum#advance()
1.33%    4254         org.apache.lucene.store.ByteBufferGuard#ensureValid()
1.32%    4219         org.apache.lucene.util.packed.DirectReader$DirectPackedReader12#get()
1.28%    4109         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$20#nextDoc()
1.28%    4089         jdk.internal.misc.ScopedMemoryAccess#getByteInternal()
1.16%    3709         org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$BlockImpactsPostingsEnum#advance()
1.10%    3517         org.apache.lucene.store.ByteBufferGuard#getInt()
1.07%    3427         org.apache.lucene.codecs.lucene90.Lucene90NormsProducer$3#longValue()
0.98%    3149         org.apache.lucene.search.ConjunctionDISI#doNext()
0.98%    3120         org.apache.lucene.codecs.lucene90.Lucene90PostingsReader#findFirstGreater()
0.93%    2969         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$3#longValue()
0.92%    2927         org.apache.lucene.store.ByteBufferGuard#getByte()
0.88%    2828         com.carrotsearch.hppc.IntIntHashMap#indexOf()
0.82%    2635         org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$BlockImpactsDocsEnum#advance()
0.82%    2633         org.apache.lucene.search.similarities.BM25Similarity$BM25Scorer#score()
```

**Candidate**

```
PERCENT  CPU SAMPLES  STACK
4.15%    12823        org.apache.lucene.util.packed.DirectReader$DirectPackedReader20#get()
3.94%    12186        org.apache.lucene.facet.sortedset.SortedSetDocValuesFacetCounts#countOneSegment()
3.32%    10266        org.apache.lucene.facet.taxonomy.FastTaxonomyFacetCounts#countAll()
2.98%    9208         org.apache.lucene.queries.intervals.OrderedIntervalsSource$OrderedIntervalIterator#nextInterval()
2.38%    7351         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$20#ordValue()
2.07%    6386         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$DenseNumericDocValues#nextDoc()
1.85%    5723         org.apache.lucene.facet.taxonomy.IntTaxonomyFacets#increment()
1.81%    5600         jdk.internal.misc.Unsafe#convEndian()
1.81%    5588         org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$EverythingEnum#nextPosition()
1.75%    5409         java.nio.Buffer#checkIndex()
1.72%    5310         org.apache.lucene.util.packed.DirectReader$DirectPackedReader4#get()
1.50%    4631         java.nio.Buffer#scope()
1.44%    4437         jdk.internal.misc.ScopedMemoryAccess#getByteInternal()
1.43%    4408         org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$EverythingEnum#advance()
1.39%    4297         java.nio.DirectByteBuffer#ix()
1.39%    4280         org.apache.lucene.util.packed.DirectReader$DirectPackedReader12#get()
1.33%    4111         org.apache.lucene.store.ByteBufferGuard#ensureValid()
1.31%    4052         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$20#nextDoc()
1.29%    3974         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$4#longValue()
1.22%    3761         org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$BlockImpactsPostingsEnum#advance()
1.13%    3502         org.apache.lucene.codecs.lucene90.Lucene90NormsProducer$3#longValue()
1.04%    3219         org.apache.lucene.search.ConjunctionDISI#doNext()
1.00%    3099         org.apache.lucene.codecs.lucene90.Lucene90PostingsReader#findFirstGreater()
0.99%    3067         org.apache.lucene.codecs.lucene90.Lucene90DocValuesProducer$3#longValue()
0.99%    3052         org.apache.lucene.store.ByteBufferGuard#getInt()
0.89%    2762         org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$BlockImpactsDocsEnum#advance()
0.87%    2690         org.apache.lucene.search.similarities.BM25Similarity$BM25Scorer#score()
0.86%    2663         org.apache.lucene.store.ByteBufferGuard#getByte()
0.80%    2476         org.apache.lucene.codecs.lucene90.ForUtil#expand8()
0.78%    2420         org.apache.lucene.codecs.lucene90.Lucene90PostingsReader$EverythingEnum#skipPositions()
```
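
For readers unfamiliar with the wrapper in question, here is a minimal, self-contained sketch of the unwrapping idea. It does not use Lucene: `NumericValues`, `SortedNumericValues`, `DenseNumericValues`, `SingletonSortedNumericValues` and `countAll` are hypothetical, simplified stand-ins that only mirror the shape of the real classes, and the sketch is not the code changed in this pull request. Lucene itself offers `DocValues.unwrapSingleton(SortedNumericDocValues)`, which returns the underlying `NumericDocValues` or `null`; whether the patch uses that helper is not stated in the description above.

```java
import java.util.Arrays;

// Hypothetical sketch: simplified stand-ins, NOT Lucene's actual classes.
final class UnwrapSketch {
    static final int NO_MORE_DOCS = Integer.MAX_VALUE;

    /** Stand-in for a single-valued per-document iterator. */
    interface NumericValues {
        int nextDoc();
        long longValue();
    }

    /** Stand-in for a multi-valued per-document iterator. */
    interface SortedNumericValues {
        int nextDoc();
        int docValueCount();
        long nextValue();
    }

    /** Dense case: every document has exactly one value, so nextDoc() is a simple increment. */
    static final class DenseNumericValues implements NumericValues {
        private final long[] values;
        private int doc = -1;

        DenseNumericValues(long[] values) { this.values = values; }

        public int nextDoc() {
            if (doc == NO_MORE_DOCS || doc + 1 >= values.length) {
                doc = NO_MORE_DOCS;
            } else {
                doc++;
            }
            return doc;
        }

        public long longValue() { return values[doc]; }
    }

    /** Exposes a single-valued iterator through the multi-valued API by delegating every call. */
    static final class SingletonSortedNumericValues implements SortedNumericValues {
        private final NumericValues in;

        SingletonSortedNumericValues(NumericValues in) { this.in = in; }

        NumericValues unwrap() { return in; }

        public int nextDoc() { return in.nextDoc(); }
        public int docValueCount() { return 1; }
        public long nextValue() { return in.longValue(); }
    }

    /** Returns the wrapped single-valued iterator, or null if dv is genuinely multi-valued. */
    static NumericValues unwrapSingleton(SortedNumericValues dv) {
        if (dv instanceof SingletonSortedNumericValues) {
            return ((SingletonSortedNumericValues) dv).unwrap();
        }
        return null;
    }

    /** Counts ordinal occurrences, taking the direct single-valued path when one is available. */
    static int[] countAll(SortedNumericValues dv, int numOrds) {
        int[] counts = new int[numOrds];
        NumericValues single = unwrapSingleton(dv);
        if (single != null) {
            // Fast path: one delegation layer fewer per document, no docValueCount() loop.
            for (int doc = single.nextDoc(); doc != NO_MORE_DOCS; doc = single.nextDoc()) {
                counts[(int) single.longValue()]++;
            }
        } else {
            // General path: iterate every value of every document.
            for (int doc = dv.nextDoc(); doc != NO_MORE_DOCS; doc = dv.nextDoc()) {
                int n = dv.docValueCount();
                for (int i = 0; i < n; i++) {
                    counts[(int) dv.nextValue()]++;
                }
            }
        }
        return counts;
    }

    public static void main(String[] args) {
        long[] ords = {0, 2, 2, 1, 2};
        SortedNumericValues wrapped =
            new SingletonSortedNumericValues(new DenseNumericValues(ords));
        System.out.println(Arrays.toString(countAll(wrapped, 3))); // prints [1, 1, 3]
    }
}
```

Both paths produce the same counts; the only difference is that the fast path calls the inner iterator directly instead of going through the delegating wrapper on every document, which is the layering the profiles above point at.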