[ https://issues.apache.org/jira/browse/LUCENE-9378?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17114706#comment-17114706 ]
Michael Sokolov commented on LUCENE-9378:
-----------------------------------------

This is the output of the "normal" wikimediumall tests plus one additional task I'm proposing to add for the BDV sorting case:

|| Task|| QPS before|| StdDev|| QPS after|| StdDev|| Pct diff||
| HighTermTitleBDVSort| 5.65| (2.9%)| 1.03| (0.7%)| -81.7% ( -82% - -80%)|
| Respell| 95.20| (1.9%)| 75.76| (1.2%)| -20.4% ( -23% - -17%)|
| Fuzzy2| 83.74| (6.1%)| 79.80| (3.8%)| -4.7% ( -13% - 5%)|
| HighIntervalsOrdered| 3.45| (1.8%)| 3.31| (2.5%)| -4.0% ( -8% - 0%)|
| Wildcard| 95.69| (11.9%)| 91.86| (11.6%)| -4.0% ( -24% - 22%)|
| OrHighNotMed| 735.62| (3.3%)| 707.58| (3.3%)| -3.8% ( -10% - 2%)|
| LowTerm| 1409.80| (5.4%)| 1356.41| (3.7%)| -3.8% ( -12% - 5%)|
| OrHighNotLow| 602.60| (3.8%)| 582.17| (3.6%)| -3.4% ( -10% - 4%)|
| MedTerm| 1067.74| (3.4%)| 1034.85| (3.0%)| -3.1% ( -9% - 3%)|
| BrowseDayOfYearSSDVFacets| 3.23| (5.0%)| 3.14| (10.0%)| -2.6% ( -16% - 13%)|
| BrowseDayOfYearTaxoFacets| 5096.02| (2.9%)| 4962.76| (3.0%)| -2.6% ( -8% - 3%)|
| BrowseMonthTaxoFacets| 5131.16| (2.9%)| 5007.29| (3.2%)| -2.4% ( -8% - 3%)|
| MedPhrase| 453.30| (3.1%)| 446.26| (2.3%)| -1.6% ( -6% - 3%)|
| BrowseMonthSSDVFacets| 3.91| (11.4%)| 3.86| (13.0%)| -1.0% ( -22% - 26%)|
| HighPhrase| 37.99| (3.1%)| 37.59| (3.4%)| -1.0% ( -7% - 5%)|
| OrNotHighLow| 552.20| (3.7%)| 546.81| (3.1%)| -1.0% ( -7% - 6%)|
| OrHighNotHigh| 679.14| (2.8%)| 673.74| (3.2%)| -0.8% ( -6% - 5%)|
| MedSpanNear| 34.13| (2.2%)| 33.91| (3.4%)| -0.6% ( -6% - 5%)|
| OrHighLow| 219.19| (2.9%)| 217.83| (2.9%)| -0.6% ( -6% - 5%)|
| OrHighMed| 25.65| (2.9%)| 25.51| (2.4%)| -0.6% ( -5% - 4%)|
| LowSloppyPhrase| 62.29| (5.0%)| 61.96| (5.9%)| -0.5% ( -10% - 10%)|
| AndHighLow| 387.57| (2.6%)| 385.73| (3.6%)| -0.5% ( -6% - 5%)|
| Prefix3| 60.91| (4.1%)| 60.68| (3.4%)| -0.4% ( -7% - 7%)|
| HighTerm| 806.80| (5.8%)| 804.20| (4.8%)| -0.3% ( -10% - 10%)|
| LowPhrase| 59.60| (3.4%)| 59.46| (3.2%)| -0.2% ( -6% - 6%)|
| LowSpanNear| 36.39| (2.0%)| 36.38| (2.6%)| -0.0% ( -4% - 4%)|
| HighSloppyPhrase| 4.79| (8.8%)| 4.79| (9.2%)| 0.0% ( -16% - 19%)|
| OrHighHigh| 8.88| (3.2%)| 8.89| (2.5%)| 0.1% ( -5% - 5%)|
| MedSloppyPhrase| 19.58| (4.8%)| 19.61| (4.2%)| 0.1% ( -8% - 9%)|
| Fuzzy1| 110.96| (6.5%)| 111.14| (4.2%)| 0.2% ( -9% - 11%)|
| AndHighMed| 20.09| (4.5%)| 20.15| (4.8%)| 0.3% ( -8% - 10%)|
| AndHighHigh| 13.78| (4.5%)| 13.82| (6.2%)| 0.3% ( -9% - 11%)|
| HighSpanNear| 1.53| (2.6%)| 1.53| (3.5%)| 0.3% ( -5% - 6%)|
| HighTermDayOfYearSort| 16.38| (7.4%)| 16.47| (6.2%)| 0.5% ( -12% - 15%)|
| HighTermMonthSort| 14.09| (5.5%)| 14.19| (5.1%)| 0.8% ( -9% - 12%)|
| IntNRQ| 37.12| (1.9%)| 37.47| (0.8%)| 0.9% ( -1% - 3%)|
| OrNotHighMed| 615.21| (2.6%)| 624.34| (2.5%)| 1.5% ( -3% - 6%)|
| OrNotHighHigh| 475.15| (3.3%)| 486.16| (2.7%)| 2.3% ( -3% - 8%)|
| BrowseDateTaxoFacets| 0.61| (5.2%)| 0.96| (4.5%)| 57.7% ( 45% - 71%)|

> Configurable compression for BinaryDocValues
> --------------------------------------------
>
> Key: LUCENE-9378
> URL: https://issues.apache.org/jira/browse/LUCENE-9378
> Project: Lucene - Core
> Issue Type: Improvement
> Reporter: Viral Gandhi
> Priority: Minor
>
> Lucene 8.5.1 includes a change to always [compress
> BinaryDocValues|https://issues.apache.org/jira/browse/LUCENE-9211]. This
> caused (~30%) reduction in our red-line QPS (throughput).
> We think users should be given some way to opt-in for this compression
> feature instead of always being enabled which can have a substantial query
> time cost as we saw during our upgrade. [~mikemccand] suggested one possible
> approach by introducing a *mode* in Lucene80DocValuesFormat (COMPRESSED and
> UNCOMPRESSED) and allowing users to create a custom Codec subclassing the
> default Codec and pick the format they want.
> Idea is similar to Lucene50StoredFieldsFormat which has two modes,
> Mode.BEST_SPEED and Mode.BEST_COMPRESSION.
> Here's related issues for adding benchmark covering BINARY doc values
> query-time performance - [https://github.com/mikemccand/luceneutil/issues/61]

--
This message was sent by Atlassian Jira (v8.3.4#803005)
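
For readers who want to sanity-check the "Pct diff" column in the table above, here is a minimal Python sketch of the plain relative-change arithmetic, 100 * (after - before) / before. The function name `pct_diff` and the `ROWS` list are illustrative assumptions, not part of the benchmark tooling, and this is not claimed to be how the benchmark itself derives its numbers or its best/worst range. Because the QPS values printed in the table are rounded to two decimals, recomputed figures can differ from the table in the last digit.

```python
def pct_diff(qps_before: float, qps_after: float) -> float:
    """Relative change in QPS, in percent (negative means the 'after' run is slower)."""
    if qps_before <= 0:
        raise ValueError("qps_before must be positive")
    return 100.0 * (qps_after - qps_before) / qps_before


# A few (task, QPS before, QPS after) rows copied from the table above.
ROWS = [
    ("HighTermTitleBDVSort", 5.65, 1.03),
    ("Respell", 95.20, 75.76),
    ("BrowseDateTaxoFacets", 0.61, 0.96),
]

if __name__ == "__main__":
    for task, before, after in ROWS:
        # Print the recomputed change with an explicit sign, one decimal place.
        print(f"{task:>25}  {pct_diff(before, after):+.1f}%")
```

Rows whose best/worst range straddles zero (most of the table) are within run-to-run noise; the rows copied into `ROWS` are the ones where the whole range sits on one side of zero by a wide margin.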