[GitHub] [lucene] gf2121 opened a new issue, #11891: DocValuesProducer#getMergeInstance can not speed up Browse*Facets

GitBox Fri, 28 Oct 2022 12:46:27 -0700


gf2121 opened a new issue, #11891:
URL: https://github.com/apache/lucene/issues/11891


   ### Description
   
   In https://github.com/apache/lucene/issues/11202 we introduced a
   MergeInstance of DocValuesProducer to speed up merges through bulk decoding.
   If I understood correctly, the MergeInstance should run faster than
   `DirectPackedReader` when value reading is dense. So I tried to use the
   MergeInstance to speed up `FastTaxonomyFacetCounts#countAll` like this:
   ```
   LeafReader leafReader = context.reader();
   SortedNumericDocValues multiValued;
   
   if (leafReader instanceof CodecReader) {
     // Experiment: read the ordinals through the merge instance (bulk decoding).
     FieldInfo fieldInfo = leafReader.getFieldInfos().fieldInfo(indexFieldName);
     multiValued =
         ((CodecReader) leafReader).getDocValuesReader().getMergeInstance().getSortedNumeric(fieldInfo);
   } else {
     // Fallback: the regular doc values reader.
     multiValued = leafReader.getSortedNumericDocValues(indexFieldName);
   }
   
   ...
   ```
   
   The results are a bit disappointing:
   ```
                          Task  QPS baseline    StdDev  QPS my_modified_version    StdDev                Pct diff p-value
         BrowseMonthTaxoFacets         11.66   (10.3%)                     4.82    (6.9%)  -58.7% ( -68% -  -46%)   0.000
   BrowseRandomLabelTaxoFacets          6.96   (15.6%)                     4.18   (11.8%)  -39.9% ( -58% -  -14%)   0.000
          BrowseDateTaxoFacets          9.96   (10.1%)                     6.13   (12.2%)  -38.5% ( -55% -  -17%)   0.000
     BrowseDayOfYearTaxoFacets         10.17    (8.5%)                     6.46    (9.3%)  -36.5% ( -50% -  -20%)   0.000
   ```
   
   I wonder whether the MergeInstance can run slower for some bpv (bits per
   value) even when value reading is dense. Maybe we can optimize for that?
   Forgive me if I misunderstood something :)
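
To make the question concrete, here is a toy sketch of the two access patterns being compared: decoding one bit-packed value per call versus decoding a whole block into a buffer and then consuming it. This is not Lucene's `DirectPackedReader` or its merge-instance decoder. The class `PackedSketch`, its methods, and the block size of 128 are all hypothetical names and values chosen only for illustration. It only shows that both paths return the same values, so any speed difference would come from decode overhead at a given bpv.

```java
/** Toy model only: NOT Lucene's DirectReader or merge-instance code. */
public class PackedSketch {
  /** Hypothetical block size for the bulk path (an assumption, not Lucene's). */
  public static final int BLOCK_SIZE = 128;

  /** Packs values using bitsPerValue bits each (1..63); each value must fit in that width. */
  public static long[] pack(long[] values, int bitsPerValue) {
    long totalBits = (long) values.length * bitsPerValue;
    long[] packed = new long[(int) ((totalBits + 63) / 64)];
    for (int i = 0; i < values.length; i++) {
      long bitPos = (long) i * bitsPerValue;
      int word = (int) (bitPos >>> 6);
      int shift = (int) (bitPos & 63);
      packed[word] |= values[i] << shift;
      if (shift + bitsPerValue > 64) {
        // The value straddles two words: store its high bits in the next word.
        packed[word + 1] |= values[i] >>> (64 - shift);
      }
    }
    return packed;
  }

  /** Per-value path: recomputes the bit position on every call. */
  public static long readOne(long[] packed, int bitsPerValue, int index) {
    long bitPos = (long) index * bitsPerValue;
    int word = (int) (bitPos >>> 6);
    int shift = (int) (bitPos & 63);
    long value = packed[word] >>> shift;
    if (shift + bitsPerValue > 64) {
      value |= packed[word + 1] << (64 - shift);
    }
    return value & ((1L << bitsPerValue) - 1);
  }

  /** Bulk path: decodes count consecutive values starting at start into out. */
  public static void decodeBlock(long[] packed, int bitsPerValue, int start, int count, long[] out) {
    long mask = (1L << bitsPerValue) - 1;
    long bitPos = (long) start * bitsPerValue;
    for (int i = 0; i < count; i++, bitPos += bitsPerValue) {
      int word = (int) (bitPos >>> 6);
      int shift = (int) (bitPos & 63);
      long value = packed[word] >>> shift;
      if (shift + bitsPerValue > 64) {
        value |= packed[word + 1] << (64 - shift);
      }
      out[i] = value & mask;
    }
  }

  /** Dense scan using the per-value path. */
  public static long sumPerValue(long[] packed, int bitsPerValue, int valueCount) {
    long sum = 0;
    for (int i = 0; i < valueCount; i++) {
      sum += readOne(packed, bitsPerValue, i);
    }
    return sum;
  }

  /** Dense scan using the bulk path: fill a buffer, then consume it. */
  public static long sumBulk(long[] packed, int bitsPerValue, int valueCount) {
    long[] buffer = new long[BLOCK_SIZE];
    long sum = 0;
    for (int start = 0; start < valueCount; start += BLOCK_SIZE) {
      int count = Math.min(BLOCK_SIZE, valueCount - start);
      decodeBlock(packed, bitsPerValue, start, count, buffer);
      for (int i = 0; i < count; i++) {
        sum += buffer[i];
      }
    }
    return sum;
  }

  public static void main(String[] args) {
    int bitsPerValue = 7; // arbitrary bpv for the demo
    long[] values = new long[1000];
    for (int i = 0; i < values.length; i++) {
      values[i] = (i * 31L) & ((1L << bitsPerValue) - 1);
    }
    long[] packed = pack(values, bitsPerValue);
    System.out.println("per-value sum = " + sumPerValue(packed, bitsPerValue, values.length));
    System.out.println("bulk sum      = " + sumBulk(packed, bitsPerValue, values.length));
  }
}
```

Timing the two `sum*` methods across different `bitsPerValue` settings (ideally with a proper harness such as JMH) would be one way to check whether the bulk path loses for particular bpv values, though a toy like this cannot stand in for measuring the real codec.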
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


