HoustonPutman opened a new issue, #13918:
URL: https://github.com/apache/lucene/issues/13918

   ### Description
   
   Currently in `ScalarQuantizer`, `ScalarQuantizer.fromVectorsAutoInterval()` 
will issue 4 calls to `Selector.select()` per scratch batch (basically 
`len(vector)/20` batches), and `ScalarQuantizer.fromVectors()` will issue 2 
calls. Each group of 4 or 2 calls uses the same vectors and only asks for 
different `k` values. If we use a multi-select algorithm instead of separate 
`select` calls, we can speed these calls up, especially in 
`ScalarQuantizer.fromVectorsAutoInterval()`, which repeats a lot of logic.
   
   The size of the list to select from is practically `20*vector_dimensions`, 
so greater speedups will be observed with larger dimensionality (or if 
`ScalarQuantizer.SCRATCH_SIZE` is ever increased).
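
To illustrate the idea, here is a minimal sketch of a multi-select built on a three-way quickselect partition. It is not Lucene's `Selector` API: the class `MultiSelectSketch` and the method `multiSelect` are hypothetical names, and the sketch assumes a plain `float[]` without `NaN` values and ranks passed in ascending order.

```java
/** Hypothetical sketch of multi-select; not part of Lucene. */
final class MultiSelectSketch {

  /**
   * Rearranges {@code a} in place so that, for every k in {@code sortedKs},
   * a[k] holds the value it would hold if {@code a} were fully sorted.
   * Assumes sortedKs is ascending, each k is in [0, a.length), and a has no NaN.
   */
  static void multiSelect(float[] a, int[] sortedKs) {
    multiSelect(a, 0, a.length, sortedKs, 0, sortedKs.length);
  }

  // Works on a[from, to) and on the ranks ks[kFrom, kTo), which all lie in [from, to).
  private static void multiSelect(float[] a, int from, int to, int[] ks, int kFrom, int kTo) {
    if (kFrom >= kTo || to - from <= 1) {
      return; // no rank requested in this range, or nothing left to order
    }
    // Three-way partition around the middle element.
    float pivot = a[from + ((to - from) >>> 1)];
    int lt = from, i = from, gt = to;
    while (i < gt) {
      if (a[i] < pivot) {
        swap(a, lt++, i++);
      } else if (a[i] > pivot) {
        swap(a, i, --gt);
      } else {
        i++;
      }
    }
    // Now a[from, lt) < pivot, a[lt, gt) == pivot, a[gt, to) > pivot.
    // Ranks inside [lt, gt) are already in their final position.
    int leftEnd = kFrom;
    while (leftEnd < kTo && ks[leftEnd] < lt) {
      leftEnd++;
    }
    int rightStart = leftEnd;
    while (rightStart < kTo && ks[rightStart] < gt) {
      rightStart++;
    }
    // Recurse only into the sides that still contain requested ranks.
    multiSelect(a, from, lt, ks, kFrom, leftEnd);
    multiSelect(a, gt, to, ks, rightStart, kTo);
  }

  private static void swap(float[] a, int i, int j) {
    float tmp = a[i];
    a[i] = a[j];
    a[j] = tmp;
  }
}
```

Each partition pass is shared by all requested ranks, whereas separate `select` calls would each partition the same data again from scratch. The sketch uses a simple middle-element pivot, so its worst case is still quadratic; a production version would likely need a more robust pivot strategy.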


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

