ldematte opened a new pull request, #16062:
URL: https://github.com/apache/lucene/pull/16062

   `FlatFieldVectorsWriter<T>` exposes its accumulated vectors via `List<T> 
getVectors()`. 
   This is both too generic and too prescriptive:
   - `List<T>` binds the return type to a specific vector representation (`T = 
float[] / byte[]` heap arrays, one vector per list element); if a 
`FlatFieldVectorsWriter` implementation is backed by a different data structure 
(off-heap / paged / memory-mapped storage), it has to materialize a heap array 
even when the downstream consumer would be happy with a non-materialized view. 
   - It does not carry enough information with it (e.g. encoding, dimensions for 
sub-byte packed types, etc.). In fact, current consumers have to wrap that list 
into a `KnnVectorValues` to feed the scoring / graph-building path (see e.g. 
`Lucene99HnswVectorsWriter.FieldWriter`). 
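To make the first point concrete, here is a minimal sketch of a writer whose storage is not a list of heap arrays. Everything in it is hypothetical and only illustrates the shape of the problem: `VectorView` is a stand-in for a view type (it is not Lucene's `KnnVectorValues`), and `OffHeapWriterSketch` is not a Lucene class. The list-shaped accessor has to copy every vector onto the heap, while the view-shaped accessor copies only the vector that is requested.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.util.ArrayList;
import java.util.List;

/** Hypothetical stand-in for a vector view; NOT Lucene's KnnVectorValues. */
interface VectorView {
  int dimension();

  int size();

  float[] vector(int ord);
}

/** Hypothetical writer that keeps all vectors in one direct (off-heap) buffer. */
final class OffHeapWriterSketch {
  private final int dim;
  private final FloatBuffer storage;
  private int count;

  OffHeapWriterSketch(int dim, int maxVectors) {
    this.dim = dim;
    this.storage =
        ByteBuffer.allocateDirect(dim * maxVectors * Float.BYTES)
            .order(ByteOrder.nativeOrder())
            .asFloatBuffer();
  }

  void add(float[] v) {
    if (v.length != dim) {
      throw new IllegalArgumentException("expected " + dim + " dimensions");
    }
    if ((count + 1) * dim > storage.capacity()) {
      throw new IllegalStateException("writer is full");
    }
    for (int i = 0; i < dim; i++) {
      storage.put(count * dim + i, v[i]); // absolute put, buffer position untouched
    }
    count++;
  }

  /** List-shaped accessor: forces a heap copy of every stored vector. */
  List<float[]> getVectors() {
    List<float[]> out = new ArrayList<>(count);
    for (int ord = 0; ord < count; ord++) {
      out.add(read(ord));
    }
    return out;
  }

  /** View-shaped accessor: copies only the vector that is asked for. */
  VectorView view() {
    return new VectorView() {
      @Override
      public int dimension() {
        return dim;
      }

      @Override
      public int size() {
        return count;
      }

      @Override
      public float[] vector(int ord) {
        return read(ord);
      }
    };
  }

  private float[] read(int ord) {
    if (ord < 0 || ord >= count) {
      throw new IndexOutOfBoundsException("ord " + ord);
    }
    float[] v = new float[dim];
    for (int i = 0; i < dim; i++) {
      v[i] = storage.get(ord * dim + i);
    }
    return v;
  }
}
```

A consumer that only needs a few vectors by ordinal pays for one small copy per access through `view()`, whereas `getVectors()` allocates `size()` arrays up front regardless of how many are read.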
   
   This PR proposes to move this step directly to `FlatFieldVectorsWriter<T>`, 
exposing its vectors _also_ as `KnnVectorValues`.
   
   Why `KnnVectorValues`?
   
   `KnnVectorValues` is the shape current callers (e.g. scorers) actually use. 
This PR updates `Lucene99HnswVectorsWriter` to use the new function when it 
calls `FlatVectorsScorer.getRandomVectorScorerSupplier`, which already takes 
`KnnVectorValues`. 
   No new Lucene type is needed. The alternative would be to introduce a new 
Lucene interface (e.g. `FlatVectorsView`), but that would be very similar to 
`KnnVectorValues`, with only minor variations.
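As a rough illustration of why a view type is the natural hand-off, the sketch below separates the adapter a consumer needs when all it receives is a list from the code that actually uses the vectors. The names are hypothetical: `FloatView` is a simplified stand-in and not Lucene's `KnnVectorValues`, and `ViewConsumerSketch` does not correspond to any Lucene class.

```java
import java.util.List;

/** Hypothetical stand-in for the view type; NOT Lucene's KnnVectorValues. */
interface FloatView {
  int dimension();

  int size();

  float[] vectorValue(int ord);
}

final class ViewConsumerSketch {
  /** The adapter a consumer has to write when it is only handed a List of float[]. */
  static FloatView wrap(List<float[]> vectors, int dim) {
    return new FloatView() {
      @Override
      public int dimension() {
        return dim;
      }

      @Override
      public int size() {
        return vectors.size();
      }

      @Override
      public float[] vectorValue(int ord) {
        return vectors.get(ord);
      }
    };
  }

  /** A consumer written against the view only; it never sees the backing storage. */
  static float dotProduct(FloatView view, int ordA, int ordB) {
    float[] a = view.vectorValue(ordA);
    float[] b = view.vectorValue(ordB);
    float sum = 0f;
    for (int i = 0; i < view.dimension(); i++) {
      sum += a[i] * b[i];
    }
    return sum;
  }
}
```

If the writer hands out the view itself, the `wrap` step disappears from each consumer, and the dimension travels with the vectors instead of being passed alongside the list.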
   
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

