ldematte opened a new pull request, #16062: URL: https://github.com/apache/lucene/pull/16062
`FlatFieldVectorsWriter<T>` exposes its accumulated vectors via `List<T> getVectors()`. This is both too generic and too prescriptive:

- `List<T>` binds the return type to a specific vector representation (`T = float[] / byte[]` heap arrays, one vector per list element). If a `FlatFieldVectorsWriter` implementation is backed by a different data structure (off-heap, paged, or memory-mapped storage), it has to materialize heap arrays even when the downstream consumer would be happy with a non-materialized view.
- It does not carry enough information with it (e.g. encoding, dimensions for sub-byte packed types, etc.).

In fact, current consumers have to wrap that list in a `KnnVectorValues` to feed the scoring / graph-building path (see e.g. `Lucene99HnswVectorsWriter.FieldWriter`). This PR proposes to move this step directly into `FlatFieldVectorsWriter<T>`, exposing its vectors _also_ as `KnnVectorValues`.

Why `KnnVectorValues`?

`KnnVectorValues` is the shape current callers (e.g. scorers) actually use. This PR updates `Lucene99HnswVectorsWriter` to use the new method when it calls `FlatVectorsScorer.getRandomVectorScorerSupplier`, which already takes `KnnVectorValues`. No new Lucene type is needed; an alternative would be to introduce a new Lucene interface (e.g. `FlatVectorsView`), but that would be very similar to `KnnVectorValues` with minor variations.
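
The motivation in this pull request can be illustrated with a minimal, self-contained sketch. It does not use Lucene's classes: `VectorValuesView`, `HeapFlatWriter`, `BufferFlatWriter`, and `FlatWriterViewDemo` are hypothetical names invented for illustration, and `VectorValuesView` only loosely stands in for the role `KnnVectorValues` plays. The point it tries to show is that a consumer written against a view works unchanged whether the writer keeps heap arrays or a direct buffer, whereas a `List<float[]>` accessor would force the buffer-backed writer to materialize every vector up front.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;
import java.nio.FloatBuffer;
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified stand-in for the role of KnnVectorValues; NOT Lucene's class.
interface VectorValuesView {
  int dimension();
  int size();
  float[] vectorValue(int ord);
}

// Hypothetical writer that accumulates vectors as heap arrays (the List<T> shape).
final class HeapFlatWriter {
  private final int dimension;
  private final List<float[]> vectors = new ArrayList<>();

  HeapFlatWriter(int dimension) { this.dimension = dimension; }

  void addValue(float[] v) {
    if (v.length != dimension) throw new IllegalArgumentException("wrong dimension");
    vectors.add(v.clone());
  }

  // List-style accessor: only natural when vectors already live in heap arrays.
  List<float[]> getVectors() { return vectors; }

  // View-style accessor: wraps the list without copying it.
  VectorValuesView getVectorValues() {
    return new VectorValuesView() {
      @Override public int dimension() { return dimension; }
      @Override public int size() { return vectors.size(); }
      @Override public float[] vectorValue(int ord) { return vectors.get(ord); }
    };
  }
}

// Hypothetical writer backed by a direct (off-heap) buffer; it never builds a List<float[]>.
final class BufferFlatWriter {
  private final int dimension;
  private final FloatBuffer buffer;
  private int count;

  BufferFlatWriter(int dimension, int maxVectors) {
    this.dimension = dimension;
    this.buffer = ByteBuffer.allocateDirect(maxVectors * dimension * Float.BYTES)
        .order(ByteOrder.nativeOrder())
        .asFloatBuffer();
  }

  void addValue(float[] v) {
    if (v.length != dimension) throw new IllegalArgumentException("wrong dimension");
    buffer.put(v); // relative bulk put: appends at the buffer's current position
    count++;
  }

  // View-style accessor: copies a single vector out of the buffer only when asked.
  VectorValuesView getVectorValues() {
    return new VectorValuesView() {
      @Override public int dimension() { return dimension; }
      @Override public int size() { return count; }
      @Override public float[] vectorValue(int ord) {
        if (ord < 0 || ord >= count) throw new IndexOutOfBoundsException("ord " + ord);
        float[] out = new float[dimension];
        for (int i = 0; i < dimension; i++) out[i] = buffer.get(ord * dimension + i);
        return out;
      }
    };
  }
}

final class FlatWriterViewDemo {
  // A consumer (think: a scorer) that depends only on the view, not on the storage.
  static float dot(VectorValuesView values, int a, int b) {
    float[] x = values.vectorValue(a);
    float[] y = values.vectorValue(b);
    float sum = 0f;
    for (int i = 0; i < values.dimension(); i++) sum += x[i] * y[i];
    return sum;
  }

  public static void main(String[] args) {
    HeapFlatWriter heap = new HeapFlatWriter(2);
    BufferFlatWriter offHeap = new BufferFlatWriter(2, 4);
    for (float[] v : new float[][] {{1f, 2f}, {3f, 4f}}) {
      heap.addValue(v);
      offHeap.addValue(v);
    }
    System.out.println(dot(heap.getVectorValues(), 0, 1));
    System.out.println(dot(offHeap.getVectorValues(), 0, 1));
  }
}
```

In this sketch the same `dot` consumer runs against both writers. Supporting a `getVectors()`-style accessor on `BufferFlatWriter` would require copying all vectors into heap arrays first, which is the cost the description above argues against.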
