[ https://issues.apache.org/jira/browse/LUCENE-9855?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17313940#comment-17313940 ]

Michael Sokolov commented on LUCENE-9855:
-----------------------------------------

I think it will be helpful to consider how we would handle a different ANN implementation. Say LSH. In that case, we would no longer store this graph file (what is currently in .vex files). We would need to add files to store LSH's hash tables, and the metadata would change. Is it the same format? A variant of this format?

The current conception is that we would make variations of this single vector format to handle multiple ANN algorithms. We currently only have one, so it doesn't look that way, but anyway with that background a generic name like VectorsFormat (or NumericVectorsFormat to distinguish from DocVectors etc) makes sense.

On the other hand if you think we would create a new Format to represent this different kind of data that we are storing to disk, which will have its own de/serialization code (even if some of it would be the same), then we should pick a name that incorporates the algorithm, and by the way also get rid of the whole concept of {{SearchStrategy}}.

I think this is the fundamental question here: one format, multiple ANN strategies, or one format per ANN strategy? I thought it had been sorted out in our earlier discussions, but not everybody may have been following that closely.

> Reconsider codec name VectorFormat
> ----------------------------------
>
>                 Key: LUCENE-9855
>                 URL: https://issues.apache.org/jira/browse/LUCENE-9855
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: core/codecs
>    Affects Versions: main (9.0)
>            Reporter: Tomoko Uchida
>            Priority: Blocker
>
> There is some discussion about the codec name for ann search.
> https://lists.apache.org/thread.html/r3a6fa29810a1e85779de72562169e72d927d5a5dd2f9ea97705b8b2e%40%3Cdev.lucene.apache.org%3E
> Main points here are 1) use plural form for consistency, and 2) use more
> specific name for ann search (second point could be optional).
> A few alternatives were proposed:
> - VectorsFormat
> - VectorValuesFormat
> - NeighborsFormat
> - DenseVectorsFormat