mikemccand commented on issue #15427: URL: https://github.com/apache/lucene/issues/15427#issuecomment-3714645165
> if we want to retain today's encoding, ordinal ordering is congruent with docid ordering

We've had several ideas/PRs that want to decouple this -- multi-valued vector fields, the cool recursive BP ordering, flipping the stride during merging (so Codec sees all vector fields for one document and can dedup).

Actually, if a vector field is sparse (some docs don't have a vector), we already make different ords for vectors than docids, right? Oh, maybe by "congruent" you mean strictly monotonically increasing (vector ord is larger if docid is larger)?

> I'm thinking about how we would assign ordinals in the "new" graph.

I like these ideas! But I think the improvement being explored here and in https://github.com/apache/lucene/issues/15504 would not alter the vector ords, but rather alter the connections between them? It would try to undo the "you cannot see the future" problem we have now when writing an initial HNSW graph, because each new vector only sees the vectors already indexed before, not the ones coming later to this same segment.

> Another idea could be to merge using the existing ordinals and then to rewrite the ordinals when writing the on heap graph to disk.

Is this how the recursive graph BP merge policy (from https://github.com/apache/lucene/issues/13565) works?
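
To make the sparse-field point concrete, here is a minimal sketch of what "different ords than docids, but still strictly monotonic" could look like. This is only an illustration under stated assumptions: `OrdToDocSketch`, `buildOrdToDoc`, and `isStrictlyIncreasing` are hypothetical names, not Lucene APIs, and the real codec may store the mapping quite differently.

```java
import java.util.Arrays;

// Hypothetical illustration only; none of these names are Lucene APIs.
public class OrdToDocSketch {

    /**
     * Assigns dense ordinals 0..n-1 to the documents that have a vector,
     * visiting documents in increasing docid order. The result maps ord -> docid.
     */
    public static int[] buildOrdToDoc(boolean[] hasVector) {
        int count = 0;
        for (boolean present : hasVector) {
            if (present) {
                count++;
            }
        }
        int[] ordToDoc = new int[count];
        int ord = 0;
        for (int doc = 0; doc < hasVector.length; doc++) {
            if (hasVector[doc]) {
                ordToDoc[ord++] = doc;
            }
        }
        return ordToDoc;
    }

    /** Returns true if a larger ord always maps to a larger docid. */
    public static boolean isStrictlyIncreasing(int[] ordToDoc) {
        for (int i = 1; i < ordToDoc.length; i++) {
            if (ordToDoc[i] <= ordToDoc[i - 1]) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        // Docs 1, 2 and 5 have no vector, so the field is sparse.
        boolean[] hasVector = {true, false, false, true, true, false, true};
        int[] ordToDoc = buildOrdToDoc(hasVector);
        System.out.println(Arrays.toString(ordToDoc));      // [0, 3, 4, 6]
        System.out.println(isStrictlyIncreasing(ordToDoc)); // true
    }
}
```

Here ord 1 maps to docid 3, so ords and docids differ, yet the mapping stays strictly increasing. That is the weaker reading of "congruent" asked about above, and it is the property that a reordering scheme (e.g. BP ordering) would give up.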
