jimczi commented on PR #16214:
URL: https://github.com/apache/lucene/pull/16214#issuecomment-4678048199
> the same deferral updateDocument / updateDocValues apply to graph / secondary-structure work

I don’t think this is quite how it works. `updateDocValues` is only possible when the field is indexed purely as doc values. That’s really the trade-off here: we probably don’t want different update semantics depending on the underlying data structure.

> re-embed the whole vector field after a model fine-tune (the motivating case).

I also think this use case is a bit misleading. If the model changes, you generally need to update all vectors anyway, otherwise you end up mixing embeddings from different models and queries become inconsistent/wrong. In that case, reindexing into a new index in the background and switching over atomically once it’s done feels like the cleaner approach.

As it stands, I don’t think this change is in an acceptable state yet. The semantics feel too tied to specific implementations/data structures, and the motivating use case doesn’t really justify the complexity or trade-offs being introduced.
