jimczi commented on PR #16214:
URL: https://github.com/apache/lucene/pull/16214#issuecomment-4678048199

   > the same deferral updateDocument / updateDocValues apply to graph / secondary-structure work
   
   I don’t think this is quite how it works. `updateDocValues` is only possible 
when the field is indexed purely as doc values. That’s really the trade-off 
here: we probably don’t want different update semantics depending on the 
underlying data structure.
   
   > re-embed the whole vector field after a model fine-tune (the motivating case).
   
   I also think this use case is a bit misleading. If the model changes, you 
generally need to update all vectors anyway, otherwise you end up mixing 
embeddings from different models and queries become inconsistent/wrong. In that 
case, reindexing into a new index in the background and switching over 
atomically once it’s done feels like the cleaner approach.
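
To make the reindex-and-switch idea concrete, here is a minimal sketch in plain Java. It does not use any Lucene API: `VectorIndex`, `live`, and `build` are hypothetical names invented for illustration, and the two "models" are toy functions. The only point it shows is that the new index is built completely off to the side and readers move over through a single reference swap, so no query ever sees vectors from two models.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.concurrent.atomic.AtomicReference;
import java.util.function.Function;

public class ReindexAndSwitch {

  /** Hypothetical stand-in for an index: every vector comes from one model version. */
  static final class VectorIndex {
    final String modelVersion;
    final Map<String, float[]> vectors;

    VectorIndex(String modelVersion, Map<String, float[]> vectors) {
      this.modelVersion = modelVersion;
      this.vectors = vectors;
    }
  }

  /** Hypothetical alias: a searcher reads it once per query, so it never sees a mix. */
  static final AtomicReference<VectorIndex> live = new AtomicReference<>();

  /** Builds a complete new index off to the side; the live index is not touched. */
  static VectorIndex build(String modelVersion, Map<String, String> docs,
                           Function<String, float[]> embed) {
    Map<String, float[]> vectors = new HashMap<>();
    for (Map.Entry<String, String> e : docs.entrySet()) {
      vectors.put(e.getKey(), embed.apply(e.getValue()));
    }
    return new VectorIndex(modelVersion, vectors);
  }

  public static void main(String[] args) {
    Map<String, String> docs = Map.of("1", "lucene", "2", "vectors");

    // Toy embedders standing in for the old and the fine-tuned model (assumptions).
    Function<String, float[]> modelV1 = text -> new float[] {text.length()};
    Function<String, float[]> modelV2 = text -> new float[] {text.length() * 2f};

    live.set(build("v1", docs, modelV1));

    // Background reindex: queries keep hitting the v1 index while this runs.
    VectorIndex next = build("v2", docs, modelV2);

    // Single atomic switch: from here on every query resolves to v2 only.
    live.set(next);
    System.out.println(live.get().modelVersion);
  }
}
```

In a real deployment the alias would be whatever indirection the application already has between queries and the physical index; the sketch only illustrates why the switch can be atomic while in-place updates cannot.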
   
   As it stands, I don’t think this change is in an acceptable state yet. The 
semantics feel too tied to specific implementations/data structures, and the 
motivating use case doesn’t really justify the complexity or trade-offs being 
introduced.


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

