msokolov commented on issue #14758:
URL: https://github.com/apache/lucene/issues/14758#issuecomment-3319284619

   I agree that use case (3) above is the target (the user knowingly passes 
duplicate vectors).
   
   Then it bothers me that we don't make use of the information that one of the 
fields is strictly a subset of the other (since it is filtered). We require 
callers to supply the same vectors twice while indexing and then, ignoring 
that information, try to recreate it through the complexity of sorting vector 
data. That in turn requires codec changes to store vector data globally, 
sharing it in ways we didn't really anticipate, i.e. unrelated vector fields 
will now be stored together.
   
   I guess I'm still stuck on the "view" idea. What would be the problem with 
having one field reference another? I guess when creating a reader for the 
"view" field, it would have to delegate to a flat vectors reader from the other 
field. It would also require an ordinal mapping (so that its graph can have 
dense ordinals while still referring to ordinals from the other field for value 
lookups), or else support for graphs with non-dense ordinals. Aside from all 
that, does Lucene somehow have a guarantee that there are no inter-field 
dependencies? I can't think of any way it does. Its indexing is entirely 
document-centric, and it enforces that field definitions are immutable (so if 
field B depends on A and A's definition changes, that could conceivably be a 
problem, but it doesn't arise).
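
   To make the ordinal-mapping part of the "view" idea concrete, here is a 
minimal sketch in plain Java. It is only an illustration under assumptions: 
`VectorSource`, `ArrayVectorSource` and `ViewVectorSource` are hypothetical 
names made up for this example, not Lucene classes, and nothing here reflects 
how a real flat vectors reader or codec is structured. The view stores no 
vector data of its own; it keeps dense ordinals for its own graph and 
translates each one to the base field's ordinal at lookup time.

```java
/** Hypothetical stand-in for a flat vector store; NOT a Lucene interface. */
interface VectorSource {
  /** Number of vectors, addressed by dense ordinals 0..size()-1. */
  int size();

  /** Returns the vector stored at the given ordinal. */
  float[] vectorValue(int ord);
}

/** Hypothetical base field: owns the actual vector data. */
final class ArrayVectorSource implements VectorSource {
  private final float[][] vectors;

  ArrayVectorSource(float[][] vectors) {
    this.vectors = vectors;
  }

  @Override
  public int size() {
    return vectors.length;
  }

  @Override
  public float[] vectorValue(int ord) {
    return vectors[ord];
  }
}

/**
 * Hypothetical "view" field: holds no vectors, only a map from its own dense
 * ordinals to ordinals of the base field it is a filtered subset of.
 */
final class ViewVectorSource implements VectorSource {
  private final VectorSource base;
  private final int[] viewOrdToBaseOrd;

  ViewVectorSource(VectorSource base, int[] viewOrdToBaseOrd) {
    // Every mapped ordinal must exist in the base field.
    for (int baseOrd : viewOrdToBaseOrd) {
      if (baseOrd < 0 || baseOrd >= base.size()) {
        throw new IllegalArgumentException("base ordinal out of range: " + baseOrd);
      }
    }
    this.base = base;
    this.viewOrdToBaseOrd = viewOrdToBaseOrd.clone();
  }

  @Override
  public int size() {
    return viewOrdToBaseOrd.length;
  }

  @Override
  public float[] vectorValue(int viewOrd) {
    // Delegate the value lookup to the base field through the ordinal map.
    return base.vectorValue(viewOrdToBaseOrd[viewOrd]);
  }
}
```

   With a base of four vectors and a map of `{1, 3}`, the view exposes two 
vectors at ordinals 0 and 1, and both lookups return the base field's own 
arrays rather than copies, so the vectors are stored only once.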


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

