alessandrobenedetti commented on PR #12314:
URL: https://github.com/apache/lucene/pull/12314#issuecomment-1602090466

   > Thinking more on this implementation. It seems like we will need at a minimum a new `NeighborQueue`.
   > 
   > I am not sure the existing one needs to be updated, but we instead should have a `MultiValueNeighborQueue`.
   > 
   > The reason for this is that not only does this queue contain information about result sets, it keeps track of how many nodes are visited, and the TopHits returned utilize that number. Consequently, the visited count should keep track of documents visited, not vectors visited. All these changes indicate a new queueing mechanism for multi-valued vector fields.
   > 
   > Another thought is that Lucene already has the concept of index `join` values, effectively creating child document IDs under a single parent. This allows for even greater flexibility by indexing the passage the vector represents, and potentially even greater scoring flexibility.
   > 
   > The issue I could see happening here is ensuring the topdocs searching has the ability to deduplicate (if desired) based on parent document ID.
   > 
   > Did you consider utilizing this when digging into this implementation?
   
   I think it's a good idea to create a new, dedicated multi-valued
`NeighborQueue`. I'll do it when I have time, but feel free to do it if you like!
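
   To make the requirement from the quoted comment concrete (count documents visited, not vectors visited, and keep one entry per document), here is a minimal sketch. The class name, method names, and the choice to keep the maximum score per document are all assumptions for illustration; this is not Lucene's actual `NeighborQueue` API, and the linear eviction scan is only for brevity.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical sketch only; not Lucene's NeighborQueue. */
final class MultiValueNeighborQueueSketch {
  private final int maxSize;
  // Best score seen so far for each document currently kept in the result set.
  private final Map<Integer, Float> bestScoreByDoc = new HashMap<>();
  // Every document that has had at least one of its vectors offered.
  private final Set<Integer> visitedDocs = new HashSet<>();

  MultiValueNeighborQueueSketch(int maxSize) {
    if (maxSize <= 0) {
      throw new IllegalArgumentException("maxSize must be positive");
    }
    this.maxSize = maxSize;
  }

  /** Offers the score of one vector belonging to the given document. */
  void add(int docId, float score) {
    visitedDocs.add(docId);
    // Assumption: a document is represented by its best-scoring vector.
    bestScoreByDoc.merge(docId, score, (a, b) -> Math.max(a, b));
    if (bestScoreByDoc.size() > maxSize) {
      evictLowestScoringDoc();
    }
  }

  /** Number of distinct documents visited, regardless of vectors per document. */
  int visitedCount() {
    return visitedDocs.size();
  }

  /** Document IDs currently kept, best score first. */
  List<Integer> docsByDescendingScore() {
    List<Map.Entry<Integer, Float>> entries = new ArrayList<>(bestScoreByDoc.entrySet());
    entries.sort((a, b) -> Float.compare(b.getValue(), a.getValue()));
    List<Integer> docs = new ArrayList<>();
    for (Map.Entry<Integer, Float> e : entries) {
      docs.add(e.getKey());
    }
    return docs;
  }

  // Linear scan for the lowest score; a real queue would use a heap instead.
  private void evictLowestScoringDoc() {
    Integer worstDoc = null;
    float worstScore = Float.POSITIVE_INFINITY;
    for (Map.Entry<Integer, Float> e : bestScoreByDoc.entrySet()) {
      if (e.getValue() < worstScore) {
        worstScore = e.getValue();
        worstDoc = e.getKey();
      }
    }
    if (worstDoc != null) {
      bestScoreByDoc.remove(worstDoc);
    }
  }
}
```

   With this shape, offering two vectors of the same document raises `visitedCount()` by one rather than two, which is the behaviour the quoted comment asks for.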
   
   Regarding index-time join, I am not sure it's relevant here (are you
talking about block join?).
   Isn't it a different concept from multi-valued fields?
   That is, Lucene already offers that mechanism alongside multi-valued support
for pretty much all the field types, doesn't it?
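
   For reference, the deduplication by parent document ID raised in the quoted comment could look roughly like the sketch below if block join were used. It assumes the usual block-join layout, where child documents are indexed immediately before their parent, so the parent of a child is the next set bit in a parent bit set. It uses `java.util.BitSet` as a stand-in for Lucene's own bit set classes, and the class and method names are hypothetical.

```java
import java.util.BitSet;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch only; not part of Lucene. */
final class ParentDedupSketch {
  /**
   * Collapses child hits to one entry per parent, keeping the best child score.
   * Assumes children precede their parent within a block, so the parent of a
   * child is the first set bit at or after the child's doc ID.
   */
  static Map<Integer, Float> bestScorePerParent(
      int[] childDocs, float[] scores, BitSet parentBits) {
    Map<Integer, Float> best = new LinkedHashMap<>();
    for (int i = 0; i < childDocs.length; i++) {
      int parent = parentBits.nextSetBit(childDocs[i]);
      if (parent < 0) {
        continue; // No parent after this child; skip it in this sketch.
      }
      best.merge(parent, scores[i], (a, b) -> Math.max(a, b));
    }
    return best;
  }
}
```

   Taking the maximum is only one possible way to combine child scores; the point is that several child hits under the same parent end up as a single result.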
   


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org

