[ https://issues.apache.org/jira/browse/LUCENE-10204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17511359#comment-17511359 ]
Adrien Grand commented on LUCENE-10204:
---------------------------------------

I have a bias against features that expose the state of sub queries because doing so prevents queries from doing interesting things. For example, BS1 likes to consume doc IDs in batches, and maybe an approach like that could be beneficial to ToParentBlockJoinQuery.

Another example of a problem is that ToParentBlockJoinQuery doesn't need to visit all sub matches when scores are not needed: it only needs one offer to match to know that the item matches, while faceting would generally need to know about all sub matches.

I know it sounds wasteful, but in my opinion the way to go is to evaluate the query on children a second time for the purpose of faceting, instead of trying to reuse the work that is done while evaluating the query.

> Support iteration of sub-matches in join queries (ToParentBlockJoinQuery /
> ToChildBlockJoinQuery)
> -------------------------------------------------------------------------
>
>                 Key: LUCENE-10204
>                 URL: https://issues.apache.org/jira/browse/LUCENE-10204
>             Project: Lucene - Core
>          Issue Type: Improvement
>          Components: modules/join
>            Reporter: Greg Miller
>            Priority: Minor
>
> It would be nice to be able to iterate over the "sub-matches" in these join
> queries for the purpose of faceting (or possibly other use-cases?).
> For example, we have a use-case where our query matches on "child" docs,
> using a {{ToParentBlockJoinQuery}} to "emit" the associated parents, which
> are ultimately added to our match set. But, we want to iterate over the
> matching "children" for the purpose of faceting.
> To make it concrete, consider searching over a product catalog where "offers"
> and "items" are indexed side-by-side, with the offers being represented as
> "children" of the parent items. An offer contains information like
> "condition" (new vs. used), selling price, etc. for the parent item. If we
> want to facet on "condition", we want to observe all children that matched
> the query to know if the parent item had a "new" or "used" offer (or both).
> This requires iterating over the child matches when faceting, which we cannot
> do today since the child hit information isn't retained anywhere.
> We can support this by "caching" the child hits in a bitset but there is some
> complexity when multiple join queries appear in a query structure (would need
> to logically combine various "cached" bitsets using the same boolean
> operations as in the original query structure).

--
This message was sent by Atlassian Jira
(v8.20.1#820001)
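
The two-pass idea from the comment above can be illustrated with a small sketch. This is not Lucene code and uses no Lucene API: the `Doc` class, the block layout (offers first, then the item that closes the block), the price filter standing in for the child query, and all method names are assumptions made up for illustration. It shows that matching parents needs only one matching offer per block, while faceting on "condition" re-evaluates the child filter over every offer.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeMap;

// Hypothetical, Lucene-free model of a block index: each block lists its
// child docs (offers) first and ends with the parent doc (item).
public class BlockJoinFacetSketch {

    public static final class Doc {
        final boolean isParent;
        final String condition; // "new" or "used"; null for parents
        final double price;     // unused for parents

        Doc(boolean isParent, String condition, double price) {
            this.isParent = isParent;
            this.condition = condition;
            this.price = price;
        }

        static Doc offer(String condition, double price) {
            return new Doc(false, condition, price);
        }

        static Doc item() {
            return new Doc(true, null, 0.0);
        }
    }

    // Pass 1: find matching parents without scores. Once one offer in a block
    // matches, the remaining offers of that block are not evaluated.
    public static List<Integer> matchParents(List<Doc> docs, double maxPrice) {
        List<Integer> parents = new ArrayList<>();
        boolean blockMatched = false;
        for (int i = 0; i < docs.size(); i++) {
            Doc d = docs.get(i);
            if (d.isParent) {
                if (blockMatched) {
                    parents.add(i);
                }
                blockMatched = false;
            } else if (!blockMatched && d.price <= maxPrice) {
                blockMatched = true; // one matching offer is enough
            }
        }
        return parents;
    }

    // Pass 2: evaluate the child filter again, this time over every offer,
    // and count how many items have at least one matching offer per condition.
    public static Map<String, Integer> facetConditions(List<Doc> docs, double maxPrice) {
        Map<String, Integer> counts = new TreeMap<>();
        Set<String> seenInBlock = new HashSet<>();
        for (Doc d : docs) {
            if (d.isParent) {
                for (String condition : seenInBlock) {
                    counts.merge(condition, 1, Integer::sum);
                }
                seenInBlock.clear();
            } else if (d.price <= maxPrice) {
                seenInBlock.add(d.condition);
            }
        }
        return counts;
    }

    public static List<Doc> sampleDocs() {
        return Arrays.asList(
            Doc.offer("new", 10.0), Doc.offer("used", 5.0), Doc.item(),  // item at index 2
            Doc.offer("used", 50.0), Doc.item(),                         // item at index 4
            Doc.offer("used", 8.0), Doc.offer("used", 7.0), Doc.item()); // item at index 7
    }

    public static void main(String[] args) {
        List<Doc> docs = sampleDocs();
        System.out.println(matchParents(docs, 20.0));    // [2, 7]
        System.out.println(facetConditions(docs, 20.0)); // {new=1, used=2}
    }
}
```

In this toy model the second pass repeats the filter work of the first, which is the cost the comment calls wasteful, but the first pass stays free to skip offers or batch its work because nothing outside it depends on its internal state.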