[
https://issues.apache.org/jira/browse/LUCENE-10204?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17528507#comment-17528507
]
Greg Miller commented on LUCENE-10204:
--------------------------------------
Yeah +1 to not pursuing further right now. Because various query evaluation
optimizations (current and possibly future) mean that not all children will
necessarily be visited in a complex disjunction clause when determining
matching parents, I think it's fundamentally flawed to try to track all child
hits while evaluating the query. For example, with Block-Max WAND (BMW), a
sub-clause may never be advanced to a given parent match if the parent is
already determined to be a match based on a minimum number of other clauses
confirming it.
From what I can tell, the only accurate way to find all child matches is to
issue a separate query that identifies them, and doesn't "join" to the parents.
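To illustrate the point, here is a toy sketch in plain Java. It does not use
Lucene at all: each clause is modelled as a java.util.BitSet of matching child
doc IDs, and the class and method names (ShortCircuitDemo, trackedWhileJoining,
separateChildQuery) are made up for this example. It assumes the usual block
layout where children are indexed immediately before their parent, and it
assumes an evaluator that accepts the parent as soon as one clause confirms it.

```java
import java.util.BitSet;
import java.util.List;

/** Toy model only; this is not how Lucene's scorers are implemented. */
final class ShortCircuitDemo {

    /**
     * Child hits observed while deciding whether the parent matches, when
     * evaluation stops as soon as one clause confirms the parent.
     */
    static BitSet trackedWhileJoining(List<BitSet> clauses, int blockStart, int parentDoc) {
        BitSet seen = new BitSet();
        for (BitSet clause : clauses) {
            int child = clause.nextSetBit(blockStart);
            if (child >= 0 && child < parentDoc) {
                seen.set(child);
                return seen; // parent confirmed; the other clauses are never advanced
            }
        }
        return seen;
    }

    /** Separate child-level query: OR of all clauses within the block, no join. */
    static BitSet separateChildQuery(List<BitSet> clauses, int blockStart, int parentDoc) {
        BitSet all = new BitSet();
        for (BitSet clause : clauses) {
            all.or(clause);
        }
        BitSet block = new BitSet();
        block.set(blockStart, parentDoc); // child doc IDs of this block (parent excluded)
        all.and(block);
        return all;
    }

    static BitSet bits(int... ids) {
        BitSet b = new BitSet();
        for (int id : ids) {
            b.set(id);
        }
        return b;
    }

    public static void main(String[] args) {
        // Children 0, 1, 2 belong to parent doc 3.
        List<BitSet> clauses = List.of(bits(0), bits(1, 2));
        System.out.println(trackedWhileJoining(clauses, 0, 3)); // prints {0}
        System.out.println(separateChildQuery(clauses, 0, 3));  // prints {0, 1, 2}
    }
}
```

In this toy case the tracked set misses children 1 and 2 even though they
match, while the separate child-level pass sees all three.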
> Support iteration of sub-matches in join queries (ToParentBlockJoinQuery /
> ToChildBlockJoinQuery)
> -------------------------------------------------------------------------------------------------
>
> Key: LUCENE-10204
> URL: https://issues.apache.org/jira/browse/LUCENE-10204
> Project: Lucene - Core
> Issue Type: Improvement
> Components: modules/join
> Reporter: Greg Miller
> Priority: Minor
>
> It would be nice to be able to iterate over the "sub-matches" in these join
> queries for the purpose of faceting (or possibly other use-cases?).
> For example, we have a use-case where our query matches on "child" docs,
> using a {{ToParentBlockJoinQuery}} to "emit" the associated parents, which
> are ultimately added to our match set. But, we want to iterate over the
> matching "children" for the purpose of faceting.
> To make it concrete, consider searching over a product catalog where "offers"
> and "items" are indexed side-by-side, with the offers being represented as
> "children" of the parent items. An offer contains information like
> "condition" (new vs. used), selling price, etc. for the parent item. If we
> want to facet on "condition", we want to observe all children that matched
> the query to know if the parent item had a "new" or "used" offer (or both).
> This requires iterating over the child matches when faceting, which we cannot
> do today since the child hit information isn't retained anywhere.
> We can support this by "caching" the child hits in a bitset but there is some
> complexity when multiple join queries appear in a query structure (would need
> to logically combine various "cached" bitsets using the same boolean
> operations as in the original query structure).
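A rough sketch of the complexity mentioned in the description above, again in
plain Java with java.util.BitSet rather than Lucene classes. The names here
(CachedChildHits, toParents, restrictToParents, childHitsForConjunction) are
hypothetical, and this is only one possible reading of "combine the cached
bitsets": for a conjunction of two join clauses, the boolean operation is
applied to the parents each clause reaches, and the cached child hits are then
restricted to the surviving parents. It assumes children are indexed
immediately before their parent.

```java
import java.util.BitSet;

/** Toy model only; the bitsets stand in for per-join-clause cached child hits. */
final class CachedChildHits {

    /** Maps each child hit to its parent: the first parent doc ID at or after the child. */
    static BitSet toParents(BitSet childHits, BitSet parentBits) {
        BitSet parents = new BitSet();
        for (int c = childHits.nextSetBit(0); c >= 0; c = childHits.nextSetBit(c + 1)) {
            int p = parentBits.nextSetBit(c);
            if (p >= 0) {
                parents.set(p);
            }
        }
        return parents;
    }

    /** Keeps only the child hits whose parent is in matchingParents. */
    static BitSet restrictToParents(BitSet childHits, BitSet matchingParents, BitSet parentBits) {
        BitSet kept = new BitSet();
        for (int c = childHits.nextSetBit(0); c >= 0; c = childHits.nextSetBit(c + 1)) {
            int p = parentBits.nextSetBit(c);
            if (p >= 0 && matchingParents.get(p)) {
                kept.set(c);
            }
        }
        return kept;
    }

    /** Child hits to facet on for the parent-level query join(A) AND join(B). */
    static BitSet childHitsForConjunction(BitSet a, BitSet b, BitSet parentBits) {
        BitSet parents = toParents(a, parentBits);
        parents.and(toParents(b, parentBits)); // same boolean op, applied to parents
        BitSet union = (BitSet) a.clone();
        union.or(b);
        return restrictToParents(union, parents, parentBits);
    }

    static BitSet bits(int... ids) {
        BitSet b = new BitSet();
        for (int id : ids) {
            b.set(id);
        }
        return b;
    }

    public static void main(String[] args) {
        BitSet parentBits = bits(3, 7); // children 0-2 under parent 3, 4-6 under parent 7
        BitSet a = bits(0, 4);          // cached child hits of join clause A
        BitSet b = bits(1);             // cached child hits of join clause B
        System.out.println(childHitsForConjunction(a, b, parentBits)); // prints {0, 1}

        BitSet naive = (BitSet) a.clone();
        naive.and(b);                   // ANDing the child bitsets directly loses every hit
        System.out.println(naive);      // prints {}
    }
}
```

In this example only parent 3 satisfies both clauses, so children 0 and 1 are
kept and child 4 is dropped, whereas intersecting the two child bitsets
directly yields nothing.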