bjacobowitz opened a new issue, #14427:
URL: https://github.com/apache/lucene/issues/14427

   ### Description
   
   `TermFilteredPresearcher` may fail to return stored queries if those queries 
contain the filter field in the query itself (not just in the metadata).
   
   When building the presearcher query 
([here](https://github.com/apache/lucene/blob/ddef6177122d564bf2319fa72b3327ee53ad7019/lucene/monitor/src/java/org/apache/lucene/monitor/TermFilteredPresearcher.java#L99-L152)), 
`TermFilteredPresearcher` removes filter field tokens from the main clause 
of the presearcher query ([via acceptor 
here](https://github.com/apache/lucene/blob/ddef6177122d564bf2319fa72b3327ee53ad7019/lucene/monitor/src/java/org/apache/lucene/monitor/TermFilteredPresearcher.java#L115-L130)), 
presumably under the assumption that they will be added back to the query in 
a separate clause for filter fields, ANDed with the first clause 
([here](https://github.com/apache/lucene/blob/ddef6177122d564bf2319fa72b3327ee53ad7019/lucene/monitor/src/java/org/apache/lucene/monitor/TermFilteredPresearcher.java#L138-L146)).
   
   We end up with a presearcher query that is something like
   ```
   +(some_other_field_in_query_index:value) #(+(my_filter_field:value))
   ```
   
   However, if an indexed query contains that filter field, and if that field 
was the only indexed field for the query's associated document in the query 
index, the first of the AND'd clauses cannot match (because the filter field 
was omitted), so the overall AND'd presearcher query cannot match, and the 
presearcher fails to return the query.
   
   A user can work around this by using an additional dedicated field for the 
filter field (i.e. adding it on both the query metadata and the document), but 
this seems like an _easy_ trap to fall into.
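
To make the failure and the workaround concrete, here is a toy model in plain Java. It is not Lucene code: it only mimics the two ANDed clauses described above, and it ignores analysis, term weighting, and every other detail of the real presearcher. The class name `FilterFieldTrap`, the method `presearcherSelects`, and the field names `lang`, `lang_filter`, and `text` are all made up for illustration.

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

/** Toy model of the presearcher behaviour described in this issue (not Lucene code). */
class FilterFieldTrap {

  /** Builds a "field:term" token, e.g. "lang:en". */
  static String token(String field, String term) {
    return field + ":" + term;
  }

  /**
   * Returns true if this simplified presearcher would select the stored query.
   *
   * @param queryTerms   tokens extracted from the stored query itself
   * @param metadata     field -> value metadata registered with the stored query
   * @param document     field -> value content of the incoming document
   * @param filterFields fields configured as filter fields
   */
  static boolean presearcherSelects(
      Set<String> queryTerms,
      Map<String, String> metadata,
      Map<String, String> document,
      Set<String> filterFields) {
    // Query-index entry: terms from the query plus metadata values.
    Set<String> indexed = new HashSet<>(queryTerms);
    metadata.forEach((field, value) -> indexed.add(token(field, value)));

    boolean mainClause = false;   // OR over document tokens, filter fields dropped
    boolean filterClause = true;  // every filter field on the document must match
    for (Map.Entry<String, String> e : document.entrySet()) {
      String t = token(e.getKey(), e.getValue());
      if (filterFields.contains(e.getKey())) {
        filterClause &= indexed.contains(t);
      } else {
        mainClause |= indexed.contains(t);
      }
    }
    // The two clauses are ANDed, as in the presearcher query shown above.
    return mainClause && filterClause;
  }

  public static void main(String[] args) {
    // Trap: the stored query "lang:en" uses the filter field itself.
    boolean trap = presearcherSelects(
        Set.of("lang:en"),
        Map.of("lang", "en"),
        Map.of("lang", "en", "text", "hello"),
        Set.of("lang"));
    System.out.println("filter field inside query: " + trap); // false

    // Workaround: a dedicated filter field on the metadata and the document.
    boolean workaround = presearcherSelects(
        Set.of("lang:en"),
        Map.of("lang_filter", "en"),
        Map.of("lang", "en", "lang_filter", "en", "text", "hello"),
        Set.of("lang_filter"));
    System.out.println("dedicated filter field: " + workaround); // true
  }
}
```

In the first call the only token the stored query contributes is on the filter field, so the main clause has nothing to match and the query is skipped even though `lang:en` would match the document. In the second call `lang` is an ordinary field again, so the main clause matches and the dedicated `lang_filter` field does the filtering.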
   
   **My question here: is this intentional?** Is the idea of a "filter field" 
that it appears in documents and in MonitorQuery metadata but must not appear 
in a query itself? I'm aware of another issue about the intended behavior of 
Monitor filter fields (https://github.com/apache/lucene/issues/11607), so I'm 
unsure.
   
   If intentional, I think we should document that more directly. If 
unintentional, we might consider removing the check on filter fields 
([here](https://github.com/apache/lucene/blob/ddef6177122d564bf2319fa72b3327ee53ad7019/lucene/monitor/src/java/org/apache/lucene/monitor/TermFilteredPresearcher.java#L121))
 when building the first part of the presearcher query.
   
   I've set up a test project 
[here](https://github.com/bjacobowitz/sturdy-chainsaw/tree/main) to demonstrate 
the problem (specifically in [this test 
file](https://github.com/bjacobowitz/sturdy-chainsaw/blob/8e24641e2af3e9d5f19de31e78480048f19f5dd4/src/test/java/MonitorTest.java),
 where 
[`testWithEmptyMetadata`](https://github.com/bjacobowitz/sturdy-chainsaw/blob/8e24641e2af3e9d5f19de31e78480048f19f5dd4/src/test/java/MonitorTest.java#L22C10-L22C31)
 works but 
[`testWithMetadata`](https://github.com/bjacobowitz/sturdy-chainsaw/blob/8e24641e2af3e9d5f19de31e78480048f19f5dd4/src/test/java/MonitorTest.java#L52)
 fails).
   
   ### Version and environment details
   
   Tested with Lucene 10.1.0 on macOS Sequoia 15.3.2 (but I think the problem 
has been around since at least Lucene version 8, if not earlier).


-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: issues-unsubscr...@lucene.apache.org.apache.org

For queries about this service, please contact Infrastructure at:
us...@infra.apache.org


