aliciavargas opened a new issue, #14168:
URL: https://github.com/apache/lucene/issues/14168

   ### Description
   
   The Lucene docs specify that wildcard search is supported only for single terms, not phrases ([link](https://lucene.apache.org/core/8_10_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Wildcard_Searches)). We’d like to support wildcard search in phrase queries in our Lucene-based service, and in our research we came across the [PhraseWildcardQuery](https://lucene.apache.org/core/8_10_0/sandbox/org/apache/lucene/search/PhraseWildcardQuery.html), which we see is marked as experimental.
   
   The PhraseWildcardQuery can support our needs, and we tested it out in a proof of concept, but it is not actually used by any of the existing Lucene query parsers. We see two options for implementing query parsing that supports phrase wildcards: (1) the StandardQueryParser could be modified to optionally allow phrases with wildcards and use the PhraseWildcardQuery for this case, or (2) there could be a new query parser that specifically supports phrases with wildcards.
   
   So we have two questions:
   
   1. **What would it take to make PhraseWildcardQuery fully supported (no 
longer marked experimental)?**
   2. **Can Lucene provide a query parser that utilizes the 
PhraseWildcardQuery?**
   
   
   For reference, this is how we extended the StandardQueryParser to support wildcards in phrases in our proof of concept:
   ```
   luceneParser =
       new org.apache.lucene.queryparser.flexible.standard.StandardQueryParser(analyzer);
   luceneParser.setDefaultOperator(operator);

   // Register the builder that turns our new node type into a PhraseWildcardQuery.
   StandardQueryTreeBuilder builder = new StandardQueryTreeBuilder();
   builder.setBuilder(PhraseWildcardQueryNode.class, new PhraseWildcardQueryNodeBuilder());
   luceneParser.setQueryBuilder(builder);

   // Append our processor to the end of the standard processing pipeline.
   StandardQueryNodeProcessorPipeline processor =
       new StandardQueryNodeProcessorPipeline(luceneParser.getQueryConfigHandler());
   processor.add(new PhraseWildcardQueryNodeProcessor(luceneParser.getQueryConfigHandler()));
   luceneParser.setQueryNodeProcessor(processor);
   ```
   Using the following new components:
   
   - **PhraseWildcardQueryNode** - represents a query node for a phrase query 
with wildcards
   - **PhraseWildcardQueryNodeProcessor** - a query node processor that can be 
added to the end of the StandardQueryNodeProcessorPipeline. If it receives a 
phrase query node, it will go through its child nodes and build a new list of 
child nodes that are processed by the WildcardQueryNodeProcessor. If any of 
those new children have wildcards, it will replace the original phrase query 
node with a PhraseWildcardQueryNode with the new list of children.
     - This could be turned on or off by a config parameter
   - **PhraseWildcardQueryNodeBuilder** - a query builder that converts a 
PhraseWildcardQueryNode into a PhraseWildcardQuery using the 
[PhraseWildcardQuery.Builder](https://github.com/apache/lucene/blob/f2e7ae40af0b28b1d5f2edc31f8858229a8523f4/lucene/sandbox/src/java/org/apache/lucene/sandbox/search/PhraseWildcardQuery.java#L576)
 class. It iterates over the children of the PhraseWildcardQueryNode and for 
each child:
     - If it's a WildcardQueryNode, then we add it as a MultiTermQuery to the 
PhraseWildcardQuery.Builder
     - Else we add the single term to the PhraseWildcardQuery.Builder
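
To make the per-child decision described above concrete, here is a minimal, Lucene-free sketch of the logic. Everything in it is hypothetical and for illustration only: `PhraseWildcardSketch`, `Part`, `classify`, and `needsPhraseWildcardNode` are not Lucene classes or methods, whitespace splitting stands in for the analyzer, and we assume `*` and `?` are the wildcard characters with `\` as the escape. In the real components, a `Part` flagged as multi-term would be added to the PhraseWildcardQuery.Builder as a MultiTermQuery and every other `Part` as a single term.

```java
import java.util.ArrayList;
import java.util.List;

class PhraseWildcardSketch {

    /** Hypothetical stand-in for one child of the phrase node. */
    static final class Part {
        final String text;
        final boolean multiTerm; // true: would be added as a MultiTermQuery

        Part(String text, boolean multiTerm) {
            this.text = text;
            this.multiTerm = multiTerm;
        }
    }

    /** Returns true if the token contains an unescaped '*' or '?' (assumed syntax). */
    static boolean hasWildcard(String token) {
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (c == '\\') {
                i++; // skip the escaped character
                continue;
            }
            if (c == '*' || c == '?') {
                return true;
            }
        }
        return false;
    }

    /** Splits the phrase on whitespace and flags each token that carries a wildcard. */
    static List<Part> classify(String phrase) {
        List<Part> parts = new ArrayList<>();
        for (String token : phrase.trim().split("\\s+")) {
            if (!token.isEmpty()) {
                parts.add(new Part(token, hasWildcard(token)));
            }
        }
        return parts;
    }

    /** Mirrors the processor's rule: rewrite the phrase only if some child has a wildcard. */
    static boolean needsPhraseWildcardNode(List<Part> parts) {
        for (Part p : parts) {
            if (p.multiTerm) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        for (Part p : classify("quick bro* fox")) {
            System.out.println(p.text + " -> " + (p.multiTerm ? "multi-term" : "single term"));
        }
    }
}
```

A phrase without any wildcard leaves `needsPhraseWildcardNode` false, which corresponds to keeping the original phrase query node untouched.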
   
   

