aliciavargas opened a new issue, #14168: URL: https://github.com/apache/lucene/issues/14168
### Description

The Lucene docs specify that wildcard search is supported only for single terms, not for phrases ([link](https://lucene.apache.org/core/8_10_0/queryparser/org/apache/lucene/queryparser/classic/package-summary.html#Wildcard_Searches)). We’d like to support wildcard search in phrase queries in our Lucene-based service, and in our research we came across the [PhraseWildcardQuery](https://lucene.apache.org/core/8_10_0/sandbox/org/apache/lucene/search/PhraseWildcardQuery.html), which is marked as experimental.

The PhraseWildcardQuery meets our needs, and we tested it in a proof of concept, but none of the existing Lucene query parsers use it. We see two options for implementing query parsing that supports phrase wildcards: (1) the StandardQueryParser could be modified to optionally allow phrases with wildcards and use the PhraseWildcardQuery to support this use case, or (2) there could be a new query parser that specifically supports phrases with wildcards.

So we have two questions:

1. **What would it take to make PhraseWildcardQuery fully supported (no longer marked experimental)?**
2. **Can Lucene provide a query parser that utilizes the PhraseWildcardQuery?**

For reference, this is how we extended the StandardQueryParser to support wildcards in phrases in our proof of concept:

```
// analyzer and operator are defined elsewhere in our service
luceneParser = new org.apache.lucene.queryparser.flexible.standard.StandardQueryParser(analyzer);
luceneParser.setDefaultOperator(operator);

// Register the builder that turns our new node type into a PhraseWildcardQuery
StandardQueryTreeBuilder builder = new StandardQueryTreeBuilder();
builder.setBuilder(PhraseWildcardQueryNode.class, new PhraseWildcardQueryNodeBuilder());
luceneParser.setQueryBuilder(builder);

// Append our processor to the end of the standard processor pipeline
StandardQueryNodeProcessorPipeline processor = new StandardQueryNodeProcessorPipeline(luceneParser.getQueryConfigHandler());
processor.add(new PhraseWildcardQueryNodeProcessor(luceneParser.getQueryConfigHandler()));
luceneParser.setQueryNodeProcessor(processor);
```

Using the following new components:

- **PhraseWildcardQueryNode** - represents a query node for a phrase query with wildcards
- **PhraseWildcardQueryNodeProcessor** - a query node processor that can be added to the end of the StandardQueryNodeProcessorPipeline. If it receives a phrase query node, it goes through the child nodes and builds a new list of child nodes that have been processed by the WildcardQueryNodeProcessor. If any of those new children have wildcards, it replaces the original phrase query node with a PhraseWildcardQueryNode that holds the new list of children.
  - This could be turned on or off by a config parameter
- **PhraseWildcardQueryNodeBuilder** - a query builder that converts a PhraseWildcardQueryNode into a PhraseWildcardQuery using the [PhraseWildcardQuery.Builder](https://github.com/apache/lucene/blob/f2e7ae40af0b28b1d5f2edc31f8858229a8523f4/lucene/sandbox/src/java/org/apache/lucene/sandbox/search/PhraseWildcardQuery.java#L576) class.
  It iterates over the children of the PhraseWildcardQueryNode and for each child:
  - If it's a WildcardQueryNode, then we add it as a MultiTermQuery to the PhraseWildcardQuery.Builder
  - Else we add the single term to the PhraseWildcardQuery.Builder
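
The processor and builder steps described above can be modeled without any Lucene dependency. The sketch below is only an illustration of the control flow: `PhraseWildcardSketch`, `Part`, `hasWildcard`, `classify`, `needsPhraseWildcardNode`, and `describe` are hypothetical names, not Lucene classes or methods, and the rule that only unescaped `*` and `?` count as wildcards is an assumption made for the example.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

/**
 * Stand-alone model of the phrase-wildcard rewrite. None of these types are
 * Lucene classes; they only mirror the decisions the processor and builder make.
 */
final class PhraseWildcardSketch {

    /** One position in the phrase: either a plain term or a wildcard pattern. */
    static final class Part {
        final String text;
        final boolean wildcard;

        Part(String text, boolean wildcard) {
            this.text = text;
            this.wildcard = wildcard;
        }
    }

    /** Assumption: only unescaped '*' and '?' are treated as wildcards. */
    static boolean hasWildcard(String token) {
        for (int i = 0; i < token.length(); i++) {
            char c = token.charAt(i);
            if (c == '\\') {
                i++; // skip the character that the backslash escapes
            } else if (c == '*' || c == '?') {
                return true;
            }
        }
        return false;
    }

    /** Processor step: classify every child of the phrase, keeping phrase order. */
    static List<Part> classify(List<String> phraseTokens) {
        List<Part> parts = new ArrayList<>();
        for (String token : phraseTokens) {
            parts.add(new Part(token, hasWildcard(token)));
        }
        return parts;
    }

    /** Processor decision: swap in the wildcard phrase node only if some child has a wildcard. */
    static boolean needsPhraseWildcardNode(List<Part> parts) {
        for (Part part : parts) {
            if (part.wildcard) {
                return true;
            }
        }
        return false;
    }

    /** Builder step: one entry per child, multi-term for wildcards, single term otherwise. */
    static List<String> describe(List<Part> parts) {
        List<String> out = new ArrayList<>();
        for (Part part : parts) {
            out.add((part.wildcard ? "multiTerm(" : "term(") + part.text + ")");
        }
        return out;
    }

    public static void main(String[] args) {
        List<Part> parts = classify(Arrays.asList("quick", "bro*", "fox"));
        if (needsPhraseWildcardNode(parts)) {
            System.out.println(describe(parts));
        }
    }
}
```

In a real integration, each `multiTerm(...)` entry would correspond to adding a MultiTermQuery to the PhraseWildcardQuery.Builder and each `term(...)` entry to adding a single term, as described in the list above.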