Hi *all*! First time posting! I have been struggling with a PhraseQuery with a wildcard in Solr v4.10.2.
My field definition is below:

    <!-- Search field -->
    <field name="title" type="text_pt_en" indexed="true" stored="true" />

    <!-- Field definition -->
    <fieldType name="text_pt_en" class="solr.TextField" positionIncrementGap="100">
      <analyzer type="index">
        <charFilter class="solr.HTMLStripCharFilterFactory" />
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_pt.txt" format="snowball" enablePositionIncrements="true" />
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <!-- <tokenizer class="solr.KeywordTokenizerFactory" /> -->
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="solr.ASCIIFoldingFilterFactory" preserveOriginal="false" />
        <filter class="solr.ReversedWildcardFilterFactory" />
      </analyzer>
      <analyzer type="query">
        <charFilter class="solr.HTMLStripCharFilterFactory"/>
        <filter class="solr.StopFilterFactory" ignoreCase="true" words="lang/stopwords_pt.txt" format="snowball" enablePositionIncrements="true" />
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <!-- <tokenizer class="solr.KeywordTokenizerFactory" /> -->
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="solr.ASCIIFoldingFilterFactory" preserveOriginal="false" />
      </analyzer>
    </fieldType>

Let's suppose I have the following value added to the index of the field above (Portuguese):

    Teste de texto; Será quebrado em espaços em branco!

The values added to the index, based on the analyzer chain, will be (from the Solr "Analysis" screen):

    etset teste ;otxet texto; odarbeuq quebrado socapse espacos !ocnarb branco!

Today, I can search, for example:

    title:teste
    title:(teste texto)
    title:(teste de texto)
    title:("teste de texto;")   // (PhraseQuery) matches because of the ";" at the end of the string

But if I try the following phrase queries:

    title:("teste de texto")
    "parsedquery": "PhraseQuery(title:\"teste ? texto\")"

    title:("teste de texto*")
    "parsedquery": "PhraseQuery(title:\"teste ? texto*\")"

No results are returned.
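To make the behavior above easier to reason about, here is a minimal Python sketch (not Solr code) that imitates the query-side chain: split on whitespace, drop stop words, lowercase, and fold accents. The `STOPWORDS` set is a hypothetical three-word stand-in for `lang/stopwords_pt.txt`, inferred from the tokens missing in the "Analysis" output, and `analyze` and `phrase_matches` are illustrative names, not Solr APIs. The reversed tokens produced by `ReversedWildcardFilterFactory` are left out for simplicity.

```python
import unicodedata

# Hypothetical stand-in for lang/stopwords_pt.txt (the real list is much longer).
STOPWORDS = {"de", "em", "será"}


def analyze(text):
    """Return (position, term) pairs, roughly like the query analyzer above."""
    tokens = []
    position = 0
    for raw in text.split():  # whitespace tokenizer: punctuation stays attached
        position += 1
        if raw.lower() in STOPWORDS:
            continue  # removed stop word leaves a position gap ("?" in parsedquery)
        decomposed = unicodedata.normalize("NFKD", raw.lower())
        folded = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
        tokens.append((position, folded))
    return tokens


def phrase_matches(doc_tokens, query_tokens):
    """True if every query term occurs in the document at the same relative offset."""
    if not query_tokens:
        return False
    indexed = set(doc_tokens)
    first_pos, first_term = query_tokens[0]
    for start, term in doc_tokens:
        if term != first_term:
            continue
        if all((start + pos - first_pos, t) in indexed for pos, t in query_tokens):
            return True
    return False


doc = analyze("Teste de texto; Será quebrado em espaços em branco!")
print(doc)
# [(1, 'teste'), (3, 'texto;'), (5, 'quebrado'), (7, 'espacos'), (9, 'branco!')]

print(phrase_matches(doc, analyze("teste de texto;")))  # True
print(phrase_matches(doc, analyze("teste de texto")))   # False
print(phrase_matches(doc, analyze("teste de texto*")))  # False
```

In this sketch the indexed term is `texto;` with the semicolon attached, so the phrase term `texto` never equals it, and `texto*` is compared as a literal string rather than expanded as a wildcard, which is consistent with the `parsedquery` output shown above.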
I have read about possible solutions to this, but none of them seems to work:

- MultitermQueryAnalysis
- Complex Phrase Query Parser

And I just can't understand why the query with the wildcard "*" at the end does not work; no results are returned.

Some comments:

- I don't have control over what is entered in the search. I would like it to work like a "file listing", like a "glob";
- Today I can't change my tokenizer to "StandardTokenizerFactory" (which would work in this case), because I need to search for e-mail addresses and words containing a colon, for example;
- I tried the "KeywordTokenizer", but I get the same behavior as above;
- I read about "ShingleFilterFactory", but my index would be huge, because I need to index full texts (with more than 30000 chars);
- One person on stackoverflow pointed me to the documentation, which says it is not possible to use a wildcard in a phrase query with the standard query parser. I tried the complexphrase parser: {!complexphrase}title:"teste de texto*", but still got no results. Am I doing something wrong? Is there anything wrong with my schema analysis?
- I could make it work using "KeywordTokenizerFactory", but it only works with a "RegexpQuery": title:(/.*teste de texto.*/). Do I have other options?

Could you please help me understand what is happening, whether there is a way to make a PhraseQuery with a wildcard work, and what my options are?

Please let me know if you need further information, and thanks a lot for your attention and help!

*Felipe*.

PS: I have posted the same question on stackoverflow: http://stackoverflow.com/questions/38061980/solr-phrasequery-with-wildcard