Re: autoGeneratePhraseQueries not working

Alexandre Rafalovitch Tue, 16 Apr 2019 14:28:11 -0700

Ah, oops. I did not realize the original text was missing spaces. It looked
like so many other questions where the input did have spaces that I did not
recheck the search query.


Go with Erick's explanation for this specific case. And keep mine in
mind for input with spaces.
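
A toy sketch of what Erick describes, in Python rather than anything Solr
actually runs. The function names are made up for illustration, and the regex
only roughly approximates what WordDelimiterGraphFilterFactory does with
generateWordParts/generateNumberParts:

```python
import re


def split_on_whitespace(query):
    # Stage 1 (hypothetical stand-in for the parser's whitespace handling):
    # a query with no spaces comes through as one token.
    return query.split()


def split_on_letter_digit_changes(token):
    # Stage 2 (rough approximation, not the real filter): break the token
    # wherever it switches between letters and digits.
    return re.findall(r"[A-Za-z]+|[0-9]+", token)


tokens = split_on_whitespace("my25word")
parts = [split_on_letter_digit_changes(t) for t in tokens]

print(tokens)  # ['my25word']
print(parts)   # [['my', '25', 'word']]
```

The point is only the ordering: the split into my/25/word happens inside the
analysis chain, after the parser has already handed over a single token.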

Regards,
   Alex.

On Tue, 16 Apr 2019 at 17:48, Erick Erickson <erickerick...@gmail.com> wrote:
>
> The issue isn’t SoW. What’s happening here is that the query _parser_ passes 
> my25word through as a single token, then WordDelimiterGraphFilterFactory 
> splits it up on number/letter changes after SoW is out of the picture. The 
> admin/analysis page will show you how this works.
>
> By fiddling with the settings in WordDelimiterGraphFilterFactory, catenateAll 
> in particular, you can get close to auto phrase queries. But it’s never quite 
> the same thing as a phrase query.
>
> Best,
> Erick
>
> > On Apr 16, 2019, at 4:31 AM, Leonardo Francalanci 
> > <leoonar...@yahoo.it.INVALID> wrote:
> >
> > Thank you for the reply.
> > I'm using eDisMax; does it use the same parser as the Standard Query Parser, 
> > then?
> > I think this behavior should be documented somehow... it's very confusing 
> > and to be honest I don't even remember how I got to the sow parameter... 
> > and I'm not sure what that means for all other queries I have
> >
> >    On Tuesday, 16 April 2019, 13:09:26 CEST, Alexandre Rafalovitch 
> > <arafa...@gmail.com> wrote:
> >
> > The issue is that the Standard Query Parser does pre-processing of the
> > query and splits it on whitespace beforehand (to deal with all the
> > special syntax). So, if you don't use quoted phrases then by the time
> > the field specific query analyzer chain kicks in, the text is already
> > pre-split and the analyzer only sees one (pre space-separated) token
> > at a time. So, the autoGeneratePhraseQueries does not work then. If
> > you use different parsers that send whole text in (e.g. FieldQParser),
> > then - I think - it will work.
> >
> > Or, like you discovered, sow=true tells the Standard Query Parser to
> > send it all together as well.
> >
> > It is a bit of a messy part of Solr, because the Admin Analysis page
> > sends the text to the query analyzer without splitting (it does not
> > use any Query Parser). So, that adds to the confusion.
> >
> > Regards,
> >   Alex.
> >
> > On Tue, 16 Apr 2019 at 10:53, Leonardo Francalanci
> > <leoonar...@yahoo.it.invalid> wrote:
> >>
> > >>   To add some information: using "sow=true" it seems to work. But I don't 
> > >> understand why with "sow=false" it wouldn't work (can't find anything in 
> > >> the docs about sow interaction with autoGeneratePhraseQueries), nor the 
> > >> implications of setting sow=true.
> > >> I've found this: [SOLR-9185] Solr's edismax and "Lucene"/standard query 
> > >> parsers should optionally not split on whitespace before sending terms to 
> > >> analysis (ASF JIRA)
> >>
> >> |
> >> |
> >> |  |
> >> [SOLR-9185] Solr's edismax and "Lucene"/standard query parsers should op...
> >>
> >>
> >>   |
> >>
> >>   |
> >>
> >>   |
> >>
> >>
> >> But it's very low level and I can't find any doc more "user friendly"
> >>
> > >>     On Tuesday, 16 April 2019, 09:00:08 CEST, Leonardo Francalanci 
> > >> <leoonar...@yahoo.it.INVALID> wrote:
> >>
> >>   Hi,
> >>
> > >> I'm using Solr 8.0.0 and I can't get autoGeneratePhraseQueries to work (I 
> > >> also tried 7.7.1, with the same result):
> >>
> > >> "debug":{
> > >>     "rawquerystring":"TROUBLESHOOT:my25word",
> > >>     "querystring":"TROUBLESHOOT:my25word",
> > >>     "parsedquery":"TROUBLESHOOT:my TROUBLESHOOT:25 TROUBLESHOOT:word",
> > >>     "parsedquery_toString":"TROUBLESHOOT:my TROUBLESHOOT:25 TROUBLESHOOT:word",
> >>
> >> I expected something like
> >>
> >> "parsedquery":"TROUBLESHOOT:"my 25 word"
> >> Why isn't autoGeneratePhraseQueries generating a quoted string argument 
> >> when I query???
> >>
> >>
> >> This is my configuration:
> >>
> >>       <dynamicField name="*_txt_en_split" type="text_en_splitting"  
> >> indexed="true"  stored="true"/>
> >>     <fieldType name="text_en_splitting" class="solr.TextField" 
> >> positionIncrementGap="100" autoGeneratePhraseQueries="true">
> >>       <analyzer type="index">
> >>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >>         <!-- in this example, we will only use synonyms at query time
> >>         <filter class="solr.SynonymGraphFilterFactory" 
> >> synonyms="index_synonyms.txt" ignoreCase="true" expand="false"/>
> >>         -->
> >>         <!-- Case insensitive stop word removal.
> >>         -->
> >>         <filter class="solr.StopFilterFactory"
> >>                 ignoreCase="true"
> >>                 words="lang/stopwords_en.txt"
> >>         />
> >>         <filter class="solr.WordDelimiterGraphFilterFactory" 
> >> generateWordParts="1" generateNumberParts="1" catenateWords="1" 
> >> catenateNumbers="1" catenateAll="0" splitOnCaseChange="1"/>
> >>         <filter class="solr.LowerCaseFilterFactory"/>
> >>         <filter class="solr.KeywordMarkerFilterFactory" 
> >> protected="protwords.txt"/>
> >>         <filter class="solr.PorterStemFilterFactory"/>
> >>         <filter class="solr.FlattenGraphFilterFactory" />
> >>       </analyzer>
> >>       <analyzer type="query">
> >>         <tokenizer class="solr.WhitespaceTokenizerFactory"/>
> >>         <filter class="solr.SynonymGraphFilterFactory" 
> >> synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
> >>         <filter class="solr.StopFilterFactory"
> >>                 ignoreCase="true"
> >>                 words="lang/stopwords_en.txt"
> >>         />
> >>         <filter class="solr.WordDelimiterGraphFilterFactory" 
> >> generateWordParts="1" generateNumberParts="1" catenateWords="0" 
> >> catenateNumbers="0" catenateAll="0" splitOnCaseChange="1"/>
> >>         <filter class="solr.LowerCaseFilterFactory"/>
> >>         <filter class="solr.KeywordMarkerFilterFactory" 
> >> protected="protwords.txt"/>
> >>         <filter class="solr.PorterStemFilterFactory"/>
> >>       </analyzer>
> >>     </fieldType>
> >> <field name="TROUBLESHOOT" type="text_en_splitting"  indexed="true" 
> >> stored="true" multiValued="true" omitNorms="true"/>
> >>
>
