I have run into exactly the same problem. One thing I have found: with autoGeneratePhraseQueries=true, a query for z-score does match a document whose index contains 'z score', whereas with false it does not. As for your specific case, where the index holds the single token zscore and the query is z-score, I'm still stumped. Hopefully someone else can answer this.

On Wed, Oct 30, 2013 at 11:56 AM, Vardhan Dharnidharka <vardhan1...@hotmail.com> wrote:

> Hi,
>
> The query z-score doesn't match a doc with zscore in the index. The
> analysis tool shows that this query would match this data in the index, but
> it's the edismax query parser step that seems to screw things up. Is there
> some combination of autoGeneratePhraseQueries, WordDelimiterFilterFactory
> parameters, and/or something else I can change or add to generically make
> the query match without modifying the mm? ie. without adding a rule to
> specifically synonymize or split the term "zscore" with some dictionary of
> words.
>
> The query I want to match but doesn't:
> z-score
> mm=-30%
>
> In the index:
> zscore
>
> The analyzer:
>
> <fieldType autoGeneratePhraseQueries="false" class="solr.TextField"
>     name="lowStopText" positionIncrementGap="100">
>
>   <analyzer type="index">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter catenateAll="1" catenateNumbers="1" catenateWords="1"
>         class="solr.WordDelimiterFilterFactory" preserveOriginal="1"
>         splitOnCaseChange="0" splitOnNumerics="0" types="wdfftypes.txt"/>
>     <filter class="solr.ICUFoldingFilterFactory"/>
>   </analyzer>
>
>   <analyzer type="query">
>     <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>     <filter catenateAll="1" catenateNumbers="1" catenateWords="1"
>         class="solr.WordDelimiterFilterFactory" preserveOriginal="1"
>         splitOnCaseChange="0" splitOnNumerics="0" types="wdfftypes.txt"/>
>     <filter class="solr.ICUFoldingFilterFactory"/>
>     <filter class="solr.StopFilterFactory" enablePositionIncrements="true"
>         ignoreCase="true" words="stopwords.txt"/>
>   </analyzer>
> </fieldType>
>
> The parsed edismax query with autoGeneratePhraseQueries=true:
> "+(def_term:\"(z-score z) (score zscore)\")"
>
> The parsed edismax query with autoGeneratePhraseQueries=false:
> "+(((def_term:z-score def_term:z def_term:score def_term:zscore)~3))"
>
> Thanks
> Vardhan
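
For what it's worth, the two parsed queries quoted above can be checked by hand with a rough Python sketch. This is not Solr code. It assumes that wdfftypes.txt leaves "zscore" as a single indexed token, that a percentage mm is rounded down before being applied, and that the phrase query needs one term from each group at consecutive positions. All function names here are made up for illustration.

```python
def min_should_match(num_clauses, mm_percent):
    """Number of optional clauses that must match for a percentage mm.

    A negative percentage is read as "this share of clauses may be missing",
    with the computed share rounded down (assumed edismax behaviour).
    """
    if mm_percent < 0:
        return num_clauses - (num_clauses * -mm_percent) // 100
    return (num_clauses * mm_percent) // 100


def count_matching_clauses(query_terms, doc_positions):
    """Count query terms that occur anywhere in the document."""
    doc_terms = set().union(*doc_positions)
    return sum(1 for term in query_terms if term in doc_terms)


def phrase_matches(phrase_positions, doc_positions):
    """True if each phrase position has a term at consecutive doc positions."""
    n = len(phrase_positions)
    for start in range(len(doc_positions) - n + 1):
        if all(phrase_positions[j] & doc_positions[start + j] for j in range(n)):
            return True
    return False


# Terms taken from the parsed queries in the quoted message.
query_terms = ["z-score", "z", "score", "zscore"]
phrase_positions = [{"z-score", "z"}, {"score", "zscore"}]

# Assumed index contents: one set of tokens per position.
doc_zscore = [{"zscore"}]          # document containing "zscore"
doc_z_score = [{"z"}, {"score"}]   # document containing "z score"

required = min_should_match(len(query_terms), -30)
print("clauses required:", required)
for name, doc in [("zscore", doc_zscore), ("z score", doc_z_score)]:
    matched = count_matching_clauses(query_terms, doc)
    print(name, "| clauses matched:", matched,
          "| mm satisfied:", matched >= required,
          "| phrase matches:", phrase_matches(phrase_positions, doc))
```

Under those assumptions, mm=-30% over four clauses requires three of them, which agrees with the ~3 in the parsed query. A document holding only zscore satisfies one clause and has no second position for the phrase, so it fails both forms. A document holding 'z score' satisfies two clauses, which is still short of three, but it does match the phrase form.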