Re: How to use stopwords, synonyms along with fuzzy match in a SOLR

Erick Erickson Wed, 08 May 2019 22:14:35 -0700

Well, I’d start by adding debug=true to the request. That’ll show you the parsed 
query as well as why certain documents scored the way they did. But do note that 
q=junk~ will search against the default text field (the “df” parameter in the 
request handler definition in solrconfig.xml). Is that what you’re expecting?


Or, I suppose, it’s searching against the fields defined if you’re using 
(e)dismax as your query parser. But the debug output (the parsed query part) will 
show what the actual search is.
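
To make the debug suggestion concrete, here is a minimal Python sketch that builds such a request URL. The host and port (localhost:8983) and the helper name build_debug_url are assumptions for illustration only; the collection name and field name are taken from the original question.

```python
from urllib.parse import urlencode, urlsplit, parse_qs


def build_debug_url(base_url, query, default_field=None):
    """Build a Solr select URL with debug output switched on."""
    params = {"q": query, "debug": "true"}
    if default_field is not None:
        # Naming the field explicitly removes any doubt about what is searched.
        params["df"] = default_field
    return base_url + "?" + urlencode(params)


# Hypothetical host and port; adjust to your own installation.
url = build_debug_url(
    "http://localhost:8983/solr/collection1/select",
    "junk~",
    default_field="title_autoComplete",
)
print(url)

# Parse the query string back to confirm what would be sent.
sent = parse_qs(urlsplit(url).query)
print(sent)
```

The parsed query should then appear in the debug section of the response, which is the part to compare against what you expected to be searched.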

You should also look at the admin/analysis page. For instance, the way you have 
the field defined at index time, it’ll break on whitespace. But “junk.” won’t 
be caught by the stop filter because your stopword doesn’t contain the period.
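
The period problem can be illustrated with a toy model in Python. This is only a sketch of whitespace tokenization followed by stopword removal, not Solr's actual implementation; the function names and the sample text are made up for the example.

```python
def whitespace_tokenize(text):
    """Split on whitespace only, so punctuation stays attached to tokens."""
    return text.split()


def remove_stopwords(tokens, stopwords, ignore_case=True):
    """Drop tokens that exactly match a stopword (optionally ignoring case)."""
    if ignore_case:
        stop = {w.lower() for w in stopwords}
        return [t for t in tokens if t.lower() not in stop]
    stop = set(stopwords)
    return [t for t in tokens if t not in stop]


tokens = whitespace_tokenize("buy junk. or Junk")
kept = remove_stopwords(tokens, {"junk"})
print(tokens)  # ['buy', 'junk.', 'or', 'Junk']
print(kept)    # ['buy', 'junk.', 'or']
```

The token with the trailing period survives because it is not an exact match for the stopword entry, which is the kind of thing the analysis page makes visible.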

Plus, your EdgeNGramFilterFactory is pretty strange. A min gram size of 1 means 
you’re searching for single characters.
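
As a rough illustration of what a min gram size of 1 produces, here is a small Python sketch of front-edge n-gram generation. It is a simplified model, not the Solr filter itself, and the function name edge_ngrams is hypothetical.

```python
def edge_ngrams(token, min_gram=1, max_gram=50):
    """Return the prefixes of token with lengths from min_gram to max_gram."""
    upper = min(len(token), max_gram)
    return [token[:n] for n in range(min_gram, upper + 1)]


grams_min1 = edge_ngrams("jack", min_gram=1)
grams_min2 = edge_ngrams("jack", min_gram=2)
print(grams_min1)  # ['j', 'ja', 'jac', 'jack']
print(grams_min2)  # ['ja', 'jac', 'jack']
```

With a minimum of 1, every token contributes a single-character term to the index, so a very short input can match a large share of the documents.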

So what I’d do is back off the definition and build it up bit by bit to see 
if/when you have this problem. But if stopwords are working correctly at index 
time, then “junk” will not be _in_ the index, so it’ll be impossible to find, 
fuzzy search or not. So you’re making some assumptions that aren’t true, 
and the analysis process combined with looking at the parsed query should show 
you quite a lot.
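
To see why junk~ can still return “jack” even when “junk” itself is not in the index, here is a minimal Python sketch of edit-distance matching against a list of indexed terms. It assumes plain Levenshtein distance and a maximum of 2 edits (the default for a bare ~ as far as I know); Solr’s real fuzzy matching is implemented differently, and the term list is invented for the example.

```python
def levenshtein(a, b):
    """Classic dynamic-programming edit distance (insert, delete, substitute)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution
        prev = curr
    return prev[-1]


def fuzzy_matches(query_term, indexed_terms, max_edits=2):
    """Return the indexed terms within max_edits of the query term."""
    return [t for t in indexed_terms if levenshtein(query_term, t) <= max_edits]


# Hypothetical index contents; "junk" is absent, as it would be after stopword removal.
indexed_terms = ["jack", "jacket", "lamp"]
distance = levenshtein("junk", "jack")
matches = fuzzy_matches("junk", indexed_terms)
print(distance)  # 2
print(matches)   # ['jack']
```

In this toy model the match on “jack” comes purely from edit distance between the query term and terms that are in the index, whether or not “junk” was ever indexed.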

Best,
Erick

> On May 8, 2019, at 4:43 PM, bbarani <bbar...@gmail.com> wrote:
> 
> Hi,
> Is there a way to use stopwords and fuzzy match in a SOLR query?
> 
> The query below also matches 'jack'. I added 'junk' to the stopwords (in
> query) to avoid returning results, but it looks like it's not honoring the
> stopwords when using the fuzzy search.
> 
> solr/collection1/select?app-qf=title_autoComplete&hl=false&fl=*&group=true&group.limit=-1&group.sort=marketingSequence%20asc&group.field=productId&group.ngroups=true&facet=on&facet.field=categoryFilter&sort=defaultMarketingSequence%20asc&q=junk~
> 
> 
>    <fieldType name="edgytext" class="solr.TextField">
>        <analyzer type="index">
>            <filter class="solr.StopFilterFactory" words="stopwords.txt"
> ignoreCase="true"/>
>            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>            <filter class="solr.LowerCaseFilterFactory"/>
>            <filter class="solr.PorterStemFilterFactory"/>
>            <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>            <filter class="solr.SynonymFilterFactory" ignoreCase="true"
> synonyms="synonyms.txt"/>
>            <filter class="solr.WordDelimiterFilterFactory"
> catenateNumbers="0" generateNumberParts="0" generateWordParts="0"
> preserveOriginal="1" catenateAll="0" catenateWords="1"/>
>            <filter class="solr.EdgeNGramFilterFactory" maxGramSize="50"
> minGramSize="1"/>
>        </analyzer>
>        <analyzer type="query">
>            <filter class="solr.StopFilterFactory" words="stopwords.txt"
> ignoreCase="true"/>
>            <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>            <filter class="solr.LowerCaseFilterFactory"/>
>            <filter class="solr.PorterStemFilterFactory"/>
>            <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
>            <filter class="solr.SynonymFilterFactory" ignoreCase="true"
> synonyms="synonyms.txt"/>
>            <filter class="solr.WordDelimiterFilterFactory"
> catenateNumbers="0" generateNumberParts="0" generateWordParts="0"
> preserveOriginal="1" catenateAll="0" catenateWords="1"/>
>        </analyzer>
>    </fieldType>
> 
> 
> 
> --
> Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html
