I think you need the feature in SOLR-629, which adds fuzzy matching to edismax.

https://issues.apache.org/jira/browse/SOLR-629

The patch on that issue is for Solr 4.x, but I believe someone is working on a 
new patch.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Feb 7, 2018, at 2:10 AM, Emir Arnautović <emir.arnauto...@sematext.com> 
> wrote:
> 
> Hi Sravan,
> Edismax has a 'sow' (split on whitespace) parameter; with sow=false, edismax
> passes the whole query to field analysis, but I am not sure how it will work
> with fuzzy search. What you might do is use the _query_ syntax to separate
> the shingle and non-shingle queries, e.g.
> q=_query_:"{!edismax sow=false qf=title_bigrams v=$v}" OR _query_:"{!edismax qf=title v=$v}"&v=some movie title
> 
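A minimal sketch of how a client might assemble the two-subquery request suggested above. It assumes Solr's nested-query form `_query_:"{!parser ... v=$param}"` with the default (lucene) parser handling the outer `q`. The field names `title_bigrams` and `title` come from the example; the parameter name `userq` and the helper `build_params` are hypothetical, chosen only for illustration. Whether `sow=false` interacts well with fuzzy operators is not established in this thread.

```python
from urllib.parse import urlencode


def build_params(user_query: str) -> dict:
    """Build request parameters that run one user input through two edismax subqueries.

    `userq` is a hypothetical parameter name; both subqueries dereference it via v=$userq.
    """
    q = (
        '_query_:"{!edismax sow=false qf=title_bigrams v=$userq}" OR '
        '_query_:"{!edismax qf=title v=$userq}"'
    )
    return {"q": q, "userq": user_query}


params = build_params("some movie title")

# urlencode percent-encodes the braces, quotes and '$' so the string is safe in a URL.
query_string = urlencode(params)
print(query_string)
```

Keeping the user input in its own parameter means it is written once and never has to be escaped inside the quoted subquery strings.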
> HTH,
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
> 
> 
> 
>> On 7 Feb 2018, at 10:55, Sravan Kumar <sra...@caavo.com> wrote:
>> 
>> We have the following two fields for our movie title search
>> - title without symbols
>> a custom analyser with WordDelimiterFilterFactory, SynonymFilterFactory and
>> other filters to retain only alphanumeric characters.
>> - title with word bi grams
>> a custom analyser with solr.ShingleFilterFactory to generate "bi gram" word
>> tokens with '_' as separator.
>> 
>> A custom similarity class is used to fix the tf and idf values at 1.
>> 
>> Edismax query parser is used to perform all searches. Phrase boosting (pf)
>> is also used.
>> 
>> There are a couple of issues while searching:
>> 1>  BiGram field doesn't generate bi grams if the white spaces in the query
>> are not escaped.
>> - For example, if the query is "pursuit of happyness", then bi grams are
>> not generated. This is because the edismax query parser tokenizes on
>> whitespace before passing the string to the analyser (correct me if I am
>> wrong).
>> But in the case of "pursuit\ of\ happyness", they are generated, because
>> the string passed to the analyser still contains the whitespace.
>> 
>> 2>  Fuzzy search doesn't work in whitespace-escaped queries.
>> Ex: "pursuit~2\ of\ happiness~1"
>> 
>> 3> Edismax's Phrase boosting doesn't work the way it should in
>> non-whitespace escaped fuzzy queries.
>> 
>> If the query is "pursuit~2 of happiness~1" (without escaping whitespace),
>> the fuzzy queries (title_name:pursuit~2) and (title_name:happiness~1) are
>> generated in the parsed query.
>> But edismax pf (phrase boost) generates a query like
>> title_name:"pursuit (2 pursuit2) of happiness (1 happiness1)"
>> This means that, for phrase boosting, the analyser received the original
>> query string, including the fuzzy operators.
>> 
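The first issue above can be illustrated with a toy model. This is not Solr code: `bigrams` and `analyze` are hypothetical stand-ins for a size-2 word-shingle filter with '_' as separator, and the unigrams a real shingle filter may also emit are ignored for simplicity. The sketch only shows why an analyzer that receives one word at a time has nothing to pair.

```python
def bigrams(tokens, sep="_"):
    """Join each adjacent pair of tokens, roughly what a size-2 word shingle does."""
    return [f"{a}{sep}{b}" for a, b in zip(tokens, tokens[1:])]


def analyze(text):
    """Toy analyzer: lowercase, tokenize on whitespace, then build bigrams."""
    return bigrams(text.lower().split())


query = "pursuit of happyness"

# Case 1: the parser splits on whitespace first, so the analyzer is called
# once per word and never sees two adjacent tokens.
split_first = [analyze(chunk) for chunk in query.split()]

# Case 2: the whole string reaches the analyzer in one call
# (the situation with escaped whitespace or sow=false).
whole = analyze(query)

print(split_first)  # [[], [], []]
print(whole)        # ['pursuit_of', 'of_happyness']
```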
>> 
>> 1> How should whitespace be handled with filters like
>> solr.ShingleFilterFactory to generate bi grams?
>> 2> If generating bi grams requires escaped whitespace and fuzzy search does
>> not, how do we accommodate both in a single Solr request so that they are
>> scored together?
>> 
>> 
>> 
>> -- 
>> Regards,
>> Sravan
> 
