Just escape the special characters in the URL with a backslash, or put the entire URL in quotes. The slash is particularly problematic because it introduces a regular expression in the query syntax. Dismax has a less sophisticated syntax and automatically escapes more special characters.
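
For what it's worth, here is a rough client-side sketch of both options in Python. The helper names (escape_term, quote_phrase) are made up for illustration, and the character list is my reading of the Lucene query syntax special characters, so treat it as an assumption and check it against your Solr version. The field name and parameters are taken from the query in the original message.

```python
import re
from urllib.parse import urlencode

# Characters assumed to be special in the Lucene/Solr query syntax.
# "&&" and "||" are covered by escaping "&" and "|" individually.
_SPECIAL = re.compile(r'([+\-!(){}\[\]^"~*?:\\/&|])')


def escape_term(text: str) -> str:
    """Prefix every special character with a backslash (hypothetical helper)."""
    return _SPECIAL.sub(r'\\\1', text)


def quote_phrase(text: str) -> str:
    """Wrap the text in double quotes, escaping embedded backslashes and quotes."""
    return '"' + text.replace('\\', '\\\\').replace('"', '\\"') + '"'


url = "facebook.com/profile.php?id=1571031169"
escaped = escape_term(url)   # facebook.com\/profile.php\?id=1571031169
quoted = quote_phrase(url)   # "facebook.com/profile.php?id=1571031169"

# Build the request parameters; urlencode percent-encodes the backslashes too.
params = urlencode({
    "q": escaped,
    "defType": "edismax",
    "qf": "url_url",
    "mm": "1",
})
print(params)
```

Either value can be sent as q. Note that the quoted form becomes a phrase query, so whether it matches will depend on how the field's analyzer tokenizes the URL.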

-- Jack Krupansky

-----Original Message-----
From: heaven
Sent: Sunday, August 11, 2013 8:53 AM
To: solr-user@lucene.apache.org
Subject: Edismax vs Dismax

Hi, the application I am working on switched to the edismax parser and I found
some weird behavior.

I have this field:
<fieldType name="url" class="solr.TextField" omitNorms="false">
  <analyzer type="index">
    <tokenizer class="solr.PatternTokenizerFactory" pattern="\b" />
    <filter class="solr.StopFilterFactory" words="url_stopwords.txt" ignoreCase="true"/>
    <filter class="solr.EdgeNGramFilterFactory" minGramSize="1" maxGramSize="20" />
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
  <analyzer type="query">
    <tokenizer class="solr.PatternTokenizerFactory" pattern="\b" />
    <filter class="solr.StopFilterFactory" words="url_stopwords.txt" ignoreCase="true"/>
    <filter class="solr.LowerCaseFilterFactory" />
  </analyzer>
</fieldType>

The string that is indexed is: facebook.com/profile.php?id=123456789

When I use the dismax parser the query returns one result, but with edismax
it returns 0. Here are the queries I tried:
1 result:
fq=type%3ASite&sort=score+desc&q=facebook.com%2Fprofile.php%3Fid%3D1571031169&fl=%2A+score&qf=url_url&defType=dismax&mm=1&start=0&rows=20&

0 results:
fq=type%3ASite&sort=score+desc&q=facebook.com%2Fprofile.php%3Fid%3D1571031169&fl=%2A+score&qf=url_url&defType=edismax&mm=1&start=0&rows=20&

Can someone please help me figure this out?

Thank you,
Alex



--
View this message in context: http://lucene.472066.n3.nabble.com/Edismax-vs-Dismax-tp4083812.html
Sent from the Solr - User mailing list archive at Nabble.com.
