Hi Matthew,

On 08/18/2008 at 1:39 PM, Matthew Runo wrote:
> <fieldType name="text" class="solr.TextField" positionIncrementGap="100"> 
>   <analyzer> 
> [...]
>     <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>             ignoreCase="true" expand="true"/>
> [...]

I can see from SOLR-702 that most of your synonym rules have a single 
term/phrase on the right-hand side.

The SynonymFilterFactory section of the AnalyzersTokenizersTokenFilters wiki 
page 
<http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#head-2c461ac74b4ddd82e453dc68fcfc92da77358d46>
 says:

   # If expand==true, "ipod, i-pod, i pod" is equivalent to the explicit mapping:
   ipod, i-pod, i pod => ipod, i-pod, i pod

AFAICT from looking at the code, however, the "expand" option is ignored when 
there is an explicit right-hand side of a rule (i.e. "=> something").

> a, b, c d e, f, g => something

So documents containing "c d e" (or "a" or "b" or "f" or "g") will only be 
indexed with "something".

> I'm having the behavior that searches for a, b, f, and g all
> work, but the c d e does not.

As Otis mentioned earlier in this thread, the above-linked wiki page mentions 
some gotchas about mixing phrases, synonyms, and the Lucene QueryParser.

Perhaps you could address the problem by creating separate rules for your 
phrasal terms, e.g.:

   a, b, f, g => something
   c d e, something

With the second rule, which has no right-hand side, and with expand==true, both
"c d e" and "something" will be indexed for documents containing "c d e".
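
To make the difference concrete, here is a rough Python sketch of how I read the
rule handling. It is a toy model only, not Solr's code: parse_rule and build_map
are hypothetical names, and it covers just the rule-to-mapping step, not
tokenization or anything the QueryParser does to phrases.

```python
def parse_rule(line, expand=True):
    """Turn one synonyms.txt-style rule into {input phrase: [output phrases]}."""
    if "=>" in line:
        # Explicit mapping: each left-hand entry maps to the right-hand side only.
        # The expand flag plays no part here (my reading of the code, see above).
        lhs, rhs = line.split("=>", 1)
        inputs = [t.strip() for t in lhs.split(",") if t.strip()]
        outputs = [t.strip() for t in rhs.split(",") if t.strip()]
    else:
        inputs = [t.strip() for t in line.split(",") if t.strip()]
        # Per the wiki: expand==true maps every entry to all entries,
        # expand==false maps every entry to the first one.
        outputs = inputs if expand else inputs[:1]
    return {phrase: list(outputs) for phrase in inputs}


def build_map(rules, expand=True):
    """Combine several rules, merging outputs when an input phrase repeats."""
    mapping = {}
    for rule in rules:
        for phrase, outputs in parse_rule(rule, expand).items():
            merged = mapping.setdefault(phrase, [])
            for out in outputs:
                if out not in merged:
                    merged.append(out)
    return mapping


# Original single rule: the phrase is replaced by "something".
single = build_map(["a, b, c d e, f, g => something"])
print(single["c d e"])   # ['something']

# Split rules: the phrase is kept alongside "something".
split = build_map(["a, b, f, g => something", "c d e, something"])
print(split["c d e"])    # ['c d e', 'something']
print(split["a"])        # ['something']
```

Note that in this model the second rule is symmetric, so a document containing
"something" would also pick up "c d e".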

Steve
