Re: Solr 6.4 new SynonymGraphFilter help for multi-word synonyms

Steve Rowe Thu, 02 Feb 2017 10:02:55 -0800

Hi Cliff,

The Solr query parsers (at least standard/“Lucene” and e/dismax) have a problem 
that prevents SynonymGraphFilter from working: the text fed to your query 
analyzer is first split on whitespace.  So e.g. a query containing “United 
States” will never match the multi-word synonym “United States” -> “US”, since 
the analyzer will first see “United” and then, separately, “States”.
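
To make the effect concrete, here is a toy Python sketch.  It is only an 
illustration: the helper names (analyze, parse_split_on_whitespace, 
parse_whole_text) and the dictionary-based synonym table are hypothetical and 
are not Lucene or Solr APIs, and the real SynonymGraphFilter produces a token 
graph rather than a flat list.

```python
# Toy model only: hypothetical helpers, NOT Lucene/Solr APIs.
# One multi-word synonym rule: "United States" -> "US".
SYNONYMS = {("united", "states"): ["us"]}


def analyze(text):
    """Stand-in for a query analyzer: tokenize, lowercase, apply synonym rules."""
    tokens = text.lower().split()
    out, i = [], 0
    while i < len(tokens):
        for source, target in SYNONYMS.items():
            # A rule matches only if all of its source tokens are visible here.
            if tuple(tokens[i:i + len(source)]) == source:
                out.extend(target)
                i += len(source)
                break
        else:
            # No rule matched at this position: keep the token unchanged.
            out.append(tokens[i])
            i += 1
    return out


def parse_split_on_whitespace(query):
    """Mimics the problem: each whitespace-separated chunk is analyzed alone."""
    result = []
    for chunk in query.split():
        result.extend(analyze(chunk))
    return result


def parse_whole_text(query):
    """Hands the full query text to the analyzer in one call."""
    return analyze(query)


print(parse_split_on_whitespace("United States"))
print(parse_whole_text("United States"))
```

In the split-first version the analyzer is only ever handed one word at a 
time, so the two-token rule can never match; when the whole text is passed in 
one call, the rule matches.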


I fixed the whitespace splitting problem in the classic Lucene query parser in 
<https://issues.apache.org/jira/browse/LUCENE-2605>.  (Note that this is *not* 
the same as Solr’s standard/“Lucene” query parser, which is actually a fork of 
Lucene’s query parser with added functionality.)

There is a Solr JIRA I’m working on to fix the whitespace splitting problem: 
<https://issues.apache.org/jira/browse/SOLR-9185>.  I hope to get it committed 
in time for inclusion in Solr 6.5.

--
Steve
www.lucidworks.com

> On Feb 2, 2017, at 9:50 AM, Shawn Heisey <apa...@elyograg.org> wrote:
> 
> On 2/2/2017 7:36 AM, Cliff Dickinson wrote:
>> The SynonymGraphFilter API documentation contains the following statement
>> at the end:
>> 
>> "To get fully correct positional queries when your synonym replacements are
>> multiple tokens, you should instead apply synonyms using this TokenFilter
>> at query time and translate the resulting graph to a TermAutomatonQuery
>> e.g. using TokenStreamToTermAutomatonQuery."
> 
> Lucene is a programming API for search.  That documentation is intended
> for people who are writing Lucene programs.  Those users would be
> constructing query objects in their own code, so they would most likely
> know exactly which object needs to be changed to TermAutomatonQuery.
> 
> Solr is a Lucene program ... and an immensely complicated one.  Many
> Lucene improvements require changes in the end program for full
> support.  I suspect that Solr's capability has not been updated to use
> this new feature in Lucene.  I cannot say for sure; I hope someone who
> is familiar with this Lucene change and Solr internals can comment.
> 
> Thanks,
> Shawn
>
