Re: Searching w/explicit Multi-Word Synonym Expansion

Jack Krupansky Wed, 17 Jul 2013 07:32:45 -0700

To the best of my knowledge, there is no patch or collection of patches which constitutes a "working solution" - just partial solutions.

Yes, it is true, there is some FST work underway (active??) that shows promise depending on query parser implementation, but again, this is all a longer-term future, not a "here and now". Maybe in the 5.0 timeframe?

I don't want anyone to get the impression that there are off-the-shelf patches that completely solve the synonym phrase problem. Yes, progress is being made, but we're not there yet.


-- Jack Krupansky

-----Original Message-----
From: Roman Chyla
Sent: Wednesday, July 17, 2013 9:58 AM
To: solr-user@lucene.apache.org
Subject: Re: Searching w/explicit Multi-Word Synonym Expansion

Hi all,

What I find very 'sad' is that Lucene/SOLR contain all the necessary
components for handling multi-token synonyms; the Finite State Automaton
works perfectly for matching these items; the biggest problem is IMO the
old query parser, which splits on spaces and doesn't know to be
smarter.
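
To illustrate the point about splitting on spaces, here is a minimal Python
sketch (not Lucene code). The synonym table and the function name
expand_per_token are made up for the example; the only assumption is that the
parser hands each whitespace-separated token to the synonym lookup on its own.

```python
# Hypothetical synonym table: a three-token phrase maps to one synonym.
SYNONYMS = {("hubble", "space", "telescope"): ["hst"]}


def expand_per_token(query):
    """Mimic a parser that splits on whitespace and analyzes each token alone."""
    expanded = []
    for token in query.lower().split():
        # Each lookup sees a single token, so a multi-token key can never match.
        expanded.append([token] + SYNONYMS.get((token,), []))
    return expanded


result = expand_per_token("Hubble Space Telescope")
print(result)  # [['hubble'], ['space'], ['telescope']]
```

The synonym "hst" never appears, because no single token equals the
three-token key.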

THIS IS A LONG-TIME PROBLEM - THERE EXIST SEVERAL WORKING SOLUTIONS (but
none was committed...sigh, we are re-inventing the wheel all the time...)

LUCENE-1622
LUCENE-4381
LUCENE-4499

The problem of synonym expansion is more difficult because of the parsing -
the default parsers are not flexible and they split on whitespace -
recently I have proposed a solution which also makes multi-token
synonym expansion simple

this is the ticket:
https://issues.apache.org/jira/browse/LUCENE-5014

that query parser is able to split on spaces, then look back and do a second
pass to see whether to expand with synonyms - and even discover different
parse paths and construct different queries based on that. If you want to
see some complex examples, look at:
https://github.com/romanchyla/montysolr/blob/master/contrib/adsabs/src/test/org/apache/solr/analysis/TestAdsabsTypeFulltextParsing.java
- e.g. lines 373, 483
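
The two-pass idea described above can be sketched roughly as follows. This is
a Python illustration only, not the LUCENE-5014 parser; the synonym table and
the function name parse_paths are hypothetical. The first pass splits on
spaces, the second pass walks the token list and, wherever a multi-token
phrase from the table starts, branches into an alternative parse path that
uses the synonym instead.

```python
# Hypothetical synonym table keyed by token tuples.
SYNONYMS = {("hubble", "space", "telescope"): ["hst"]}


def parse_paths(tokens, start=0):
    """Return every token sequence reachable from position `start`."""
    if start == len(tokens):
        return [[]]
    paths = []
    # Path A: keep the current token as it is.
    for rest in parse_paths(tokens, start + 1):
        paths.append([tokens[start]] + rest)
    # Path B: a multi-token phrase starts here, so substitute its synonym.
    for phrase, synonyms in SYNONYMS.items():
        end = start + len(phrase)
        if tuple(tokens[start:end]) == phrase:
            for synonym in synonyms:
                for rest in parse_paths(tokens, end):
                    paths.append([synonym] + rest)
    return paths


tokens = "hubble space telescope images".split()  # first pass: split on spaces
paths = parse_paths(tokens)                       # second pass: look for phrases
print(paths)
# [['hubble', 'space', 'telescope', 'images'], ['hst', 'images']]
```

Each returned path could then be turned into its own query clause and the
clauses OR-ed together; how a real parser builds those queries is outside the
scope of this sketch.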

Lucene/SOLR developers are already doing great work and have much to do -
they need help from everybody who is able to apply a patch, test it and
report back to JIRA.

roman

On Wed, Jul 17, 2013 at 9:37 AM, dmarini <david.marini...@gmail.com> wrote:

iorixxx,

Thanks for pointing me in the direction of the QueryElevation component. If
it did not require that the target documents be keyed by the unique key
field it would be ideal, but since our Sku field is not the Unique field (we
have an internal id which serves as the key while this is the client's key)
it doesn't seem like it will match unless I make a larger scope change.

Jack,

I agree that out of the box there hasn't been a generalized solution for
this yet. I guess what I'm looking for is confirmation that I've gone as far
as I can properly and from this point need to consider using something like
the HON custom query parser component (which we're leery of using because
from my reading it solves a specific scenario that may overcompensate what
we're attempting to fix). I would personally rather stay IN solr than add
custom .jar files from around the web if at all possible.

Thanks for the replies.

--Dave





--
View this message in context:
http://lucene.472066.n3.nabble.com/Searching-w-explicit-Multi-Word-Synonym-Expansion-tp4078469p4078610.html
Sent from the Solr - User mailing list archive at Nabble.com.