Hi Brad,

I was trying this, too, and there is a way to get multi-term synonyms
to work properly. I have already described my solution on this list.

My solution was as follows:

[cite]
After your hints, which partially confirmed my own considerations, I
ran some tests with the FieldQParser. At the beginning I had some
problems, but in the end I was able to solve the problem of multi-word
synonyms at query time in a way that is suitable for us - and possibly
for others, too.

For my solution, I re-used the FieldQParserPlugin. First I ported it
to the new API (incrementToken instead of next, etc.), and then I
modified the code so that it creates only BooleanQueries and no
PhraseQueries.

Now, with my new QParserPlugin, which is based on the FieldQParserPlugin,
it's possible to search for things like "foo bar baz", where "foo bar"
has to be changed to "foo_bar". At the end, the tokens "foo_bar" and
"baz" are created, so that both can match independently.
[/cite]
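
To make the idea in the cited text more concrete, here is a minimal
sketch. It is not the actual plugin code and does not use the Lucene or
Solr API; it only uses plain Java collections. The class name, the method
names, the three-word limit and the choice to join clauses with OR are
all assumptions made for illustration. The point it shows: the whole
query string is analyzed in one piece, so a multi-word synonym can be
recognized, and each resulting token then becomes its own boolean clause
instead of one phrase.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;

// Hypothetical illustration only: plain Java, no Lucene/Solr classes.
final class MultiWordSynonymSketch {

    // Lower-case the raw query and split it on whitespace. The important part
    // is that this happens on the whole string, not on pre-split words.
    static List<String> tokenize(String query) {
        String trimmed = query.trim();
        if (trimmed.isEmpty()) {
            return new ArrayList<>();
        }
        return new ArrayList<>(Arrays.asList(trimmed.toLowerCase().split("\\s+")));
    }

    // Walk the tokens left to right and replace the longest run of up to
    // maxWords tokens that has an entry in the synonym map.
    static List<String> applySynonyms(List<String> tokens,
                                      Map<String, String> synonyms,
                                      int maxWords) {
        List<String> out = new ArrayList<>();
        int i = 0;
        while (i < tokens.size()) {
            boolean matched = false;
            for (int len = Math.min(maxWords, tokens.size() - i); len >= 1; len--) {
                String candidate = String.join(" ", tokens.subList(i, i + len));
                String replacement = synonyms.get(candidate);
                if (replacement != null) {
                    out.add(replacement);
                    i += len;
                    matched = true;
                    break;
                }
            }
            if (!matched) {
                out.add(tokens.get(i));
                i++;
            }
        }
        return out;
    }

    // Build one clause per token (joined with OR here, which is an assumption)
    // rather than a single phrase over all tokens.
    static String toBooleanQuery(String field, List<String> tokens) {
        List<String> clauses = new ArrayList<>();
        for (String token : tokens) {
            clauses.add(field + ":" + token);
        }
        return String.join(" OR ", clauses);
    }

    static String rewrite(String field, String query, Map<String, String> synonyms) {
        return toBooleanQuery(field, applySynonyms(tokenize(query), synonyms, 3));
    }
}
```

With a synonym map that contains only "foo bar" => "foo_bar", calling
rewrite("text", "foo bar baz", ...) gives two separate clauses, one for
foo_bar and one for baz. In a real QParser the same steps would run
through the field's analyzer and produce BooleanQuery clauses.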

We have since re-worked our current version again, so that multi-field
queries are possible as well.

If you want to use such a solution, you will probably have to go without
complex query parsing and the like. You also have to write your own
modified QParser that fits your special needs. Some more advanced
features, like those offered by other QParsers, could be integrated as
well. It's all up to you and your needs.



Patrick



brad anderson wrote:
> Thanks for the help. Can't believe I missed that part in the wiki.
> 
> 2009/11/24 Tom Hill <solr-l...@worldware.com>
> 
>> Hi Brad,
>>
>>
>> I suspect that this section from the wiki for SynonymFilterFactory might be
>> relevant:
>>
>>
>> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory
>>
>> *"Keep in mind that while the SynonymFilter will happily work with synonyms
>> containing multiple words (ie: "**sea biscuit, sea biscit, seabiscuit**")
>> The recommended approach for dealing with synonyms like this, is to expand
>> the synonym when indexing. This is because there are two potential issues
>> that can arise at query time:*
>>
>>   1.
>>
>>   *The Lucene QueryParser tokenizes on white space before giving any text
>>   to the Analyzer, so if a person searches for the words **sea biscit** the
>>   analyzer will be given the words "sea" and "biscit" separately, and will
>>   not know that they match a synonym."*
>>
>>   ...
>>
>> Tom
>>
>> On Tue, Nov 24, 2009 at 10:47 AM, brad anderson <solrinter...@gmail.com> wrote:
>>> Hi Folks,
>>>
>>> I was trying to get multi term synonyms to work. I'm experiencing some
>>> strange behavior and would like some feedback.
>>>
>>> In the synonyms file I have the line:
>>>
>>>     thomas, boll holly, thomas a, john q => tom
>>>
>>> And I have a document with the text field as:
>>>
>>>     tom
>>>
>>> However, when I do a search on boll holly, it does not return the
>>> document with tom. The same thing happens if I do a query on john q.
>>> But if I do a query on thomas, it gives me the document. Also, if I
>>> quote "boll holly" or "john q" it gives back the document.
>>>
>>> When I look at the analyzer page on the solr admin page, it is
>>> transforming "boll holly" to "tom" when it isn't quoted. Why is it that
>>> it is not returning the document? Is there some configuration I can make
>>> so it does return the document if I do an unquoted search on "boll holly"?
>>>
>>> My synonym filter is defined as follows, and is only defined on the query
>>> side:
>>>
>>> <filter class="solr.SynonymFilterFactory"
>>> synonyms="synonyms.txt" ignoreCase="true" expand="true"/>
>>>
>>>
>>> I've also tried changing the synonym file to be
>>>
>>> tom, thomas, boll holly, thomas a, john q
>>>
>>> This produces the same results.
>>>
>>> Thanks,
>>> Brad
>>>
> 
