Wow, yes that works!

So the problem is in the ExtendedDismaxQParser, because it can't
handle synonym expansion?

From the analysis output of SynonymFilterFactory I can see that the
type is correctly set to SYNONYM.

But is it correct that endOffset has the same value for all synonyms?
Does it mark the end of the original word rather than the end of the
replacing synonym?


Regards
Bernd


On 06.10.2011 16:13, Ahmet Arslan wrote:
What happens when you switch to the lucene query parser?
E.g. when you add &defType=lucene to your search URL?

--- On Thu, 10/6/11, Bernd Fehling<bernd.fehl...@uni-bielefeld.de>  wrote:

From: Bernd Fehling<bernd.fehl...@uni-bielefeld.de>
Subject: Re: query synonym expansion howto?
To: solr-user@lucene.apache.org
Date: Thursday, October 6, 2011, 4:41 PM
OK, I have changed my synonyms_test.txt:
philosophie, philosophy, filosofia

So there are no multi-word synonyms any more, but it is still not
working.
Also, even when setting qs=0 I still get a query slop.


search for "philosophie" -->  13 hits
search for "philosophy"  -->  21 hits
search for "filosofia"   -->  51 hits

search for "philosophy" with synonym expansion -->  0 hits.

<str name="q">textth:philosophy</str>
</lst>
</lst>
<result name="response" numFound="0" start="0" maxScore="0.0"/>
<lst name="debug">
<str name="rawquerystring">textth:philosophy</str>
<str name="querystring">textth:philosophy</str>
<str name="parsedquery">
+((textth:philosophie textth:philosophy textth:filosofia)~3)
</str>
<str name="parsedquery_toString">
+((textth:philosophie textth:philosophy textth:filosofia)~3)
</str>
<lst name="explain"/>
<str name="QParser">ExtendedDismaxQParser</str>


org.apache.solr.analysis.SynonymFilterFactory
{tokenizerFactory=solr.WhitespaceTokenizerFactory,
synonyms=synonyms_test.txt, expand=true, format=solr,
ignoreCase=true, luceneMatchVersion=LUCENE_35}

position     1
term text    philosophie  philosophy  filosofia
type         SYNONYM      SYNONYM     SYNONYM
startOffset  0            0           0
endOffset    10           10          10


Very strange.
Anything else to try?

Regards
Bernd


On 06.10.2011 13:58, Ahmet Arslan wrote:
Query-time synonym expansion has problems with multi-word synonyms.
The query parser splits the query string on whitespace before the
query string reaches the analysis chain.
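
The effect of splitting before analysis can be sketched in a few lines of
Python. This is only a toy model under stated assumptions, not Solr code:
the SYNONYMS table and the functions expand, parse_then_analyze and
analyze_whole are hypothetical names invented for the illustration.

```python
# Toy model, not Solr code: the table and all function names are hypothetical.
SYNONYMS = {
    "adult education": ["erwachsenenbildung", "educación de adultos"],
}

def expand(text):
    """Return the text plus any synonyms registered for exactly this text."""
    return [text] + SYNONYMS.get(text.lower(), [])

def parse_then_analyze(query):
    """Mimic a parser that splits on whitespace before analysis runs."""
    return [expand(chunk) for chunk in query.split()]

def analyze_whole(query):
    """Mimic analysis that sees the complete query string."""
    return expand(query)

split_result = parse_then_analyze("adult education")
whole_result = analyze_whole("adult education")
print(split_result)  # each word is analyzed alone, so the two-word entry never matches
print(whole_result)  # the multi-word entry matches and is expanded
```

In the split case the synonym step only ever sees "adult" and "education"
separately, which is why the multi-word entry cannot fire.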


This is a known limitation, explained here:

http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.SynonymFilterFactory

But I think using synonyms at index time has its problems as well.
E.g. you need to re-index if you add, remove or edit entries in the
synonym list. For some systems re-indexing takes a lot of time.

I am wondering if a "query expansion module" that injects synonyms
into the initial query string (before the analysis chain) would make
sense.
E.g. if the query string contains 'adult education', it will add the
phrase "educación de adultos" as an injected optional clause.
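
A rough sketch of what such a module might do, assuming a plain
in-memory synonym table and simple substring matching. Nothing here is
an existing Solr component; SYNONYMS and inject_synonyms are
hypothetical names used only for illustration.

```python
# Hypothetical pre-analysis expansion step; not an existing Solr API.
SYNONYMS = {
    "adult education": ["erwachsenenbildung", "educación de adultos"],
}

def inject_synonyms(query):
    """Append each synonym of a phrase found in the query as an optional quoted clause."""
    lowered = query.lower()
    clauses = [query]
    for phrase, alternatives in SYNONYMS.items():
        # Assumption: naive substring matching is enough for the illustration.
        if phrase in lowered:
            clauses.extend('"%s"' % alt for alt in alternatives)
    return " OR ".join(clauses)

expanded = inject_synonyms('"adult education"')
print(expanded)
```

Because the synonyms are added as quoted phrases before parsing, the
whitespace split in the query parser no longer breaks them apart.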

About query slop: since you are using the (e)dismax query parser, it
is controlled via the qs parameter.

http://wiki.apache.org/solr/DisMaxQParserPlugin#qs_.28Query_Phrase_Slop.29


Has anyone managed to get query-time synonym expansion working?

Synonym expansion itself is working, but I get no search results.

synonyms_test.txt
erwachsenenbildung, adult education, educación de adultos, éducation des adultes

search for "erwachsenenbildung"    -->   8 hits
search for "adult education"       -->  13 hits
search for "educación de adultos"  -->   3 hits

search for "adult education" with synonym expansion -->  0 hits.

RESULT:
-------
<str name="q">textth:"adult education"</str>
<str name="q.op">OR</str>

<result name="response" numFound="0" start="0" maxScore="0.0"/>
<lst name="debug">
<str name="rawquerystring">textth:"adult education"</str>
<str name="querystring">textth:"adult education"</str>
<str name="parsedquery">
+((textth:erwachsenenbildung textth:adult education textth:educación de adultos textth:éducation des adultes)~4)
</str>
<str name="parsedquery_toString">
+((textth:erwachsenenbildung textth:adult education textth:educación de adultos textth:éducation des adultes)~4)
</str>
<lst name="explain"/>
<str name="QParser">ExtendedDismaxQParser</str>


Can it be that the "q.op=OR" parameter is ignored?

Why is a slop of ~4 added to the parsedquery?

Regards,
Bernd




--
*************************************************************
Bernd Fehling                Universitätsbibliothek Bielefeld
Dipl.-Inform. (FH)                        Universitätsstr. 25
Tel. +49 521 106-4060                   Fax. +49 521 106-4052
bernd.fehl...@uni-bielefeld.de                33615 Bielefeld

BASE - Bielefeld Academic Search Engine - www.base-search.net
*************************************************************


