Re: Configuring a custom Analyzer for the SynonymFilter

Raf Fri, 30 Sep 2016 08:13:04 -0700

Just to bring this thread to a conclusion: I have finally solved my issue by
creating a custom Analyzer class for use with the SynonymFilter.
It is not as "declarative" as I would have hoped, but at least it works :)
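
A minimal sketch of how the schema side of such a setup might look, reusing
the field type from the original question below. The class name
com.example.MySynonymAnalyzer is hypothetical (the actual class is not shown
in this thread); it stands for a custom Analyzer subclass that chains
MyTokenizer, the lowercase filter and MyFilter_1. As far as I can tell, the
class has to be on Solr's classpath and expose a public no-argument
constructor, since it is referenced only by name.

```xml
<fieldType name="myField_it" class="solr.TextField">
  <analyzer>
    <tokenizer class="MyTokenizer"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="MyFilter_1"/>
    <filter class="MyFilter_2"/>
    <!-- Hypothetical class name: a custom Analyzer subclass that is applied
         only to the entries of synonyms.txt, not to indexed or queried text. -->
    <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
            ignoreCase="true" expand="true"
            analyzer="com.example.MySynonymAnalyzer"/>
  </analyzer>
</fieldType>
```

With a configuration along these lines, a synonyms line such as
evento,festa,spettacolo would be passed through the custom analyzer when the
file is loaded, so the stored synonyms are already in lemma form.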


Greetings,
*Raf*

On Wed, Sep 28, 2016 at 9:26 AM, Raf <r.ventag...@gmail.com> wrote:

> On Wed, Sep 28, 2016 at 3:21 AM, Alexandre Rafalovitch <arafa...@gmail.com>
> wrote:
>
>> Before you go down this rabbit hole, are you actually sure this does
>> what you think it does?
>>
>> As far as I can tell, that parameter is for analyzing/parsing the
>> synonym entries in the synonym file. Not the incoming search queries
>> or text actually being indexed.
>
>
>
> Yes, this is exactly what I am looking for.
>
> I have already customized my indexing and query analyzer for that field,
> by using a custom filter that performs lemmatization for the Italian
> language.
> Hence, the tokens I have in my index (or in the parsed query) are something
> like evento_n (event -> noun) or mangiare_v (eat -> verb).
>
> Now I would like to define synonyms without having to know the "lemma"
> form.
>
> For example, I would like to have in my synonyms file:
> evento,festa,spettacolo
> and make the *SynonymFilter* analyzer transform them into
> *evento_n,festa_n,spettacolo_n*
>
> This way, a query like *myField:spettacoli* (the plural form of
> *spettacolo*) would be analyzed as *myField:(spettacolo_n evento_n
> festa_n)*.
>
>
>
>> Did you get it to work with the simpler configuration?
>>
>
> Yes, I carried out an experiment using the standard Lucene ItalianAnalyzer
> class (both at indexing and query time and for the SynonymFilter) and it
> works the way I was expecting. Unfortunately I cannot use this analyzer
> because I have to apply my custom lemmatization filter.
>
> Therefore, I am confident I can achieve my desired result by defining a
> custom Analyzer class, but I would have preferred to be able to alter the
> filter chain just modifying the *schema.xml* file.
>
> Is there an alternative way to achieve the same result I am not seeing?
>
> Thank you very much for your help.
>
>
> Bye,
> *Raf*
>
>
>
>>
>> Just double checking.
>>
>> Regards,
>>    Alex.
>> ----
>> Newsletter and resources for Solr beginners and intermediates:
>> http://www.solr-start.com/
>>
>>
>> On 27 September 2016 at 22:45, Raf <r.ventag...@gmail.com> wrote:
>> > On Tue, Sep 27, 2016 at 4:22 PM, Alexandre Rafalovitch <arafa...@gmail.com>
>> > wrote:
>> >
>> >> Looking at the code (on GitHub is easiest), it can take either
>> >> analyzer or tokenizer but definitely not any chain definitions. This
>> >> seems to be the same all the way to 6.2.1.
>> >>
>> >
>> > Thanks for your answer Alex.
>> >
>> > Does anyone know if there is a viable alternative that makes it configurable
>> > inside the schema.xml instead of defining a custom Java class?
>> >
>> > I was thinking about something like:
>> >
>> > * defining the *analyzer* outside of the *field* element, giving it a name:
>> > <analyzer name="myAnalyzer">
>> >    <tokenizer class="MyTokenizer" />
>> >    <filter class="solr.LowerCaseFilterFactory"/>
>> >    <filter class="MyFilter_1" />
>> > </analyzer>
>> >
>> > * referring to it inside the *SynonymFilter* definition by its name:
>> > <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>> > ignoreCase="true" expand="true" analyzer="myAnalyzer"/>
>> >
>> > Unfortunately I have not found anything like this inside the Solr
>> > documentation.
>> > Is it possible to achieve something like that, or is the only solution
>> > writing a custom Java class for each combination of filters I need to use
>> > for synonyms analysis?
>> >
>> > Thanks.
>> >
>> > Bye,
>> > *Raffaella*
>> >
>> >
>> > ----
>> >> Newsletter and resources for Solr beginners and intermediates:
>> >> http://www.solr-start.com/
>> >>
>> >>
>> >> On 27 September 2016 at 21:10, Raf <r.ventag...@gmail.com> wrote:
>> >> > Hi,
>> >> > is it possible to configure a custom analysis for synonyms the same way
>> >> > we do for index/query field analysis?
>> >> >
>> >> > Reading the *SynonymFilter* documentation[0], I have found I can specify
>> >> > a custom analyzer by writing its class name.
>> >> >
>> >> > Example:
>> >> > <fieldType name="myField_it" class="solr.TextField" >
>> >> >       <analyzer>
>> >> >         <tokenizer class="MyTokenizer" />
>> >> >         <filter class="solr.LowerCaseFilterFactory"/>
>> >> >         <filter class="MyFilter_1" />
>> >> >         <filter class="MyFilter_2" />
>> >> >         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>> >> >                 ignoreCase="true" expand="true"
>> >> >                 analyzer="org.apache.lucene.analysis.it.ItalianAnalyzer"/>
>> >> >       </analyzer>
>> >> >     </fieldType>
>> >> >
>> >> >
>> >> > What I would like to achieve, instead, is something like this:
>> >> > <fieldType name="myField_it" class="solr.TextField">
>> >> >       <analyzer>
>> >> >         <tokenizer class="MyTokenizer" />
>> >> >         <filter class="solr.LowerCaseFilterFactory"/>
>> >> >         <filter class="MyFilter_1" />
>> >> >         <filter class="MyFilter_2" />
>> >> >         <filter class="solr.SynonymFilterFactory" synonyms="synonyms.txt"
>> >> >                 ignoreCase="true" expand="true">
>> >> >           <analyzer>
>> >> >             <tokenizer class="MyTokenizer" />
>> >> >             <filter class="solr.LowerCaseFilterFactory"/>
>> >> >             <filter class="MyFilter_1" />
>> >> >           </analyzer>
>> >> >         </filter>
>> >> >       </analyzer>
>> >> >     </fieldType>
>> >> >
>> >> >
>> >> > I have tried to configure it this way, but it does not work.
>> >> > I do not get any configuration error, but the custom analyzer is not
>> >> > applied to synonyms.
>> >> >
>> >> > Is it possible to achieve this result by configuration or am I forced
>> >> > to write a custom Analyzer class?
>> >> >
>> >> > I am currently using Solr 5.2.1.
>> >> > At the moment I cannot upgrade to a newer version.
>> >> >
>> >> >
>> >> > Thank you very much for any help you can provide.
>> >> >
>> >> > Regards,
>> >> > *Raf*
>> >> >
>> >> >
>> >> > [0] http://archive.apache.org/dist/lucene/solr/ref-guide/apache-solr-ref-guide-5.2.pdf
>> >> >   p. 132
>> >>
>>
>
>
