Re: can solr automatically search for different punctuation of a word

Chantal Ackermann Wed, 01 Feb 2012 06:07:15 -0800

Hi Alex,

The <dependency> tag is used in the Maven project file (pom.xml). If you
are not using Maven to build your project, simply skip that part.


The important thing is that the ICU jar (lucene-icu) and the analysis
extras jar (solr-analysis-extras) are on your classpath.

See also Erick's answer in response to your question. The folder for
additional jar files in Solr is:

${SOLR_HOME}/lib/
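
If copying the jars there is inconvenient, a possible alternative is a <lib>
directive in solrconfig.xml. A minimal sketch, assuming the jars are still in
the contrib and dist folders of an unpacked Solr download (the relative paths
and the jar name pattern below are assumptions, adjust them to your
installation):

<!-- assumed paths, resolved relative to the core's instanceDir -->
<lib dir="../../contrib/analysis-extras/lib" />
<lib dir="../../contrib/analysis-extras/lucene-libs" />
<!-- assumed jar naming; the regex is matched against file names in dir -->
<lib dir="../../dist/" regex=".*analysis-extras.*\.jar" />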

Cheers,
Chantal

On Tue, 2012-01-31 at 04:38 +0100, alx...@aim.com wrote:
> Hi Chantal,
> 
> In the README file at solr/contrib/analysis-extras/README.txt it says to add
> the ICU library (in lib/).
> 
> Do I also need to add <dependency>..., and where?
> 
> Thanks.
> Alex.
> 
>  
> 
> 
> 
> -----Original Message-----
> From: Chantal Ackermann <chantal.ackerm...@btelligent.de>
> To: solr-user <solr-user@lucene.apache.org>
> Sent: Fri, Jan 13, 2012 1:52 am
> Subject: Re: can solr automatically search for different punctuation of a word
> 
> 
> Hi Alex,
> 
> for me, ICUFoldingFilterFactory works very well. It does lowercasing and
> removes diacritics (this is what umlauts and accents on letters are
> called; punctuation means commas, periods, etc.). It will work for any
> language, not only German. And it will also handle apostrophes as in
> "C'est bien".
> 
> ICU requires additional libraries on the classpath. For a built-in Solr
> solution, have a look at ASCIIFoldingFilterFactory.
> 
> 
> 
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ASCIIFoldingFilterFactory
> http://wiki.apache.org/solr/AnalyzersTokenizersTokenFilters#solr.ICUFoldingFilterFactory
> 
> Example configuration:
> 
> <fieldType name="text_sort" class="solr.TextField"
>     positionIncrementGap="100">
>   <analyzer>
>     <tokenizer class="solr.KeywordTokenizerFactory" />
>     <filter class="solr.ICUFoldingFilterFactory" />
>   </analyzer>
> </fieldType>
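> 
> For comparison, a minimal sketch of the built-in variant (the field type
> name text_sort_ascii is made up for illustration). ASCIIFoldingFilterFactory
> does not lowercase by itself, so LowerCaseFilterFactory is added here:
> 
> <fieldType name="text_sort_ascii" class="solr.TextField"
>     positionIncrementGap="100">
>   <analyzer>
>     <tokenizer class="solr.KeywordTokenizerFactory" />
>     <!-- lowercasing is a separate step in this variant -->
>     <filter class="solr.LowerCaseFilterFactory" />
>     <filter class="solr.ASCIIFoldingFilterFactory" />
>   </analyzer>
> </fieldType>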
> 
> 
> 
> And dependencies (example for Maven) in addition to solr-core:
> 
> <dependency>
>   <groupId>org.apache.lucene</groupId>
>   <artifactId>lucene-icu</artifactId>
>   <version>${solr.version}</version>
>   <scope>runtime</scope>
> </dependency>
> <dependency>
>   <groupId>org.apache.solr</groupId>
>   <artifactId>solr-analysis-extras</artifactId>
>   <version>${solr.version}</version>
>   <scope>runtime</scope>
> </dependency>
> 
> 
> 
> Cheers,
> Chantal
> 
> 
> 
> On Fri, 2012-01-13 at 00:09 +0100, alx...@aim.com wrote:
> > Hello,
> > 
> > I would like to know if Solr has functionality to automatically search
> > for a different punctuation of a word.
> > For example, if a user searches for the word Uber and the stemmer is set
> > to German, then Solr looks for both Uber and Über, like with synonyms.
> > 
> > Is it possible to give Solr a file with a list of possible substitutions
> > of letters and have it search for all possible punctuations?
> > 
> > Thanks.
> > Alex.
