Thanks Erick.
Our first idea was to remove numbers with a pattern filter.
Setting inject to false will have the "same" effect.
If we want to be able to search for numbers in the content, this solution
will not work, but adding another field without phonetic filtering and
searching in both fields would be OK, right?
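
A minimal sketch of that two-field setup, to make the idea concrete. The
names text_plain, text_phonetic, content and content_phonetic are made up
for illustration, and the filter chains are shortened versions of the one
quoted below. The number-stripping step (PatternReplaceFilterFactory plus
LengthFilterFactory) is an assumption on my side and should be checked on
the analysis page:

```xml
<!-- Hypothetical type: stemming only, so numbers like "13" stay searchable. -->
<fieldType name="text_plain" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
  </analyzer>
</fieldType>

<!-- Hypothetical type: phonetic codes only (inject="false"). -->
<fieldType name="text_phonetic" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Assumption: blank out purely numeric tokens, then drop the empty ones. -->
    <filter class="solr.PatternReplaceFilterFactory"
            pattern="^[0-9]+$" replacement="" replace="all"/>
    <filter class="solr.LengthFilterFactory" min="1" max="255"/>
    <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
    <filter class="solr.PhoneticFilterFactory"
            encoder="ColognePhonetic" inject="false"/>
    <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
  </analyzer>
</fieldType>

<!-- Hypothetical field names; the phonetic copy does not need to be stored. -->
<field name="content" type="text_plain" indexed="true" stored="true"/>
<field name="content_phonetic" type="text_phonetic" indexed="true" stored="false"/>
<copyField source="content" dest="content_phonetic"/>
```

Searching both fields could then be done with, for example, the dismax
parser and qf=content content_phonetic, so that "13" is found through the
plain field while "puf" is only compared against phonetic codes.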

Dirk
On 07.02.2012 at 14:01, "Erick Erickson" <erickerick...@gmail.com> wrote:

> What happens if you do NOT inject? Setting inject="false"
> stores only the phonetic reduction, not the original text. In that
> case your false match on "13" would go away....
>
> Not sure what that means for the rest of your app though.
>
> Best
> Erick
>
> On Mon, Feb 6, 2012 at 5:44 AM, Dirk Högemann
> <dirk.hoegem...@googlemail.com> wrote:
> > Hi,
> >
> > I have a question on phonetic search and matching in Solr.
> > In our application all the content of an article is written to a full-text
> > search field, which provides stemming and a phonetic filter (Cologne
> > phonetic for German).
> > This is the relevant part of the configuration for the index analyzer
> > (search is analogous):
> >
> >        <tokenizer class="solr.StandardTokenizerFactory"/>
> >        <filter class="solr.WordDelimiterFilterFactory"
> >                generateWordParts="1" generateNumberParts="1" catenateWords="0"
> >                catenateNumbers="0" catenateAll="0" splitOnCaseChange="0"/>
> >        <filter class="solr.LowerCaseFilterFactory"/>
> >        <filter class="solr.SnowballPorterFilterFactory" language="German2"/>
> >        <filter class="solr.PhoneticFilterFactory"
> >                encoder="ColognePhonetic" inject="true"/>
> >        <filter class="solr.RemoveDuplicatesTokenFilterFactory"/>
> >
> > Unfortunately, this sometimes results in strange, though explainable,
> > matches.
> > For example:
> >
> > The content field indexes the following string: Donnerstag von 13 bis 17 Uhr.
> >
> > This results in a match if we search for "puf", because the phonetic
> > filter encodes "puf" as 13.
> > (As a consequence, the 13 is then also highlighted.)
> >
> > Does anyone have an idea how to handle this in a reasonable way, so that a
> > search for "puf" does not match 13 in the content?
> >
> > Thanks in advance!
> >
> > Dirk
>
