Re: foreign characters equivalent in solr search

2009-02-19 Thread Chris Hostetter
: if a user searches for Tiesto which is indexed in this format Tiësto in our
: solr. we want solr also return result
This is what the ISOLatin1AccentFilter is for. It's been included in Solr since 1.1. It's been deprecated in favor of the newer ASCIIFoldingFilter which does a better job with
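To illustrate the idea behind accent folding (not the Lucene filter itself), here is a minimal Python sketch. It assumes Unicode NFD decomposition followed by dropping combining marks is an acceptable stand-in for the folding step; `fold_accents` is a hypothetical helper name, and the real filters cover cases this approach misses.

```python
import unicodedata


def fold_accents(text: str) -> str:
    """Rough stand-in for accent folding: decompose, then drop combining marks."""
    decomposed = unicodedata.normalize("NFD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


# Apply the same folding to the indexed term and to the query term,
# so that "Tiesto" and "Tiësto" end up as the same string.
indexed_term = fold_accents("Tiësto").lower()
query_term = fold_accents("Tiesto").lower()
print(indexed_term == query_term)
```

The point of the sketch is only that both sides (index and query) go through the same folding, which is what an analyzer configured for both index and query time would do.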

Re: foreign characters equivalent in solr search

2009-02-19 Thread AHMET ARSLAN
> we will try that and post the results here but it seems we
> may get problem with highlight function.
No, highlighting works fine with that. I am also using a similar filter for Turkish characters: I replace ç with c, ş with s, and so on at index time. Another (easier but less efficient) way to imple
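A character-for-character replacement like the one described above can be sketched as follows. Only the ç and ş pairs come from the message; the rest of the table and the name `fold_turkish` are assumptions added for illustration, and this is plain Python, not a Solr filter.

```python
# Hypothetical mapping table: only ç -> c and ş -> s are taken from the
# message above; the remaining pairs are assumed for illustration.
TURKISH_TO_ASCII = str.maketrans({
    "ç": "c", "Ç": "C",
    "ş": "s", "Ş": "S",
    "ğ": "g", "Ğ": "G",
    "ı": "i", "İ": "I",
    "ö": "o", "Ö": "O",
    "ü": "u", "Ü": "U",
})


def fold_turkish(text: str) -> str:
    """Replace each Turkish letter with a single ASCII letter."""
    return text.translate(TURKISH_TO_ASCII)


original = "çarşı"
folded = fold_turkish(original)
print(folded)
# The mapping is one-to-one, so the folded text has the same length
# as the original text.
print(len(folded) == len(original))
```

An explicit table is used here because a letter such as ı (dotless i) has no Unicode decomposition, so stripping combining marks alone would leave it unchanged. Since every replacement is a single character, character offsets computed on the folded text still point at the right place in the original text in this sketch.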

Re: foreign characters equivalent in solr search

2009-02-18 Thread radarghost
It may take too long for Solr 1.4; is there any other solution for Solr 1.2? Anyway, thanks for the reply.
Koji Sekiguchi-2 wrote:
> CharFilter will solve the problem, but it comes with Solr 1.4.
> https://issues.apache.org/jira/browse/SOLR-822
> Koji
> AHMET ARSLAN wrote:
>> I think best wa

Re: foreign characters equivalent in solr search

2009-02-18 Thread radarghost
Thanks, we will try that and post the results here, but it seems we may have a problem with the highlight function.
Ahmet Arslan wrote:
> I think best way to do this is to modify
> org.apache.lucene.index.memory.SynonymTokenFilter and employ this filter
> index time.
>
> if token.termBuffer() has o

Re: foreign characters equivalent in solr search

2009-02-18 Thread Koji Sekiguchi
CharFilter will solve the problem, but it comes with Solr 1.4.
https://issues.apache.org/jira/browse/SOLR-822
Koji
AHMET ARSLAN wrote:
I think best way to do this is to modify org.apache.lucene.index.memory.SynonymTokenFilter and employ this filter index time. if token.termBuffer() has one

Re: foreign characters equivalent in solr search

2009-02-18 Thread AHMET ARSLAN
I think the best way to do this is to modify org.apache.lucene.index.memory.SynonymTokenFilter and employ this filter at index time. If token.termBuffer() contains one of those characters (á, à, â, ä, ã, å), you replace it with its equivalent ASCII character (a). Then you inject this new Token as a S
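The approach described above (fold the accented characters, then add the folded token alongside the original) can be sketched outside Lucene like this. The `Token` type, the `fold` and `inject_folded` names, and the use of a position increment of 0 to stack the folded variant on the original are all assumptions made for illustration; this is not the SynonymTokenFilter API.

```python
import unicodedata
from typing import Iterable, Iterator, NamedTuple


class Token(NamedTuple):
    term: str
    position_increment: int  # 0 = same position as the previous token


def fold(term: str) -> str:
    """Drop combining marks after NFD decomposition (e.g. á, à, â, ä, ã, å -> a)."""
    decomposed = unicodedata.normalize("NFD", term)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))


def inject_folded(terms: Iterable[str]) -> Iterator[Token]:
    """Emit each term, plus its folded form at the same position when it differs."""
    for term in terms:
        yield Token(term, 1)
        folded = fold(term)
        if folded != term:
            # Assumed convention: increment 0 stacks the variant on the original.
            yield Token(folded, 0)


tokens = list(inject_folded(["dj", "tiësto", "live"]))
for token in tokens:
    print(token)
```

Keeping the original term and adding the folded one means both the accented and the unaccented spelling are indexed, at the cost of extra terms per accented word.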