Stempel (http://www.getopt.org/stempel/) provides a Lucene
implementation of an algorithmic stemmer for the Polish language. All you
have to do is implement an appropriate factory for Solr, as described
in http://www.ibm.com/developerworks/library/j-solr2/index.html#analyzers.
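Once such a factory exists, wiring it into schema.xml might look roughly
like the sketch below. Note that com.example.solr.StempelFilterFactory is
a made-up name for the class you would write yourself (it ships with
neither Solr nor Stempel), and the text_pl field type name is likewise
just an example.

```xml
<!-- Sketch only: com.example.solr.StempelFilterFactory is a hypothetical
     name for a factory you write yourself to wrap Stempel's stemming
     filter; it is not part of Solr or Stempel. -->
<fieldType name="text_pl" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <!-- Lowercase before stemming so the stemmer sees normalized tokens. -->
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="com.example.solr.StempelFilterFactory"/>
  </analyzer>
</fieldType>
```

The jar containing the factory, together with the Stempel jar, would also
need to be on Solr's classpath.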
Message written on 2008-09-12, at 11:26, by sunnyfr:
Hi everybody,
I'm now working on a Solr implementation for a multi-language website.
I've found that Solr handles a lot of languages, such as Japanese, Greek,
Spanish, and so on.
But I didn't find anything about Polish or Turkish. Can you help me,
please?
<fieldType name="text_ja" class="solr.TextField">
  <analyzer class="org.apache.lucene.analysis.cjk.CJKAnalyzer"/>
</fieldType>
OR
<fieldType name="text_es" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.StandardFilterFactory"/>
    <filter class="solr.ISOLatin1AccentFilterFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
  </analyzer>
</fieldType>
Thanks for your help.
I'm also interested in the Turkish language.
Have a nice day,
Sunny
--
We read Knuth so you don't have to. - Tim Peters
Jarek Zgoda, R&D, Redefine
[EMAIL PROTECTED]