Stempel (http://www.getopt.org/stempel/) provides a Lucene implementation of an algorithmic stemmer for the Polish language. All you have to do is implement an appropriate factory for Solr, as described in http://www.ibm.com/developerworks/library/j-solr2/index.html#analyzers .
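
Once such a factory is on the classpath, wiring it into schema.xml might look roughly like the sketch below. The field type name text_pl is made up for illustration, and solr.StempelPolishStemFilterFactory is an assumption: later Solr releases ship a factory under that name in the analysis-extras contrib, but if you write the factory yourself you would put your own class name there instead.

```xml
<!-- Sketch only: assumes the Stempel jars and a matching filter factory
     are available on Solr's classpath. -->
<fieldType name="text_pl" class="solr.TextField" positionIncrementGap="100">
  <analyzer>
    <tokenizer class="solr.StandardTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <!-- Stempel-based Polish stemmer; replace the class name with your own
         factory if your Solr version does not ship this one. -->
    <filter class="solr.StempelPolishStemFilterFactory"/>
  </analyzer>
</fieldType>
```

Lowercasing comes before the stemmer here so that the stemmer sees normalized tokens; the rest of the chain follows the same pattern as the Spanish field type quoted below.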

Message written on 2008-09-12, at 11:26, by sunnyfr:


Hi everybody,

I'm currently working on a Solr implementation for a multi-language website.
I've found that Solr handles a lot of languages, such as Japanese, Greek,
Spanish ...
But I didn't find anything about Polish/Turkish. Can you help me, please?

   <fieldType name="text_ja" class="solr.TextField">
     <analyzer class="org.apache.lucene.analysis.cjk.CJKAnalyzer"/>
   </fieldType>
OR
   <fieldtype name="text_es" class="solr.TextField">
    <analyzer>
       <tokenizer class="solr.StandardTokenizerFactory"/>
       <filter class="solr.StandardFilterFactory"/>
       <filter class="solr.ISOLatin1AccentFilterFactory"/>
       <filter class="solr.LowerCaseFilterFactory"/>
       <filter class="solr.SnowballPorterFilterFactory" language="Spanish"/>
     </analyzer>
   </fieldtype>

Thanks for your help.

I'm interested in the Turkish language as well.

Wish you a nice day,
Sunny
--
View this message in context: 
http://www.nabble.com/Polish-Turkish-stemming-schema.xml-Click-to-flag-this-post-tp19452498p19452498.html
Sent from the Solr - User mailing list archive at Nabble.com.


--
We read Knuth so you don't have to. - Tim Peters

Jarek Zgoda, R&D, Redefine
[EMAIL PROTECTED]
