Polish Stemmer
Hi,

I'm developing a multi-language Solr index with a separate core for each language. I use SnowballPorterFilterFactory for German, French and Italian with excellent results.

My problem appears when I try to create a Polish stemmed index. There isn't a Snowball implementation for Polish, but I found a Lucene one:

http://www.getopt.org/stempel/index.html#distrib

I put the jar into the Solr lib folder and added the filter to the appropriate fieldtype (<filter class="org.getopt.stempel.lucene.StempelFilter" />), but when I start the server this error appears:

GRAVE: org.apache.solr.common.SolrException: Error instantiating class: 'org.getopt.stempel.lucene.StempelFilter'

Has anybody seen this error before? Other solutions for Polish stemming would be great too.

Thanks in advance.
Re: Polish Stemmer
Thanks very much! I am still quite new to Solr and had assumed I could use the filter class directly. I did what you said and it seems to work perfectly:

    import org.apache.lucene.analysis.TokenStream;
    import org.apache.solr.analysis.BaseTokenFilterFactory;
    import org.getopt.stempel.lucene.StempelFilter;

    public class PolishStemFilterFactory extends BaseTokenFilterFactory {
        // Wraps the incoming token stream in the Stempel stemming filter.
        public StempelFilter create(TokenStream in) {
            return new StempelFilter(in);
        }
    }

Thank you very much, Shalin!

2009/9/2 Shalin Shekhar Mangar

> On Wed, Sep 2, 2009 at 8:10 PM, David Espinosa wrote:
>
> > My problem appears when I try to create a Polish stemmed index. There isn't
> > a Snowball implementation for Polish, but I found a lucene one:
> >
> > http://www.getopt.org/stempel/index.html#distrib
> >
> > I included the jar into Solr lib folder and included the filter into the
> > appropriate fieldtype (<filter class="org.getopt.stempel.lucene.StempelFilter" />)
> > but when I run the server this error appears:
> >
> > GRAVE: org.apache.solr.common.SolrException: Error instantiating class:
> > 'org.getopt.stempel.lucene.StempelFilter'
>
> You'd need to create a factory class which implements Solr's
> TokenFilterFactory or extends BaseTokenFilterFactory which creates the
> StempelFilter. Then specify the factory class in schema.xml
>
> --
> Regards,
> Shalin Shekhar Mangar.
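For readers wondering why naming the filter class directly fails while a factory works, here is a minimal, self-contained sketch of the general pattern. It assumes the loader is given only a class name and builds the object through a no-argument constructor, which a filter that must wrap an input stream cannot offer. Every type in it (StemFilter, FilterFactory, StemFilterFactory, load) is a hypothetical stand-in written for this illustration, not Solr's or Stempel's actual API, and the "stemming" is just lowercasing.

```java
import java.util.Locale;

public class FactoryLoadingSketch {

    // Hypothetical stand-in for a filter: it can only be built around an input.
    public static class StemFilter {
        private final String input;

        public StemFilter(String input) {
            this.input = input;
        }

        // Placeholder "stemming": lowercases the input (not a real stemmer).
        public String output() {
            return input.toLowerCase(Locale.ROOT);
        }
    }

    // Hypothetical stand-in for the factory contract the loader expects.
    public interface FilterFactory {
        StemFilter create(String input);
    }

    // The factory has a no-argument constructor, so it can be created by name.
    public static class StemFilterFactory implements FilterFactory {
        public StemFilterFactory() {
        }

        @Override
        public StemFilter create(String input) {
            return new StemFilter(input);
        }
    }

    // Mimics a loader that knows only the configured class name.
    public static FilterFactory load(String className) {
        try {
            Class<?> clazz = Class.forName(className);
            return (FilterFactory) clazz.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException e) {
            throw new IllegalStateException(
                    "Error instantiating class: '" + className + "'", e);
        }
    }

    public static void main(String[] args) {
        // Loading the factory works, and the factory then builds the filter.
        FilterFactory factory = load(StemFilterFactory.class.getName());
        System.out.println(factory.create("Token").output());

        // Loading the filter itself fails: it has no no-argument constructor.
        try {
            load(StemFilter.class.getName());
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

In this sketch the second call to load fails for the same structural reason described in the reply above: the configured class has to be something the loader can construct on its own, and the factory is the piece that knows how to wrap an input in the filter.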
Highlighting in stemmed or n-grammed fields possible?
Hi,

Does anybody know how to get highlighting when the q term matches in a stemmed or n-grammed field?

When the match is in a normal field (neither stemmed nor n-grammed), highlighting works perfectly, as expected. But when the match is in a stemmed field, no highlighted fields are returned, and when it is in an n-grammed field, the highlighted field is returned but in the wrong order (example: if q="solr" matches "here is solr", the result is "here is solr"). All fields are stored (and indexed as well).

Thanks in advance.