I don't know if the Stempel jar includes the Java source. At this point I think you should ask the author of Stempel to make a Solr front end for it. It would be very simple for him.

Jakub Godawa wrote:
Am I not doing that in step 4? I am compiling the whole folder
that was extracted before, but now with the new class file.

2010/11/12 Lance Norskog <goks...@gmail.com>:
I think you have to compile all of the stempel source including your
filter factory into one jar at the same time. Everybody does this; I
don't know how different Java versions make class file binaries.

On Thu, Nov 11, 2010 at 3:06 AM, Jakub Godawa <jakub.god...@gmail.com> wrote:
Hi! Sorry for such a break, but I was moving house... anyway:

1. I took the
~/apache-solr/src/java/org/apache/solr/analysis/StandardFilterFactory.java
file and modified it in Vim (saving it as StempelFilterFactory.java)
this way:

package org.getopt.solr.analysis;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardFilter;

public class StempelTokenFilterFactory extends BaseTokenFilterFactory {
  public StempelFilter create(TokenStream input) {
    return new StempelFilter(input);
  }
}

2. Then I put the file into the extracted stempel-1.0.jar tree, under
./org/getopt/solr/analysis/
3. Then I created a class from it:
   jar -cf StempelTokenFilterFactory.class StempelFilterFactory.java
4. Then I created a new stempel-1.0.jar archive:
   jar -cf stempel-1.0.jar -C ./stempel-1.0/ .
5. Then in schema.xml I've put:

    <fieldType name="text_pl" class="solr.TextField">
      <analyzer>
        <tokenizer class="solr.WhitespaceTokenizerFactory"/>
        <filter class="solr.LowerCaseFilterFactory"/>
        <filter class="org.getopt.solr.analysis.StempelTokenFilterFactory" />
      </analyzer>
    </fieldType>

6. I started the Solr server and I received the following error:

2010-11-11 11:50:56 org.apache.solr.common.SolrException log
SEVERE: java.lang.ClassFormatError: Incompatible magic value 1347093252 in class file org/getopt/solr/analysis/StempelTokenFilterFactory
        at java.lang.ClassLoader.defineClass1(Native Method)
        at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
        at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
...

Question: What is wrong? :) I use "jar (fastjar) 0.98" to create jars.
I googled that error, but no answer gave me an idea of what is wrong
in my .java file.
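
A note on the magic value in the error above: 1347093252 is 0x504B0304, the bytes 'P', 'K', 3, 4 that begin a zip archive, whereas a compiled class file begins with 0xCAFEBABE. That suggests the file named StempelTokenFilterFactory.class is really an archive produced by "jar -cf" in step 3 rather than compiler output. The jar tool only packages files; compiling is done by javac. A minimal sketch that decodes the number (the class and method names here are made up for illustration):

```java
final class MagicValue {
    /** First four bytes of every valid Java class file. */
    public static final int CLASS_MAGIC = 0xCAFEBABE;
    /** First four bytes of a zip local file header: 'P', 'K', 0x03, 0x04. */
    public static final int ZIP_MAGIC = 0x504B0304;

    /** Names the kind of file that starts with the given big-endian 32-bit value. */
    public static String describe(int magic) {
        if (magic == CLASS_MAGIC) return "class file";
        if (magic == ZIP_MAGIC) return "zip/jar archive";
        return "unknown";
    }

    /** Renders the magic value as eight hex digits. */
    public static String hex(int magic) {
        return String.format("%08X", magic);
    }

    public static void main(String[] args) {
        int reported = 1347093252; // value quoted in the ClassFormatError
        System.out.println(hex(reported) + " -> " + describe(reported));
        // prints "504B0304 -> zip/jar archive"
    }
}
```

So the class loader is not complaining about the contents of the .java file; it is being handed an archive where it expects compiled bytecode.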

Please help, as I believe I am close to the end of that subject.

Cheers,
Jakub Godawa.

2010/11/3 Lance Norskog <goks...@gmail.com>:
Here's the problem: Solr is a little dumb about these Filter classes,
and so you have to make a Factory object for the Stempel Filter.

There are a lot of other FilterFactory classes. You would have to just
copy one and change the names to Stempel and it might actually work.

This will take some Solr programming- perhaps the author can help you?
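
To illustrate why a factory is needed at all: a filter takes its input stream as a constructor argument, so it cannot be created from a class name alone, while a factory with a no-argument constructor can be instantiated from the name in a config file and then asked to wrap each stream. The sketch below uses stand-in types (Tokens, FilterFactory, LowerCaseFactory are all hypothetical and are NOT the Solr or Lucene classes); it only shows the general pattern, not Solr's actual loading code.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.Locale;

final class FactoryDemo {
    /** Stand-in for a token stream: returns the next token, or null at the end. */
    public interface Tokens { String next(); }

    /** Stand-in for a filter factory: wraps an input stream in a filter. */
    public interface FilterFactory { Tokens create(Tokens input); }

    /** A filter needs its input up front, so it has no no-argument constructor. */
    public static final class LowerCaseTokens implements Tokens {
        private final Tokens input;
        public LowerCaseTokens(Tokens input) { this.input = input; }
        public String next() {
            String token = input.next();
            return token == null ? null : token.toLowerCase(Locale.ROOT);
        }
    }

    /** The factory can be created reflectively because its constructor takes no arguments. */
    public static final class LowerCaseFactory implements FilterFactory {
        public LowerCaseFactory() {}
        public Tokens create(Tokens input) { return new LowerCaseTokens(input); }
    }

    /** Mimics a loader that only knows a class name read from a config file. */
    public static FilterFactory load(String className) {
        try {
            return (FilterFactory) Class.forName(className).getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException("Error loading class '" + className + "'", e);
        }
    }

    /** Builds a simple token source from fixed words. */
    public static Tokens of(String... words) {
        Iterator<String> it = Arrays.asList(words).iterator();
        return () -> it.hasNext() ? it.next() : null;
    }

    public static void main(String[] args) {
        FilterFactory factory = load(LowerCaseFactory.class.getName());
        Tokens out = factory.create(of("Ala", "MA", "kota"));
        for (String t = out.next(); t != null; t = out.next()) {
            System.out.println(t);
        }
    }
}
```

Pointing the loader at the filter class itself instead of the factory fails in this sketch too, since the filter has no no-argument constructor and does not implement the factory interface.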

On Tue, Nov 2, 2010 at 7:08 AM, Jakub Godawa <jakub.god...@gmail.com> wrote:
Sorry, I am not a Java programmer at all. I would appreciate more
verbose (or step-by-step) help.

2010/11/2 Bernd Fehling <bernd.fehl...@uni-bielefeld.de>:
So you call org.getopt.solr.analysis.StempelTokenFilterFactory.
In this case I would assume a file StempelTokenFilterFactory.class
in your directory org/getopt/solr/analysis/.

And a class which extends the BaseTokenFilterFactory, right?
...
public class StempelTokenFilterFactory extends BaseTokenFilterFactory 
implements ResourceLoaderAware {
...
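
The mapping from class name to file is mechanical: dots become slashes and ".class" is appended, and that is the entry the class loader looks for inside the jar. A small sketch for checking a jar this way (JarEntryCheck and its methods are hypothetical helper names, not part of Solr or Stempel):

```java
import java.io.File;
import java.io.IOException;
import java.util.jar.JarFile;

final class JarEntryCheck {
    /** Maps a fully qualified class name to the entry path looked up inside a jar. */
    public static String entryPath(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** Returns true if the given jar contains an entry for the class. */
    public static boolean containsClass(File jar, String className) throws IOException {
        try (JarFile jarFile = new JarFile(jar)) {
            return jarFile.getEntry(entryPath(className)) != null;
        }
    }

    public static void main(String[] args) throws IOException {
        // Expects two arguments: the jar path and the fully qualified class name.
        if (args.length != 2) {
            System.err.println("usage: JarEntryCheck <jar> <class name>");
            return;
        }
        boolean found = containsClass(new File(args[0]), args[1]);
        System.out.println(entryPath(args[1]) + (found ? " found" : " missing"));
    }
}
```

Run against stempel-1.0.jar with org.getopt.solr.analysis.StempelTokenFilterFactory as the class name, it would report whether that entry exists. It says nothing about whether the entry is valid compiled bytecode.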



On 02.11.2010 14:20, Jakub Godawa wrote:
This is what stempel-1.0.jar consists of after jar -xf:

jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R org/
org/:
egothor  getopt

org/egothor:
stemmer

org/egothor/stemmer:
Cell.class     Diff.class    Gener.class  MultiTrie2.class
Optimizer2.class  Reduce.class        Row.class    TestAll.class
TestLoad.class  Trie$StrEnum.class
Compile.class  DiffIt.class  Lift.class   MultiTrie.class
Optimizer.class   Reduce$Remap.class  Stock.class  Test.class
Trie.class

org/getopt:
stempel

org/getopt/stempel:
Benchmark.class  lucene  Stemmer.class

org/getopt/stempel/lucene:
StempelAnalyzer.class  StempelFilter.class
jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R META-INF/
META-INF/:
MANIFEST.MF
jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R res
res:
tables

res/tables:
readme.txt  stemmer_1000.out  stemmer_100.out  stemmer_2000.out
stemmer_200.out  stemmer_500.out  stemmer_700.out

2010/11/2 Bernd Fehling <bernd.fehl...@uni-bielefeld.de>:
Hi Jakub,

if you unzip your stempel-1.0.jar do you have the
required directory structure and file in there?
org/getopt/stempel/lucene/StempelFilter.class

Regards,
Bernd

On 02.11.2010 13:54, Jakub Godawa wrote:
Erick, I've put the jar files there like that before. I also added the
directive and put the file in instanceDir/lib

The remaining problem is that even though the files are loaded:
2010-11-02 13:20:48 org.apache.solr.core.SolrResourceLoader replaceClassLoader
INFO: Adding 'file:/home/jgodawa/apache-solr-1.4.1/ifaq/lib/stempel-1.0.jar'
to classloader

I am not able to use the FilterFactory... maybe I am going about it
the wrong way?

Cheers,
Jakub Godawa.

2010/11/2 Erick Erickson <erickerick...@gmail.com>:
The Polish stemmer jar file needs to be findable by Solr. If you copy
it to <solr_home>/lib and restart Solr you should be set.

Alternatively, you can add another <lib> directive to the solrconfig.xml
file (there are several examples in that file already).

I'm a little confused about not being able to find TokenFilter; is that
still a problem?

HTH
Erick

On Tue, Nov 2, 2010 at 8:07 AM, Jakub Godawa <jakub.god...@gmail.com> wrote:

Thank you Bernd! I couldn't make it run though. Here is my problem:

1. There is a file ~/apache-solr-1.4.1/ifaq/lib/stempel-1.0.jar
2. In ~/apache-solr-1.4.1/ifaq/solr/conf/solrconfig.xml there is a
directive: <lib path="../lib/stempel-1.0.jar" />
3. In ~/apache-solr-1.4.1/ifaq/solr/conf/schema.xml there is fieldType:

(...)
  <!-- Polish -->
  <fieldType name="text_pl" class="solr.TextField">
    <analyzer>
      <tokenizer class="solr.WhitespaceTokenizerFactory"/>
      <filter class="solr.LowerCaseFilterFactory"/>
      <filter class="org.getopt.stempel.lucene.StempelFilter" />
      <!--<filter class="org.getopt.solr.analysis.StempelTokenFilterFactory"
                  protected="protwords.txt" />  -->
    </analyzer>
  </fieldType>
(...)

4. The jar file is loaded, but I got an error:
SEVERE: Could not start SOLR. Check solr/home property
java.lang.NoClassDefFoundError: org/apache/lucene/analysis/TokenFilter
      at java.lang.ClassLoader.defineClass1(Native Method)
      at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
      at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
(...)

5. A different class gave me this one:
SEVERE: org.apache.solr.common.SolrException: Error loading class 'org.getopt.solr.analysis.StempelTokenFilterFactory'
      at org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375)
      at org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:390)
(...)

The question is: how do I make <fieldType /> and <filter /> work with
Stempel? :)

Cheers,
Jakub Godawa.

2010/10/29 Bernd Fehling <bernd.fehl...@uni-bielefeld.de>:
Hi Jakub,

I have ported the KStemmer for use in the most recent Solr trunk version.
My stemmer is located in the lib directory of Solr
"solr/lib/KStemmer-2.00.jar"
because it belongs to Solr.

Write it as a FilterFactory and use it as a filter like this:
<filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
        protected="protwords.txt" />
This is what my fieldType looks like:

    <fieldType name="text_kstem" class="solr.TextField"
               positionIncrementGap="100">
      <analyzer type="index">
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt" enablePositionIncrements="false" />
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1"
                catenateWords="1" catenateNumbers="1"
                catenateAll="0" splitOnCaseChange="1" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
                protected="protwords.txt" />
        <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
      </analyzer>
      <analyzer type="query">
        <tokenizer class="solr.WhitespaceTokenizerFactory" />
        <filter class="solr.StopFilterFactory" ignoreCase="true"
                words="stopwords.txt" />
        <filter class="solr.WordDelimiterFilterFactory"
                generateWordParts="1" generateNumberParts="1"
                catenateWords="0" catenateNumbers="0"
                catenateAll="0" splitOnCaseChange="1" />
        <filter class="solr.LowerCaseFilterFactory" />
        <filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
                protected="protwords.txt" />
        <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
      </analyzer>
    </fieldType>

Regards,
Bernd



On 28.10.2010 14:56, Jakub Godawa wrote:
Hi!
There is a Polish stemmer http://www.getopt.org/stempel/ and I have
problems connecting it with Solr 1.4.1
Questions:

1. Where EXACTLY do I put the "stempel-1.0.jar" file?
2. How do I register the file, so I can build a fieldType like:

<fieldType name="text_pl" class="solr.TextField">
   <analyzer class="org.getopt.solr.analysis.StempelTokenFilterFactory"/>
</fieldType>

3. Is that the right approach to make it work?

Thanks in advance for a detailed explanation,
Jakub.


--
Lance Norskog
goks...@gmail.com


