Hi! Sorry for the long break, I was moving house. Anyway:
1. I took the file
~/apache-solr/src/java/org/apache/solr/analysis/StandardFilterFactory.java
and modified it in Vim (saving it as StempelFilterFactory.java) like
this:
package org.getopt.solr.analysis;

import org.apache.lucene.analysis.TokenStream;
import org.apache.lucene.analysis.standard.StandardFilter;

public class StempelTokenFilterFactory extends BaseTokenFilterFactory {
    public StempelFilter create(TokenStream input) {
        return new StempelFilter(input);
    }
}
2. Then I put the file into the extracted stempel-1.0.jar tree, under
./org/getopt/solr/analysis/
3. Then I created a class from it:
   jar -cf StempelTokenFilterFactory.class StempelFilterFactory.java
4. Then I created a new stempel-1.0.jar archive:
   jar -cf stempel-1.0.jar -C ./stempel-1.0/ .
5. Then I put this in schema.xml:
<fieldType name="text_pl" class="solr.TextField">
  <analyzer>
    <tokenizer class="solr.WhitespaceTokenizerFactory"/>
    <filter class="solr.LowerCaseFilterFactory"/>
    <filter class="org.getopt.solr.analysis.StempelTokenFilterFactory" />
  </analyzer>
</fieldType>
6. I started the Solr server and I received the following error:
2010-11-11 11:50:56 org.apache.solr.common.SolrException log
SEVERE: java.lang.ClassFormatError: Incompatible magic value 1347093252 in class file org/getopt/solr/analysis/StempelTokenFilterFactory
    at java.lang.ClassLoader.defineClass1(Native Method)
    at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
    at java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
...
Question: What is wrong? :) I use "jar (fastjar) 0.98" to create jars.
I googled the error, but none of the answers gave me an idea of what is
wrong in my .java file.
Please help, as I believe I am close to getting this working.
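
One way to read the "magic value" in the error: the class loader takes the first four bytes of a class file as a big-endian integer and expects 0xCAFEBABE. The sketch below is only an illustration, assuming the file produced in step 3 is an archive (which is what jar -cf writes) rather than compiled bytecode. It packs a source file into an in-memory ZIP and reads the first four bytes the same way. The class name MagicValueDemo and its helper methods are made up for this example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Hypothetical helper, not part of Solr or Stempel.
final class MagicValueDemo {
    /** Every compiled .class file starts with these four bytes. */
    static final int CLASS_MAGIC = 0xCAFEBABE;

    /** Packs one text file into an in-memory ZIP archive (the format jar tools write). */
    static byte[] archiveOf(String entryName, String content) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ZipOutputStream zip = new ZipOutputStream(bytes)) {
            zip.putNextEntry(new ZipEntry(entryName));
            zip.write(content.getBytes(StandardCharsets.UTF_8));
            zip.closeEntry();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the first four bytes as a big-endian int, like the class file magic. */
    static int magicOf(byte[] fileBytes) {
        return ByteBuffer.wrap(fileBytes).getInt();
    }

    public static void main(String[] args) {
        byte[] archive = archiveOf("StempelFilterFactory.java", "// Java source text");
        int magic = magicOf(archive);
        System.out.println("decimal: " + magic);
        System.out.println("hex:     " + Integer.toHexString(magic));
        System.out.println("is class file: " + (magic == CLASS_MAGIC));
    }
}
```

The decimal value printed for the archive is 1347093252 (hex 504b0304, the bytes "PK" 03 04 of a ZIP local file header), the same number as in the error. That would be consistent with the file being an archive of the .java source, not the output of a compiler such as javac.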
Cheers,
Jakub Godawa.
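
For checking a rebuilt jar before restarting Solr, something like the following could help. It is a rough sketch, not a Solr tool: it maps a class name such as the one used in schema.xml to the entry path the class loader looks for, then reports whether that entry exists in the jar and whether it begins with the class file magic 0xCAFEBABE. The name JarClassCheck and all of its methods are invented for illustration.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical diagnostic helper, not part of Solr or Stempel.
final class JarClassCheck {
    /** Converts a fully qualified class name to its entry path inside a jar. */
    static String entryNameFor(String className) {
        return className.replace('.', '/') + ".class";
    }

    /** True if the given leading bytes are CA FE BA BE. */
    static boolean startsWithClassMagic(byte[] head) {
        return head.length >= 4
                && (head[0] & 0xFF) == 0xCA && (head[1] & 0xFF) == 0xFE
                && (head[2] & 0xFF) == 0xBA && (head[3] & 0xFF) == 0xBE;
    }

    /** Looks up the class entry in the jar and describes what was found. */
    static String check(String jarPath, String className) {
        String entryName = entryNameFor(className);
        try (ZipFile jar = new ZipFile(jarPath)) {
            ZipEntry entry = jar.getEntry(entryName);
            if (entry == null) {
                return "missing: " + entryName;
            }
            byte[] head = new byte[4];
            try (InputStream in = jar.getInputStream(entry)) {
                int n = 0;
                while (n < head.length) {
                    int r = in.read(head, n, head.length - n);
                    if (r < 0) {
                        break; // entry shorter than four bytes
                    }
                    n += r;
                }
            }
            return (startsWithClassMagic(head) ? "looks compiled: " : "not a class file: ")
                    + entryName;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        if (args.length != 2) {
            System.out.println("usage: JarClassCheck <jar> <fully.qualified.ClassName>");
            return;
        }
        System.out.println(check(args[0], args[1]));
    }
}
```

It would be run with the jar path and the class name as the two arguments, for example stempel-1.0.jar and org.getopt.solr.analysis.StempelTokenFilterFactory.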
2010/11/3 Lance Norskog <[email protected]>:
> Here's the problem: Solr is a little dumb about these Filter classes,
> and so you have to make a Factory object for the Stempel Filter.
>
> There are a lot of other FilterFactory classes. You would have to just
> copy one and change the names to Stempel and it might actually work.
>
> This will take some Solr programming; perhaps the author can help you?
>
> On Tue, Nov 2, 2010 at 7:08 AM, Jakub Godawa <[email protected]> wrote:
>> Sorry, I am not a Java programmer at all. I would appreciate more
>> verbose (or step-by-step) help.
>>
>> 2010/11/2 Bernd Fehling <[email protected]>:
>>>
>>> So you call org.getopt.solr.analysis.StempelTokenFilterFactory.
>>> In this case I would assume a file StempelTokenFilterFactory.class
>>> in your directory org/getopt/solr/analysis/.
>>>
>>> And a class which extends the BaseTokenFilterFactory, right?
>>> ...
>>> public class StempelTokenFilterFactory extends BaseTokenFilterFactory
>>> implements ResourceLoaderAware {
>>> ...
>>>
>>>
>>>
>>> On 02.11.2010 14:20, Jakub Godawa wrote:
>>>> This is what stempel-1.0.jar consists of after jar -xf:
>>>>
>>>> jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R org/
>>>> org/:
>>>> egothor getopt
>>>>
>>>> org/egothor:
>>>> stemmer
>>>>
>>>> org/egothor/stemmer:
>>>> Cell.class Diff.class Gener.class MultiTrie2.class
>>>> Optimizer2.class Reduce.class Row.class TestAll.class
>>>> TestLoad.class Trie$StrEnum.class
>>>> Compile.class DiffIt.class Lift.class MultiTrie.class
>>>> Optimizer.class Reduce$Remap.class Stock.class Test.class
>>>> Trie.class
>>>>
>>>> org/getopt:
>>>> stempel
>>>>
>>>> org/getopt/stempel:
>>>> Benchmark.class lucene Stemmer.class
>>>>
>>>> org/getopt/stempel/lucene:
>>>> StempelAnalyzer.class StempelFilter.class
>>>> jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R META-INF/
>>>> META-INF/:
>>>> MANIFEST.MF
>>>> jgod...@ubuntu:~/apache-solr-1.4.1/ifaq/lib$ ls -R res
>>>> res:
>>>> tables
>>>>
>>>> res/tables:
>>>> readme.txt stemmer_1000.out stemmer_100.out stemmer_2000.out
>>>> stemmer_200.out stemmer_500.out stemmer_700.out
>>>>
>>>> 2010/11/2 Bernd Fehling <[email protected]>:
>>>>> Hi Jakub,
>>>>>
>>>>> if you unzip your stempel-1.0.jar do you have the
>>>>> required directory structure and file in there?
>>>>> org/getopt/stempel/lucene/StempelFilter.class
>>>>>
>>>>> Regards,
>>>>> Bernd
>>>>>
>>>>> On 02.11.2010 13:54, Jakub Godawa wrote:
>>>>>> Erick, I've put the jar files like that before. I also added the
>>>>>> directive and put the file in instanceDir/lib
>>>>>>
>>>>>> What is still a problem is that, even though the file is loaded:
>>>>>> 2010-11-02 13:20:48 org.apache.solr.core.SolrResourceLoader
>>>>>> replaceClassLoader
>>>>>> INFO: Adding
>>>>>> 'file:/home/jgodawa/apache-solr-1.4.1/ifaq/lib/stempel-1.0.jar'
>>>>>> to classloader
>>>>>>
>>>>>> I am not able to use the FilterFactory... maybe I am attempting it in
>>>>>> a wrong way?
>>>>>>
>>>>>> Cheers,
>>>>>> Jakub Godawa.
>>>>>>
>>>>>> 2010/11/2 Erick Erickson <[email protected]>:
>>>>>>> The Polish stemmer jar file needs to be findable by Solr. If you copy
>>>>>>> it to <solr_home>/lib and restart Solr, you should be set.
>>>>>>>
>>>>>>> Alternatively, you can add another <lib> directive to the
>>>>>>> solrconfig.xml file (there are several examples in that file already).
>>>>>>>
>>>>>>> I'm a little confused about not being able to find TokenFilter. Is that
>>>>>>> still a problem?
>>>>>>>
>>>>>>> HTH
>>>>>>> Erick
>>>>>>>
>>>>>>> On Tue, Nov 2, 2010 at 8:07 AM, Jakub Godawa <[email protected]> wrote:
>>>>>>>
>>>>>>>> Thank you Bernd! I couldn't make it run though. Here is my problem:
>>>>>>>>
>>>>>>>> 1. There is a file ~/apache-solr-1.4.1/ifaq/lib/stempel-1.0.jar
>>>>>>>> 2. In ~/apache-solr-1.4.1/ifaq/solr/conf/solrconfig.xml there is a
>>>>>>>> directive: <lib path="../lib/stempel-1.0.jar" />
>>>>>>>> 3. In ~/apache-solr-1.4.1/ifaq/solr/conf/schema.xml there is fieldType:
>>>>>>>>
>>>>>>>> (...)
>>>>>>>> <!-- Polish -->
>>>>>>>> <fieldType name="text_pl" class="solr.TextField">
>>>>>>>> <analyzer>
>>>>>>>> <tokenizer class="solr.WhitespaceTokenizerFactory"/>
>>>>>>>> <filter class="solr.LowerCaseFilterFactory"/>
>>>>>>>> <filter class="org.getopt.stempel.lucene.StempelFilter" />
>>>>>>>> <!-- <filter
>>>>>>>> class="org.getopt.solr.analysis.StempelTokenFilterFactory"
>>>>>>>> protected="protwords.txt" /> -->
>>>>>>>> </analyzer>
>>>>>>>> </fieldType>
>>>>>>>> (...)
>>>>>>>>
>>>>>>>> 4. jar file is loaded but I got an error:
>>>>>>>> SEVERE: Could not start SOLR. Check solr/home property
>>>>>>>> java.lang.NoClassDefFoundError: org/apache/lucene/analysis/TokenFilter
>>>>>>>> at java.lang.ClassLoader.defineClass1(Native Method)
>>>>>>>> at java.lang.ClassLoader.defineClass(ClassLoader.java:634)
>>>>>>>> at
>>>>>>>> java.security.SecureClassLoader.defineClass(SecureClassLoader.java:142)
>>>>>>>> (...)
>>>>>>>>
>>>>>>>> 5. A different class gave me this one:
>>>>>>>> SEVERE: org.apache.solr.common.SolrException: Error loading class
>>>>>>>> 'org.getopt.solr.analysis.StempelTokenFilterFactory'
>>>>>>>> at
>>>>>>>> org.apache.solr.core.SolrResourceLoader.findClass(SolrResourceLoader.java:375)
>>>>>>>> at
>>>>>>>> org.apache.solr.core.SolrResourceLoader.newInstance(SolrResourceLoader.java:390)
>>>>>>>> (...)
>>>>>>>>
>>>>>>>> Question is: How to make <fieldType /> and <filter /> work with that
>>>>>>>> Stempel? :)
>>>>>>>>
>>>>>>>> Cheers,
>>>>>>>> Jakub Godawa.
>>>>>>>>
>>>>>>>> 2010/10/29 Bernd Fehling <[email protected]>:
>>>>>>>>> Hi Jakub,
>>>>>>>>>
>>>>>>>>> I have ported the KStemmer for use in the most recent Solr trunk version.
>>>>>>>>> My stemmer is located in the lib directory of Solr,
>>>>>>>>> "solr/lib/KStemmer-2.00.jar", because it belongs to Solr.
>>>>>>>>>
>>>>>>>>> Write it as a FilterFactory and use it as a filter like this:
>>>>>>>>> <filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
>>>>>>>>>         protected="protwords.txt" />
>>>>>>>>>
>>>>>>>>> This is what my fieldType looks like:
>>>>>>>>>
>>>>>>>>> <fieldType name="text_kstem" class="solr.TextField"
>>>>>>>>>            positionIncrementGap="100">
>>>>>>>>>   <analyzer type="index">
>>>>>>>>>     <tokenizer class="solr.WhitespaceTokenizerFactory" />
>>>>>>>>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>>>>>>>>             words="stopwords.txt" enablePositionIncrements="false" />
>>>>>>>>>     <filter class="solr.WordDelimiterFilterFactory"
>>>>>>>>>             generateWordParts="1" generateNumberParts="1"
>>>>>>>>>             catenateWords="1" catenateNumbers="1"
>>>>>>>>>             catenateAll="0" splitOnCaseChange="1" />
>>>>>>>>>     <filter class="solr.LowerCaseFilterFactory" />
>>>>>>>>>     <filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
>>>>>>>>>             protected="protwords.txt" />
>>>>>>>>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>>>>>>>>>   </analyzer>
>>>>>>>>>   <analyzer type="query">
>>>>>>>>>     <tokenizer class="solr.WhitespaceTokenizerFactory" />
>>>>>>>>>     <filter class="solr.StopFilterFactory" ignoreCase="true"
>>>>>>>>>             words="stopwords.txt" />
>>>>>>>>>     <filter class="solr.WordDelimiterFilterFactory"
>>>>>>>>>             generateWordParts="1" generateNumberParts="1"
>>>>>>>>>             catenateWords="0" catenateNumbers="0"
>>>>>>>>>             catenateAll="0" splitOnCaseChange="1" />
>>>>>>>>>     <filter class="solr.LowerCaseFilterFactory" />
>>>>>>>>>     <filter class="de.ubbielefeld.solr.analysis.KStemFilterFactory"
>>>>>>>>>             protected="protwords.txt" />
>>>>>>>>>     <filter class="solr.RemoveDuplicatesTokenFilterFactory" />
>>>>>>>>>   </analyzer>
>>>>>>>>> </fieldType>
>>>>>>>>>
>>>>>>>>> Regards,
>>>>>>>>> Bernd
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 28.10.2010 14:56, Jakub Godawa wrote:
>>>>>>>>>> Hi!
>>>>>>>>>> There is a Polish stemmer, http://www.getopt.org/stempel/, and I have
>>>>>>>>>> problems connecting it with Solr 1.4.1.
>>>>>>>>>> Questions:
>>>>>>>>>>
>>>>>>>>>> 1. Where EXACTLY do I put the "stempel-1.0.jar" file?
>>>>>>>>>> 2. How do I register the file, so I can build a fieldType like:
>>>>>>>>>>
>>>>>>>>>> <fieldType name="text_pl" class="solr.TextField">
>>>>>>>>>> <analyzer
>>>>>>>>>> class="org.geoopt.solr.analysis.StempelTokenFilterFactory"/>
>>>>>>>>>> </fieldType>
>>>>>>>>>>
>>>>>>>>>> 3. Is that the right approach to make it work?
>>>>>>>>>>
>>>>>>>>>> Thanks for verbose explanation,
>>>>>>>>>> Jakub.
>>>
>>
>
>
>
> --
> Lance Norskog
> [email protected]
>