I'm trying to use the HunspellStemFilterFactory, but I'm having trouble loading a dictionary.
I downloaded the .dic and .aff files for en_GB, en_US and nl_NL from the OpenOffice site, but they all give me the same error. When I use them as is, I get:

Oct 26, 2012 2:39:37 PM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: Unable to load hunspell data! [dictionary=en_GB.dic,affix=en_GB.aff]
    at org.apache.solr.analysis.HunspellStemFilterFactory.inform(HunspellStemFilterFactory.java:87)
    at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:551)
    ....
Caused by: java.text.ParseException: The first non-comment line in the affix file must be a 'SET charset', was: 'FLAG num'
    at org.apache.lucene.analysis.hunspell.HunspellDictionary.getDictionaryEncoding(HunspellDictionary.java:280)
    at org.apache.lucene.analysis.hunspell.HunspellDictionary.<init>(HunspellDictionary.java:112)
    at org.apache.solr.analysis.HunspellStemFilterFactory.inform(HunspellStemFilterFactory.java:85)
    ... 32 more

When I add the following first line to both the .dic and the .aff file:

SET UTF-8

the error message changes to:

Oct 26, 2012 10:16:42 AM org.apache.solr.common.SolrException log
SEVERE: java.lang.RuntimeException: Unable to load hunspell data! [dictionary=en_GB.dic,affix=en_GB.aff]
    at org.apache.solr.analysis.HunspellStemFilterFactory.inform(HunspellStemFilterFactory.java:87)
    at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:551)
    ....
Caused by: java.nio.charset.IllegalCharsetNameException: 'UTF-8'
    at java.nio.charset.Charset.checkName(Charset.java:284)
    at java.nio.charset.Charset.lookup2(Charset.java:458)
    at java.nio.charset.Charset.lookup(Charset.java:437)
    at java.nio.charset.Charset.forName(Charset.java:502)
    at org.apache.lucene.analysis.hunspell.HunspellDictionary.getJavaEncoding(HunspellDictionary.java:293)
    at org.apache.lucene.analysis.hunspell.HunspellDictionary.<init>(HunspellDictionary.java:113)
    at org.apache.solr.analysis.HunspellStemFilterFactory.inform(HunspellStemFilterFactory.java:85)
    ... 32 more

I am aware of a similar issue that was raised on this list in December 2011 and escalated to JIRA (https://issues.apache.org/jira/browse/SOLR-2934), but I am not sure whether it was ever resolved. Or am I just missing something? In either case, could anyone who has working dictionary files share them with me? Any language will do, as long as it works!

I am using Solr 3.6.1 on a Mac running OS X 10.7.5.

- Rob
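
P.S. In case it helps anyone reproduce this: below is a minimal sketch of how the first non-comment line of an .aff file could be inspected at the byte level. It is not taken from Solr or Lucene; the function name first_directive is made up for illustration. It only reports whether the file starts with a UTF-8 byte order mark, whether that line ends in a carriage return, and what follows "SET ". I don't know whether any of these is actually behind the IllegalCharsetNameException, so treat it as a diagnostic aid, not an explanation.

```python
UTF8_BOM = b"\xef\xbb\xbf"


def first_directive(data: bytes) -> dict:
    """Describe the first non-blank, non-comment line of an .aff file.

    Hypothetical helper for illustration only; it does not reproduce
    Lucene's own parsing logic.
    """
    # Note a leading UTF-8 byte order mark, then skip it.
    has_bom = data.startswith(UTF8_BOM)
    if has_bom:
        data = data[len(UTF8_BOM):]

    for raw in data.split(b"\n"):
        stripped = raw.strip()
        # Skip blank lines and '#' comment lines.
        if not stripped or stripped.startswith(b"#"):
            continue
        is_set = stripped.startswith(b"SET ")
        charset = None
        if is_set:
            charset = stripped[4:].strip().decode("ascii", "replace")
        return {
            "has_bom": has_bom,
            "has_cr": raw.endswith(b"\r"),  # CRLF line ending leaves a trailing CR
            "raw": raw,                      # the untouched bytes of the line
            "is_set": is_set,
            "charset": charset,
        }

    # The file contained only blank lines and comments.
    return {"has_bom": has_bom, "has_cr": False, "raw": b"",
            "is_set": False, "charset": None}
```

To try it on a real file, read the file in binary mode and pass the bytes in, for example first_directive(open("en_GB.aff", "rb").read()), then look at the "raw" value for anything unexpected around the charset name.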