Hi,

I'm having issues with special characters in synonyms.txt on Solr 3.5.

I'm running a multi-lingual index and need certain terms to give results across 
all languages no matter what language the user uses.
I figured that this should be easily resolved by just adding the different 
words to synonyms.txt.
This works great as long as I don't use special characters such as åäö.

I've tried a couple of things so far but now I'm completely stuck.

This is completely ignored by Solr:
island, "\u00F6"
and alternatively:
island, "\u00F6n" (this should translate to "ön" which means "the island")

A search for island gives me results containing the word "island" but not 
results containing the word "ö" (island in Swedish), and vice versa.

Directly injecting the letter "ö" into synonyms like so:
island, ön
island, "ön"

produces the following exception on startup (both lines produce the same error):

java.lang.RuntimeException: java.nio.charset.MalformedInputException: Input length = 3
    at org.apache.solr.analysis.FSTSynonymFilterFactory.inform(FSTSynonymFilterFactory.java:92)
    at org.apache.solr.analysis.SynonymFilterFactory.inform(SynonymFilterFactory.java:50)
    at org.apache.solr.core.SolrResourceLoader.inform(SolrResourceLoader.java:546)
    at org.apache.solr.schema.IndexSchema.<init>(IndexSchema.java:126)
    at org.apache.solr.core.CoreContainer.create(CoreContainer.java:461)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:316)
    at org.apache.solr.core.CoreContainer.load(CoreContainer.java:207)
    at org.apache.solr.core.CoreContainer$Initializer.initialize(CoreContainer.java:130)
    at org.apache.solr.servlet.SolrDispatchFilter.init(SolrDispatchFilter.java:94)
    at org.mortbay.jetty.servlet.FilterHolder.doStart(FilterHolder.java:97)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
    at org.mortbay.jetty.servlet.ServletHandler.initialize(ServletHandler.java:713)
    at org.mortbay.jetty.servlet.Context.startContext(Context.java:140)
    at org.mortbay.jetty.webapp.WebAppContext.startContext(WebAppContext.java:1282)
    at org.mortbay.jetty.handler.ContextHandler.doStart(ContextHandler.java:518)
    at org.mortbay.jetty.webapp.WebAppContext.doStart(WebAppContext.java:499)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
    at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
    at org.mortbay.jetty.handler.ContextHandlerCollection.doStart(ContextHandlerCollection.java:156)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
    at org.mortbay.jetty.handler.HandlerCollection.doStart(HandlerCollection.java:152)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
    at org.mortbay.jetty.handler.HandlerWrapper.doStart(HandlerWrapper.java:130)
    at org.mortbay.jetty.Server.doStart(Server.java:224)
    at org.mortbay.component.AbstractLifeCycle.start(AbstractLifeCycle.java:50)
    at org.mortbay.xml.XmlConfiguration.main(XmlConfiguration.java:985)
    at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
    at sun.reflect.NativeMethodAccessorImpl.invoke(Unknown Source)
    at sun.reflect.DelegatingMethodAccessorImpl.invoke(Unknown Source)
    at java.lang.reflect.Method.invoke(Unknown Source)
    at org.mortbay.start.Main.invokeMain(Main.java:194)
    at org.mortbay.start.Main.start(Main.java:534)
    at org.mortbay.start.Main.start(Main.java:441)
    at org.mortbay.start.Main.main(Main.java:119)
Caused by: java.nio.charset.MalformedInputException: Input length = 3
    at java.nio.charset.CoderResult.throwException(Unknown Source)
    at sun.nio.cs.StreamDecoder.implRead(Unknown Source)
    at sun.nio.cs.StreamDecoder.read(Unknown Source)
    at java.io.InputStreamReader.read(Unknown Source)
    at java.io.BufferedReader.fill(Unknown Source)
    at java.io.BufferedReader.readLine(Unknown Source)
    at java.io.LineNumberReader.readLine(Unknown Source)
    at org.apache.lucene.analysis.synonym.SolrSynonymParser.addInternal(SolrSynonymParser.java:82)
    at org.apache.lucene.analysis.synonym.SolrSynonymParser.add(SolrSynonymParser.java:70)
    at org.apache.solr.analysis.FSTSynonymFilterFactory.loadSolrSynonyms(FSTSynonymFilterFactory.java:122)
    at org.apache.solr.analysis.FSTSynonymFilterFactory.inform(FSTSynonymFilterFactory.java:84)
    ... 33 more
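
Since the "Caused by" part of the trace comes from the reader that decodes 
the file, one thing that may be worth checking is whether synonyms.txt is 
valid UTF-8 on disk. Below is a minimal sketch of such a check. It assumes 
the synonym parser reads the file as UTF-8, which I have not confirmed from 
the Solr source, and the class and method names (Utf8Check, isValidUtf8) are 
made up for illustration:

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CodingErrorAction;

public class Utf8Check {

    // Returns true if the bytes decode cleanly as UTF-8, false otherwise.
    static boolean isValidUtf8(byte[] bytes) {
        try {
            Charset.forName("UTF-8").newDecoder()
                    // Report bad input instead of silently replacing it.
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT)
                    .decode(ByteBuffer.wrap(bytes));
            return true;
        } catch (CharacterCodingException e) {
            // MalformedInputException is a subclass of this exception.
            return false;
        }
    }

    public static void main(String[] args) {
        // "\u00F6" is the letter o with diaeresis, escaped to keep this source ASCII-only.
        String line = "island, \u00F6n";

        // The same line of text, encoded two different ways.
        byte[] asUtf8 = line.getBytes(Charset.forName("UTF-8"));
        byte[] asLatin1 = line.getBytes(Charset.forName("ISO-8859-1"));

        System.out.println(isValidUtf8(asUtf8));   // true
        System.out.println(isValidUtf8(asLatin1)); // false
    }
}
```

To test the real file, the same method could be fed the raw bytes of 
synonyms.txt instead of the sample line.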

Does anyone have any ideas on how to solve this issue?

Thanks,
Carl
