Re: Russian stopwords

2013-05-24 Thread igiguere
A colleague stumbled upon this: http://stackoverflow.com/questions/361975/setting-the-default-java-character-encoding The second answer, setting the environment variable JAVA_TOOL_OPTIONS, did the job. JAVA_TOOL_OPTIONS: -Dfile.encoding=UTF8 Happy stop-wording!
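
For anyone wondering why the default encoding matters here, this is a minimal sketch and not Solr's actual loading code. It decodes the UTF-8 bytes of one Cyrillic letter twice: once as UTF-8, and once with a single-byte legacy charset. ISO-8859-1 is only a stand-in for whatever default a given Windows install uses, and the class and method names are made up for illustration.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class EncodingCheck {
    // Decode raw bytes with an explicit charset instead of the JVM default.
    static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // U+0438 is the Cyrillic small letter "i"; in UTF-8 it takes two bytes.
        byte[] utf8Bytes = "\u0438".getBytes(StandardCharsets.UTF_8);

        String correct = decode(utf8Bytes, StandardCharsets.UTF_8);
        // Stand-in for a legacy default charset (an assumption, not the
        // charset of any particular machine in this thread).
        String garbled = decode(utf8Bytes, StandardCharsets.ISO_8859_1);

        System.out.println("UTF-8 decode length:  " + correct.length());
        System.out.println("Legacy decode length: " + garbled.length());
        System.out.println("Same word? " + correct.equals(garbled));
    }
}
```

A stopword decoded the second way no longer equals the token the analyzer produces, so the filter silently matches nothing. Setting -Dfile.encoding changes which charset the JVM picks when the code reading the file does not name one.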

Re: Russian stopwords

2013-05-24 Thread Alexandre Rafalovitch
Sounds like maybe a UTF-specific issue when you are _reading it in_. See if you can change the default locale before starting the Java process (I think it is an environment variable) and check whether that has an impact. If you have a very easy test case, I would be happy to check it on Mac and Windows.
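
A quick way to compare two machines is to print what the JVM actually picked up at startup. This is only a diagnostic sketch (the class name is made up); it uses standard JDK calls and says nothing about how Solr itself reads the file.

```java
import java.nio.charset.Charset;
import java.util.Locale;

class JvmDefaults {
    // Summarize the settings that influence default text decoding.
    static String describe() {
        return "file.encoding=" + System.getProperty("file.encoding")
             + ", defaultCharset=" + Charset.defaultCharset().name()
             + ", locale=" + Locale.getDefault().toLanguageTag();
    }

    public static void main(String[] args) {
        System.out.println(describe());
    }
}
```

Running it on both machines, with and without the extra JVM option, should show whether the default charset is what differs between them.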

Re: Russian stopwords

2013-05-24 Thread igiguere
Just so everyone knows: it turns out my stopwords.txt was OK after all. It functions correctly on Linux (Ubuntu) and, strangely, on a colleague's Windows 7. My computer is also Windows 7. The only difference between the two Windows machines is the language of the interface (French for mine, English fo

Re: Russian stopwords

2013-05-22 Thread igiguere
I'm encountering the same issue, but my Russian stopwords.txt IS encoded in UTF-8. I verified the encoding using EmEditor (I've used it for years, and I use it for the existing English, French, Spanish, Portuguese and German Solr configurations, without issues). Just to make extra sure, I downloa

RE: Russian stopwords

2008-12-06 Thread Lance Norskog
Sent: Saturday, December 06, 2008 1:17 AM To: solr-user@lucene.apache.org Subject: RE: Russian stopwords Hi Steve, You were right, it turned out to be an encoding issue, but a really weird one. I was using Windows Notepad to save the stopwords file in UTF-8 encoding. On the other h

RE: Russian stopwords

2008-12-06 Thread tushar kapoor
Hi Steve, You were right, it turned out to be an encoding issue, but a really weird one. I was using Windows Notepad to save the stopwords file in UTF-8 encoding. On the other hand I was using EditPlus to save the synonyms file. That was the only difference. The moment I switched to EditPlus for sa
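
The message above is cut off, so the exact cause is not visible here. One well-known difference between editors is that older Windows Notepad writes a byte order mark (BOM) at the start of a file saved as UTF-8, and Java's UTF-8 decoder keeps it as the character U+FEFF. Assuming that is what happened, the first stopword would carry an invisible prefix. The sketch below is a hypothetical word-list parser (not Solr's loader) showing how such a prefix can be stripped.

```java
import java.util.ArrayList;
import java.util.List;

class BomAwareStopwords {
    static final char BOM = '\uFEFF';

    // Drop a single leading BOM character if one is present.
    static String stripBom(String text) {
        boolean hasBom = !text.isEmpty() && text.charAt(0) == BOM;
        return hasBom ? text.substring(1) : text;
    }

    // Split already-decoded file content into one word per non-blank line.
    static List<String> parse(String content) {
        List<String> words = new ArrayList<>();
        for (String line : stripBom(content).split("\\R")) {
            String word = line.trim(); // trim() alone would not remove U+FEFF
            if (!word.isEmpty()) {
                words.add(word);
            }
        }
        return words;
    }

    public static void main(String[] args) {
        // Simulated file content: BOM, then two Cyrillic one-letter words.
        String content = BOM + "\u0438\n\u0432\n";
        System.out.println(parse(content).size() + " words parsed");
    }
}
```

Without the stripBom call, the first entry would be the BOM followed by the word, which never equals the token coming out of the tokenizer.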

RE: Russian stopwords

2008-12-05 Thread Steven A Rowe
Hi Tushar, On 12/05/2008 at 5:18 AM, tushar kapoor wrote: > I am trying to filter Russian stopwords but have not been > successful with that. [...] > words="stopwords.txt"/> >ignoreCase="true" expand="false"/> [...] > Interestingly, Russian synonyms are work