Specifying file.encoding did work, although I don't think it is a suitable workaround for my use case. Any idea what my next step is to get a bug opened?
Thanks,
Joe

> Date: Wed, 18 Nov 2009 16:15:55 +0530
> Subject: Re: UTF-8 Character Set not specifed on OutputStreamWriter in StreamingUpdateSolrServer
> From: shalinman...@gmail.com
> To: solr-user@lucene.apache.org
>
> On Wed, Nov 18, 2009 at 6:56 AM, Joe Kessel <isjust...@hotmail.com> wrote:
>
> > While trying to make use of the StreamingUpdateSolrServer for updates with
> > the release code for Solr 1.4 I noticed some characters such as é did not
> > show up in the index correctly. The code should set the CharsetName via the
> > constructor of the OutputStreamWriter. I noticed that the
> > CommonsHttpSolrServer seems to set the charset to UTF-8. As a workaround I
> > am able to use the CommonsHttpSolrServer. Being new to Solr, not sure what
> > the bug protocol is, assuming this is a bug.
>
> I wrote a simple test case and I'm able to index and query 'é' and other
> characters using StreamingUpdateSolrServer. Can you use -Dfile.encoding=UTF8
> as a JVM parameter and see if that fixes your case? If it does, then it may
> be a Solr bug.
>
> --
> Regards,
> Shalin Shekhar Mangar.
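
For anyone following the thread, the behaviour described above can be reproduced outside Solr. The sketch below is not the StreamingUpdateSolrServer code; the class name WriterCharsetDemo and its two helper methods are made up for illustration. It only shows that an OutputStreamWriter built without a charset falls back to the JVM default (which -Dfile.encoding influences), while one built with an explicit charset always produces the same bytes. It assumes Java 7 or later for StandardCharsets.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.OutputStreamWriter;
import java.io.Writer;
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical demo class, not part of Solr or SolrJ.
public class WriterCharsetDemo {

    // Encode text through an OutputStreamWriter with an explicit charset.
    public static byte[] encode(String text, Charset charset) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out, charset)) {
            writer.write(text);
        }
        return out.toByteArray();
    }

    // Encode text through an OutputStreamWriter that relies on the JVM
    // default charset, so the result depends on the platform settings.
    public static byte[] encodeWithDefault(String text) throws IOException {
        ByteArrayOutputStream out = new ByteArrayOutputStream();
        try (Writer writer = new OutputStreamWriter(out)) {
            writer.write(text);
        }
        return out.toByteArray();
    }

    public static void main(String[] args) throws IOException {
        String text = "\u00e9"; // the character 'é'

        System.out.println("JVM default charset: " + Charset.defaultCharset());
        // Output of the next line varies with the default charset.
        System.out.println("default:    " + Arrays.toString(encodeWithDefault(text)));
        // UTF-8 encodes 'é' as two bytes, 0xC3 0xA9: prints [-61, -87]
        System.out.println("UTF-8:      " + Arrays.toString(encode(text, StandardCharsets.UTF_8)));
        // ISO-8859-1 encodes 'é' as one byte, 0xE9: prints [-23]
        System.out.println("ISO-8859-1: " + Arrays.toString(encode(text, StandardCharsets.ISO_8859_1)));
    }
}
```

If the "default" line differs from the "UTF-8" line on the machine running the indexing client, a server that expects UTF-8 would see a different byte sequence for 'é', which would be consistent with the symptom reported here.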