I suppose you mean Extract_ing_RequestHandler.
Out of curiosity, I sent in a Japanese HTML file of EUC-JP encoding,
and it converted to Unicode properly and the index has correct
Japanese words.
Do your HTML files have a META tag for Content-type with a value
containing charset= ? For example, this
09 9:18 AM
To: 'solr-user@lucene.apache.org'
Subject: RE: encoding problem
Still having a few issues with encoding, although I've been able to resolve the
particular issue below by just re-editing the affected record.
The other encoding issue is with Greek characters. With sol
hough...@deakin.edu.au]
Sent: Friday, 28 August 2009 9:31 AM
To: 'solr-user@lucene.apache.org'; 'yo...@lucidimagination.com'
Subject: RE: encoding problem
Shalin, the XML from solr admin for the relevant field is displaying as -
Moncrieff, Joan, Macauley, Peter and Epps, Janine 2006, “My Universe is Here�: Implications
For the Future of Academic Libraries From the Results of a Survey of
Researchers, vol. 38, no. 2, pp. 71-83.
The wei
Message-
> From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
> Sent: Wednesday, 26 August 2009 5:50 PM
> To: solr-user@lucene.apache.org
> Subject: Re: encoding problem
>
> On Wed, Aug 26, 2009 at 12:52 PM, Bernadette Houghton <
> bernadette.hough...@deakin.edu.au> wrote:
> Thanks for your quick reply, Sh
If you are complaining about a web application (other than SOLR), probably
behind Apache HTTPD, having an encoding problem, try to troubleshoot it
with Mozilla Firefox and the Live HTTP Headers plugin.
Look at the charset in the "Content-Type" HTTP response header, and don't
forget about the META tag inside the HTML...
-Fuad
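A minimal sketch in Python (not from the thread) of the check described above: read the declared charset from a Content-Type header value and from an HTML META tag, so the two can be compared. The function names and the sample strings are hypothetical, and the regular expression is a rough match, not a full HTML parser.

```python
import re
from email.message import Message

# Rough pattern for charset=... inside a <meta ...> tag (illustrative only).
META_CHARSET = re.compile(
    r'<meta[^>]+charset\s*=\s*["\']?([A-Za-z0-9_\-]+)', re.IGNORECASE
)


def charset_from_content_type(header_value):
    """Return the charset parameter of a Content-Type value, or None."""
    msg = Message()
    msg["Content-Type"] = header_value
    # get_content_charset() lower-cases the value and returns None if absent.
    return msg.get_content_charset()


def charset_from_meta(html):
    """Return the first charset declared in a META tag, or None."""
    match = META_CHARSET.search(html)
    return match.group(1).lower() if match else None


if __name__ == "__main__":
    header = "text/html; charset=EUC-JP"  # hypothetical response header
    page = '<meta http-equiv="Content-Type" content="text/html; charset=EUC-JP">'
    print(charset_from_content_type(header), charset_from_meta(page))
```

If the two values disagree, or the header carries no charset at all, that mismatch is a likely place to start looking.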
On Wed, Aug 26, 2009 at 12:52 PM, Bernadette Houghton <
bernadette.hough...@deakin.edu.au> wrote:
> Thanks for your quick reply, Shalin.
>
> Tomcat is running on my Windows machine, but does not appear in Windows
> Services (as I was expecting it should ... am I wrong?). I'm running it from
> a st
Thanks for your quick reply, Shalin.
Tomcat is running on my Windows machine, but does not appear in Windows
Services (as I was expecting it should ... am I wrong?). I'm running it from a
startup.bat on my desktop - see below. Do I add the Dfile line to the
startup.bat?
SOLR is part of the rep
On Wed, Aug 26, 2009 at 12:42 PM, Bernadette Houghton <
bernadette.hough...@deakin.edu.au> wrote:
> Hi Shalin, stupid question - I'm an apache/solr newbie - but how do I
> access the JVM???
>
When you execute the java executable, just add -Dfile.encoding=UTF-8 as a
command line argument to the ex
Hi Shalin, stupid question - I'm an apache/solr newbie - but how do I access
the JVM???
Regards
Bern
-Original Message-
From: Shalin Shekhar Mangar [mailto:shalinman...@gmail.com]
Sent: Wednesday, 26 August 2009 5:10 PM
To: solr-user@lucene.apache.org
Subject: Re: encoding proble
On Wed, Aug 26, 2009 at 10:24 AM, Bernadette Houghton <
bernadette.hough...@deakin.edu.au> wrote:
> We have an encoding problem with our solr application. That is, non-ASCII
> chars displaying fine in SOLR, but in gobbledegook in our application.
>
> Our tomcat server.xml file already contains UR
Thanks, I detected that same problem.
I have CP 1252 as the system file encoding and was saving the data-config.xml
file in UTF-8. DIH was reading it using the default encoding.
One possible workaround was using InputStream and OutputStream like DIH,
but the files won't be in UTF-8 if the system has different
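
The effect described in this message can be reproduced outside Solr. The sketch below is in Python and is not DIH code: it only shows what happens when bytes written as UTF-8 are decoded with CP 1252, the platform default mentioned above. The sample value is hypothetical.

```python
def read_with(encoding, raw_bytes):
    """Decode raw file bytes using the given encoding."""
    return raw_bytes.decode(encoding)


if __name__ == "__main__":
    original = "Instalação"               # hypothetical accented value
    on_disk = original.encode("utf-8")    # what a UTF-8 editor saves

    # Decoding with the wrong (platform default) charset garbles the accents.
    print(read_with("cp1252", on_disk))
    # Decoding with the charset the file was written in gives the text back.
    print(read_with("utf-8", on_disk))
```

Naming the encoding explicitly when reading, instead of relying on the platform default, avoids the dependency on the system setting.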
On Sat, Mar 28, 2009 at 12:51 AM, Shalin Shekhar Mangar <
shalinman...@gmail.com> wrote:
>
> I see that you are specifying the topologyname's value in the query itself.
> It might be a bug in DataImportHandler because it reads the data-config as a
> string from an InputStream. If your default plat
On Fri, Mar 27, 2009 at 8:41 PM, Rui Pereira wrote:
> I'm having problems with encoding in responses from search queries. The
> encoding problem only occurs in the topologyname field, if an instancename
> has accents it is returned correctly. In all my configurations I have
> UTF-8.
>
>
>
>
>
Hi,
I had the same problem with DataImportHandler: I have a UTF-8 MySQL
database, but it seems that DIH imports the data as Latin... So I just use a
Transformer to (re)encode my strings in UTF-8.
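
For illustration only, here is the string operation such a re-encoding step performs, sketched in Python. This is not the DIH Transformer API (which is Java); the function name is hypothetical, and it assumes the value holds UTF-8 bytes that were decoded as Latin-1.

```python
def reencode_latin1_as_utf8(value):
    """Undo a Latin-1 mis-decoding of UTF-8 text.

    Returns the value unchanged when it cannot be such a mis-decoding,
    so text that is already correct is left alone.
    """
    try:
        return value.encode("latin-1").decode("utf-8")
    except UnicodeError:
        # Covers both characters outside Latin-1 and invalid UTF-8 sequences.
        return value


if __name__ == "__main__":
    garbled = "café".encode("utf-8").decode("latin-1")  # simulate the bad import
    print(garbled, "->", reencode_latin1_as_utf8(garbled))
```

A repair like this treats the symptom; where possible, declaring the right encoding on the database connection avoids the mis-decoding in the first place.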
Rui Pereira-2 wrote:
>
> I'm having problems with encoding in responses from search queries. The
> encod