Re: UTF-8 support during indexing content

2012-02-01 Thread Chris Hostetter
: Subject: UTF-8 support during indexing content
: References: <8ce9f966c6f6769-19a0-9e...@webmail-m069.sysops.aol.com>
:  <1326447127.1952.10.camel@snape>
:  <8ceade0f7e0ecec-189c-c...@webmail-m069.sysops.aol.com>
:  <1328105200.2033.33.camel@snape>
: In-Reply-To: <1328105200.2033.33.camel@snape>

RE: UTF-8 support during indexing content

2012-02-01 Thread Van Tassell, Kristian
-----Original Message-----
From: Travis Low [mailto:t...@4centurion.com]
Sent: Wednesday, February 01, 2012 8:27 AM
To: solr-user@lucene.apache.org
Subject: Re: UTF-8 support during indexing content

Are you sure the input document is in UTF-8? That looks like classic ISO-8859-1-treated-as-UTF-8. How d

Re: UTF-8 support during indexing content

2012-02-01 Thread Travis Low
Are you sure the input document is in UTF-8? That looks like classic ISO-8859-1-treated-as-UTF-8. How did you confirm the document contains the right quote marks immediately prior to uploading? If you just visually inspected it, then use whatever tool you viewed it in to see what the character s
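
For anyone wanting to check the bytes rather than trust a visual inspection, here is a minimal Python sketch of the kind of mix-up described above. It is an illustration only, not something from the original thread. It assumes the quote mark in question is U+2019 (right single quotation mark) and uses Windows-1252 as a stand-in for "ISO-8859-1", since strict ISO-8859-1 has no curly quotes at all. The helper name `is_valid_utf8` is hypothetical.

```python
# Assumption: the problem character is a curly apostrophe, U+2019.
quote = "\u2019"

# The same character has different byte representations per encoding.
utf8_bytes = quote.encode("utf-8")      # three bytes: E2 80 99
cp1252_bytes = quote.encode("cp1252")   # one byte: 92

# Case 1: UTF-8 bytes wrongly decoded as Windows-1252 give three junk characters.
utf8_read_as_cp1252 = utf8_bytes.decode("cp1252")

# Case 2: Windows-1252 bytes wrongly decoded as UTF-8 are invalid;
# with errors="replace" the decoder substitutes U+FFFD.
cp1252_read_as_utf8 = cp1252_bytes.decode("utf-8", errors="replace")


def is_valid_utf8(data: bytes) -> bool:
    """Hypothetical helper: True if the bytes decode cleanly as UTF-8."""
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


if __name__ == "__main__":
    print(utf8_bytes.hex(), cp1252_bytes.hex())
    print(repr(utf8_read_as_cp1252), repr(cp1252_read_as_utf8))
    print(is_valid_utf8(utf8_bytes), is_valid_utf8(cp1252_bytes))
```

Running the same check on the raw bytes of the file being uploaded (opened in binary mode) shows whether the document is really UTF-8 before it reaches the indexer.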