Re: AW: UTF-8 2-byte vs 4-byte encodings

2007-05-02 Thread Gereon Steffens
Hi Christian,

> It is not sufficient to set the encoding in the XML but
> you need an additional HTTP header to set the encoding ("Content-type:
> text/xml; charset=UTF-8")

Thanks, that's what I was missing.

Gereon
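
For readers following along, here is a minimal sketch of sending the charset in the HTTP header, as the quoted advice describes. It only builds the request and does not send it. The update URL and the helper name build_update_request are assumptions for illustration and do not come from the thread; adjust them to your installation.

```python
import urllib.request

# Assumption: a local Solr instance with its update handler at this URL.
SOLR_UPDATE_URL = "http://localhost:8983/solr/update"


def build_update_request(xml_text: str) -> urllib.request.Request:
    """Build a POST request whose header declares the body as UTF-8 XML."""
    # Encode the document explicitly so the bytes match the declared charset.
    body = xml_text.encode("utf-8")
    return urllib.request.Request(
        SOLR_UPDATE_URL,
        data=body,
        headers={"Content-Type": "text/xml; charset=UTF-8"},
        method="POST",
    )


# Hypothetical document containing an "e acute" character.
request = build_update_request(
    '<add><doc><field name="id">caf\u00e9</field></doc></add>'
)
# urllib.request.urlopen(request) would actually send it; omitted here.
```

Encoding the body explicitly keeps the bytes on the wire consistent with the charset named in the header, so the receiver does not have to guess.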

AW: UTF-8 2-byte vs 4-byte encodings

2007-05-02 Thread Burkamp, Christian
xt/xml; charset=UTF-8")

--Christian

-----Original Message-----
From: Gereon Steffens [mailto:[EMAIL PROTECTED]
Sent: Wednesday, 2 May 2007 09:59
To: solr-user@lucene.apache.org
Subject: UTF-8 2-byte vs 4-byte encodings

Hi, I have a question regarding UTF-8 encodings, illu

UTF-8 2-byte vs 4-byte encodings

2007-05-02 Thread Gereon Steffens
Hi, I have a question regarding UTF-8 encodings, illustrated by the utf8-example.xml file. This file contains raw, unescaped UTF-8 characters, for example the "e acute" character, represented as two bytes 0xC3 0xA9. When this file is added to Solr and retrieved later, the XML output contains a fou
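
The message above is cut off, so the exact bytes that came back are not known. As a hedged illustration only, the sketch below shows one common way a two-byte UTF-8 character turns into four bytes: the receiver decodes the bytes as ISO-8859-1 and the result is then re-encoded as UTF-8. The ISO-8859-1 fallback is an assumption, not something the thread states.

```python
# "e acute" is U+00E9; encoded as UTF-8 it is the two bytes 0xC3 0xA9.
e_acute = "\u00e9"
utf8_bytes = e_acute.encode("utf-8")

# Assumption: the receiver reads those bytes as ISO-8859-1 rather than UTF-8,
# so each byte is treated as a separate character (U+00C3 and U+00A9).
misread = utf8_bytes.decode("iso-8859-1")

# Writing the misread text back out as UTF-8 gives two bytes per character,
# four bytes in total.
double_encoded = misread.encode("utf-8")

print(utf8_bytes.hex())      # c3a9
print(double_encoded.hex())  # c383c2a9
```

If the four bytes seen in the output match c3 83 c2 a9, a charset mismatch on input is a likely cause.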