Stefan Behnel <[EMAIL PROTECTED]> wrote:
> Stefan Scholl wrote:
>> Stefan Behnel <[EMAIL PROTECTED]> wrote:
>>> Stefan Scholl wrote:
>>>> Stefan Behnel <[EMAIL PROTECTED]> wrote:
>>>>> Stefan Scholl wrote:
>>>>>> Well, http://docs.python.org/lib/module-xml.sax.html is missing
>>>>>> the fact, that I can't use Unicode with parseString().
>>>>>>
>>>>>> This parseString() uses cStringIO.
>>>>> Well, Python unicode is not a valid *byte* encoding for XML.
>>>>>
>>>>> lxml.etree can parse unicode, if you really want, but otherwise, you
>>>>> should maybe stick to well-formed XML.
>>>> The XML is well-formed. Works perfect in Python 2.4 with Python
>>>> unicode and Python sax parser.
>>> The XML is *not* well-formed if you pass Python unicode instead of a byte
>>> encoded string. Read the XML spec.
>>>
>>> It would be well-formed if you added the proper XML declaration, but that is
>>> system specific (UCS-4 or UTF-16, BE or LE). So don't even try.
>>
>> Who cares? I'm not calling any external tools.
>
> XML cares. If you want to work with something that is not XML, do not expect
> XML tools to help you do it. XML tools work with XML, and there is a spec that
> says what XML is. Your string is not XML.

This isn't some sophisticated XML tool telling me the string is wrong.
It's a changed behavior of cStringIO that throws an exception, while
I'm just using the parseString() method of xml.sax.

We are both repeating ourselves. I don't think this thread is bringing
up anything new.

I'm all for correct XML and hate XML bozos. But there are limits
you have to learn after a few years.
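
For anyone hitting the same exception, here is a minimal sketch of the usual workaround: encode the text to UTF-8 bytes before handing it to parseString(). It is written for Python 3 rather than the Python 2.4/2.5 discussed in this thread, and TextCollector and parse_text are hypothetical names invented for the illustration. It also assumes the document has no XML declaration naming a different encoding.

```python
import xml.sax


class TextCollector(xml.sax.ContentHandler):
    """Hypothetical handler that gathers all character data it sees."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def characters(self, content):
        # The parser may deliver text in several pieces, so collect them all.
        self.chunks.append(content)


def parse_text(document):
    """Parse an XML document held in a text string and return its text content."""
    handler = TextCollector()
    # Assumption: no XML declaration with a conflicting encoding is present,
    # so the parser treats the bytes as UTF-8 by default.
    xml.sax.parseString(document.encode("utf-8"), handler)
    return "".join(handler.chunks)


result = parse_text("<greeting>Gr\u00fc\u00dfe</greeting>")
```

Encoding explicitly keeps the input a byte stream, which is what the XML spec describes, and avoids depending on how a particular Python version buffers the string internally.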
