Package: libxml-dom-perl
Version: 1.43-4
Severity: important

*** Please type your report below this line ***
If an XML file is read and written using XML::DOM, such as:

    (new XML::DOM::Parser)->parsefile('in.xml')->printToFile('out.xml');

with a simple XML file (in.xml) like:

    <?xml version="1.0" encoding="UTF-8"?>
    <blah>ã</blah>

(note the non-ASCII ã, a with tilde), the resulting output file (out.xml)
is incorrectly encoded using (I presume) the default locale, even though
UTF-8 is stated in the XML declaration. This can be checked with xmllint:

    [EMAIL PROTECTED]:~$ xmllint out.xml
    out.xml:2: parser error : Input is not proper UTF-8, indicate encoding !
    Bytes: 0xE3 0x3C 0x2F 0x62
    <blah>ã</blah>
          ^

In addition, methods such as XML::DOM::Element::getAttribute do not return
UTF-8 encoded Perl strings, so Unicode data is corrupted if it is read from
or written to the document.

Regards,

-- David

-- System Information:
Debian Release: testing/unstable
  APT prefers testing
  APT policy: (500, 'testing')
Architecture: i386 (i686)
Shell:  /bin/sh linked to /bin/bash
Kernel: Linux 2.4.27-2-386
Locale: LANG=en_GB, LC_CTYPE=en_GB (charmap=ISO-8859-1)

Versions of packages libxml-dom-perl depends on:
ii  libwww-perl         5.803-4  WWW client/server library for Perl
ii  libxml-parser-perl  2.34-4   Perl module for parsing XML files
ii  libxml-perl         0.08-1   Perl modules for working with XML
ii  libxml-regexp-perl  0.03-7   Perl module for regular expression
ii  perl                5.8.7-3  Larry Wall's Practical Extraction

libxml-dom-perl recommends no packages.

-- no debconf information
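
For reference, the byte values in the xmllint error above can be compared against the two encodings of ã without involving XML::DOM at all. The following is a minimal sketch using only the core Encode module. The helper name hex_bytes is made up for illustration and is not part of XML::DOM or any other package mentioned in this report.

```perl
use strict;
use warnings;
use Encode qw(encode);

# Hypothetical helper (illustration only): render a byte string as
# space-separated hex values, in the same style as the xmllint message.
sub hex_bytes {
    my ($bytes) = @_;
    return join ' ', map { sprintf('0x%02X', ord($_)) } split //, $bytes;
}

# U+00E3 (a with tilde), the character used in in.xml.
my $char = "\x{E3}";

# Encode the same character both ways and show the resulting bytes.
print 'ISO-8859-1: ', hex_bytes(encode('ISO-8859-1', $char)), "\n";
print 'UTF-8:      ', hex_bytes(encode('UTF-8', $char)), "\n";
```

The ISO-8859-1 line shows the single byte 0xE3, which is the first byte xmllint complains about in out.xml. A correctly written UTF-8 file would instead contain the two-byte sequence shown on the UTF-8 line.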