Package: python-html5lib
Version: 0.90-2
import html5lib
html5lib.parse('<div><div><a/</div></div>\n', treebuilder='lxml')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/pymodules/python2.7/html5lib/html5parser.py", line 38, in parse
    return p.parse(doc, encoding=encoding)
  File "/usr/lib/pymodules/python2.7/html5lib/html5parser.py", line 211, in parse
    parseMeta=parseMeta, useChardet=useChardet)
  File "/usr/lib/pymodules/python2.7/html5lib/html5parser.py", line 111, in _parse
    self.mainLoop()
  File "/usr/lib/pymodules/python2.7/html5lib/html5parser.py", line 176, in mainLoop
    self.phase.processSpaceCharacters(token)
  File "/usr/lib/pymodules/python2.7/html5lib/html5parser.py", line 952, in processSpaceCharacters
    self.tree.reconstructActiveFormattingElements()
  File "/usr/lib/pymodules/python2.7/html5lib/treebuilders/_base.py", line 181, in reconstructActiveFormattingElements
    clone = entry.cloneNode() #Mainly to get a new copy of the attributes
  File "/usr/lib/pymodules/python2.7/html5lib/treebuilders/etree.py", line 136, in cloneNode
    element.attributes[name] = value
  File "lxml.etree.pyx", line 2145, in lxml.etree._Attrib.__setitem__ (src/lxml/lxml.etree.c:46818)
  File "apihelpers.pxi", line 558, in lxml.etree._setAttributeValue (src/lxml/lxml.etree.c:15734)
  File "apihelpers.pxi", line 1554, in lxml.etree._attributeValidOrRaise (src/lxml/lxml.etree.c:24197)
ValueError: Invalid attribute name u'<'
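To see how sensitive the failure is to the exact input, the call above can be wrapped in a small helper that parses the same markup with and without its final newline. This is only a sketch: `try_parse`, `compare_trailing_newline` and `reproduce` are names made up for illustration, and the only html5lib call used is the `html5lib.parse(..., treebuilder='lxml')` call from this report. The outcome will depend on the installed html5lib and lxml versions.

```python
def try_parse(parse, markup):
    """Return None if parse(markup) succeeds, else the ValueError it raised."""
    try:
        parse(markup)
    except ValueError as exc:
        return exc
    return None


def compare_trailing_newline(parse, markup):
    """Parse markup as given and with trailing newlines stripped."""
    return {
        "with newline": try_parse(parse, markup),
        "without newline": try_parse(parse, markup.rstrip("\n")),
    }


def reproduce():
    """Run the comparison against html5lib with the lxml tree builder."""
    import html5lib  # imported here so the helpers above work without it

    def lxml_parse(markup):
        return html5lib.parse(markup, treebuilder="lxml")

    results = compare_trailing_newline(lxml_parse, "<div><div><a/</div></div>\n")
    for label in sorted(results):
        error = results[label]
        status = "ok" if error is None else "ValueError: %s" % error
        print("%s: %s" % (label, status))
```

Calling `reproduce()` on a system with python-html5lib and python-lxml installed prints one line per variant, saying whether the parse succeeded or raised `ValueError`.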
Funnily enough, the problem goes away if I remove the trailing newline.

-- System Information:
Debian Release: wheezy/sid
  APT prefers unstable
  APT policy: (990, 'unstable'), (500, 'experimental')
Architecture: i386 (x86_64)

Kernel: Linux 3.0.0-1-amd64 (SMP w/2 CPU cores)
Locale: LANG=C, LC_CTYPE=pl_PL.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages python-html5lib depends on:
ii  python          2.7.2-5  interactive high-level object-orie
ii  python-support  1.0.14   automated rebuilding support for P

Versions of packages python-html5lib suggests:
pn  python-beautifulsoup  <none>      (no description available)
ii  python-chardet        2.0.1-2     universal character encoding detec
pn  python-genshi         <none>      (no description available)
ii  python-lxml           2.3-0.1+b2  pythonic binding for the libxml2 a

-- 
Jakub Wilk