Package: python-html5lib
Version: 0.90-2

import html5lib
html5lib.parse('<div><div><a/</div></div>\n', treebuilder='lxml')
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
  File "/usr/lib/pymodules/python2.7/html5lib/html5parser.py", line 38, in parse
    return p.parse(doc, encoding=encoding)
  File "/usr/lib/pymodules/python2.7/html5lib/html5parser.py", line 211, in parse
    parseMeta=parseMeta, useChardet=useChardet)
  File "/usr/lib/pymodules/python2.7/html5lib/html5parser.py", line 111, in _parse
    self.mainLoop()
  File "/usr/lib/pymodules/python2.7/html5lib/html5parser.py", line 176, in mainLoop
    self.phase.processSpaceCharacters(token)
  File "/usr/lib/pymodules/python2.7/html5lib/html5parser.py", line 952, in processSpaceCharacters
    self.tree.reconstructActiveFormattingElements()
  File "/usr/lib/pymodules/python2.7/html5lib/treebuilders/_base.py", line 181, in reconstructActiveFormattingElements
    clone = entry.cloneNode() #Mainly to get a new copy of the attributes
  File "/usr/lib/pymodules/python2.7/html5lib/treebuilders/etree.py", line 136, in cloneNode
    element.attributes[name] = value
  File "lxml.etree.pyx", line 2145, in lxml.etree._Attrib.__setitem__ (src/lxml/lxml.etree.c:46818)
  File "apihelpers.pxi", line 558, in lxml.etree._setAttributeValue (src/lxml/lxml.etree.c:15734)
  File "apihelpers.pxi", line 1554, in lxml.etree._attributeValidOrRaise (src/lxml/lxml.etree.c:24197)
ValueError: Invalid attribute name u'<'


Funnily enough, the problem goes away if I remove the trailing newline.
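
One possible way to avoid the exception (a minimal sketch only, not a fix
in html5lib itself) would be to check attribute names before they are
handed to lxml. The regex below is a simplified, ASCII-only approximation
of an XML name and is an assumption on my part, not the check lxml
actually performs. The helpers is_xml_safe_name and split_attributes are
hypothetical and exist in neither html5lib nor lxml, and the attribute
dict in the demo is made up for illustration:

```python
import re

# Simplified, ASCII-only approximation of an XML attribute name.
# Assumption for illustration: lxml's real validation is more permissive
# (it also accepts many non-ASCII name characters).
_XML_NAME_RE = re.compile(r"\A[A-Za-z_:][A-Za-z0-9_.:\-]*\Z")


def is_xml_safe_name(name):
    """Return True if `name` looks like a valid (ASCII) XML attribute name."""
    return bool(_XML_NAME_RE.match(name))


def split_attributes(attributes):
    """Split an attribute dict into (safe, rejected) dicts by name validity."""
    safe, rejected = {}, {}
    for name, value in attributes.items():
        target = safe if is_xml_safe_name(name) else rejected
        target[name] = value
    return safe, rejected


if __name__ == "__main__":
    # Hypothetical attributes, including the name from the traceback above.
    safe, rejected = split_attributes({u"<": u"", u"href": u"x"})
    print(sorted(safe))
    print(sorted(rejected))
```

Only the names in the first dict would then be set on the lxml element;
whether the rejected ones should be dropped or renamed is a separate
design decision.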

-- System Information:
Debian Release: wheezy/sid
  APT prefers unstable
  APT policy: (990, 'unstable'), (500, 'experimental')
Architecture: i386 (x86_64)

Kernel: Linux 3.0.0-1-amd64 (SMP w/2 CPU cores)
Locale: LANG=C, LC_CTYPE=pl_PL.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/dash

Versions of packages python-html5lib depends on:
ii  python                        2.7.2-5    interactive high-level object-orie
ii  python-support                1.0.14     automated rebuilding support for P

Versions of packages python-html5lib suggests:
pn  python-beautifulsoup          <none>     (no description available)
ii  python-chardet                2.0.1-2    universal character encoding detec
pn  python-genshi                 <none>     (no description available)
ii  python-lxml                   2.3-0.1+b2 pythonic binding for the libxml2 a

--
Jakub Wilk


