Jason Sachs added the comment:
Sample file test1.html attached.
When running test2.py on it, the output is identical except for two things:
test1.html contains
test1b.html contains
test1.html contains end tags that are capitalized e.g. or have spaces
test1b.html contains end tags that
New submission from Jason Sachs:
The HTMLParser class (https://docs.python.org/2/library/htmlparser.html) is
lacking a few features to reconstruct input exactly. For the most part it can
do this, but I found two items where it falls short (there may be others):
- There is a get_starttag_text
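
For illustration, here is a minimal sketch of the asymmetry described above. It is not the code from the attached files; the class name TagRecorder and the sample input string are hypothetical. It only uses the documented HTMLParser methods get_starttag_text(), handle_starttag() and handle_endtag(), and assumes the end-tag handler receives only the lowercased tag name.

```python
try:
    from html.parser import HTMLParser  # Python 3
except ImportError:
    from HTMLParser import HTMLParser   # Python 2


class TagRecorder(HTMLParser):
    """Hypothetical helper: records what the parser exposes for each tag."""

    def __init__(self):
        HTMLParser.__init__(self)
        self.start_texts = []  # raw start-tag text, exactly as written
        self.end_tags = []     # end-tag names as passed to handle_endtag

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the source text of the start tag
        # that was most recently opened.
        self.start_texts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        # Only the tag name is available here; the original case and
        # whitespace of the end tag are not passed to the handler.
        self.end_tags.append(tag)


recorder = TagRecorder()
recorder.feed('<DIV Class="a">hello</DIV >')
recorder.close()

print(recorder.start_texts)
print(recorder.end_tags)
```

The start tag can be reproduced verbatim from start_texts, while end_tags holds only the normalized name, so a capitalized end tag or one containing spaces cannot be reconstructed from it.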
Jason Sachs added the comment:
Sample file attached, containing VerbatimParser.
--
Added file: http://bugs.python.org/file41496/test2.py
___
Python tracker
<http://bugs.python.org/issue26