[issue26009] HTMLParser lacking a few features to reconstruct input exactly

2016-01-04 Thread Jason Sachs
Jason Sachs added the comment: sample file test1.html attached. When running test2.py on it, the output is identical except for two things: test1.html contains test1b.html contains test1.html contains end tags that are capitalized e.g. or have spaces test1b.html contains end tags that

[issue26009] HTMLParser lacking a few features to reconstruct input exactly

2016-01-04 Thread Jason Sachs
New submission from Jason Sachs: The HTMLParser class (https://docs.python.org/2/library/htmlparser.html) is lacking a few features to reconstruct input exactly. For the most part it can do this, but I found two items where it falls short (there may be others): - There is a get_starttag_text
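To make the limitation concrete, here is a minimal sketch of a parser that tries to echo its input back. It assumes Python 3's html.parser rather than the Python 2 HTMLParser module linked above, and the names EchoParser and roundtrip are hypothetical (this is not the VerbatimParser from the attached test2.py). It relies on get_starttag_text() for start tags and shows that handle_endtag() only receives the lowercased tag name, so the original end-tag text cannot be recovered.

```python
from html.parser import HTMLParser


class EchoParser(HTMLParser):
    """Hypothetical helper: rebuilds the input from parser callbacks."""

    def __init__(self):
        # convert_charrefs=False keeps entity and character references
        # as separate events so they can be re-emitted as references.
        super().__init__(convert_charrefs=False)
        self.parts = []

    def handle_starttag(self, tag, attrs):
        # get_starttag_text() returns the start tag exactly as written.
        self.parts.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> are also available verbatim.
        self.parts.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        # Only the lowercased tag name is passed in, so the original
        # case and any whitespace inside the end tag are lost here.
        self.parts.append("</%s>" % tag)

    def handle_data(self, data):
        self.parts.append(data)

    def handle_entityref(self, name):
        self.parts.append("&%s;" % name)

    def handle_charref(self, name):
        self.parts.append("&#%s;" % name)

    def handle_comment(self, data):
        self.parts.append("<!--%s-->" % data)

    def handle_decl(self, decl):
        self.parts.append("<!%s>" % decl)


def roundtrip(source):
    """Feed source through EchoParser and return the rebuilt text."""
    parser = EchoParser()
    parser.feed(source)
    parser.close()
    return "".join(parser.parts)
```

With this sketch, input whose end tags are already lowercase and free of extra spaces comes back unchanged, while an input such as `<B>bold</B>` comes back with the end tag lowercased, because there is no end-tag counterpart to get_starttag_text().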

[issue26009] HTMLParser lacking a few features to reconstruct input exactly

2016-01-04 Thread Jason Sachs
Jason Sachs added the comment: sample file attached containing VerbatimParser -- Added file: http://bugs.python.org/file41496/test2.py