Ashwin Ramaswami <[email protected]> added the comment:
Oh, both the Travis links I sent actually ended up reproducing the bug.
I've made a PR that fixes it, with an even smaller test case:
get_unstructured('=?utf-8?q?somevalue?=aa')
It looks like this happens because "aa" is mistaken for an encoded-word
escape in
https://github.com/python/cpython/blob/fd5a82a7685d1599aab12e722a383cb0a2adfd8a/Lib/email/_header_value_parser.py#L1042
-- thus, get_encoded_word fails, which ends up sending get_unstructured into
an infinite loop.
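For anyone who wants to try the test case without risking a hung session, here is a rough sketch that runs it in a child interpreter with a timeout. The helper name try_parse and the 5-second limit are my own choices for illustration, not part of the PR; get_unstructured is a private function, so its behaviour may differ between Python versions.

```python
import subprocess
import sys

# The test case from this comment, run in a separate interpreter so that a
# hang in the parser cannot block the calling process.
SNIPPET = (
    "from email._header_value_parser import get_unstructured\n"
    "print(repr(str(get_unstructured('=?utf-8?q?somevalue?=aa'))))\n"
)


def try_parse(timeout_seconds=5.0):
    """Return ('completed', output) or ('timed out', None).

    Hypothetical helper for illustration; the timeout is arbitrary.
    """
    try:
        proc = subprocess.run(
            [sys.executable, "-c", SNIPPET],
            capture_output=True,
            text=True,
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # The child was killed after the timeout: the parser never returned.
        return "timed out", None
    # Report stdout on success, or the traceback if the child raised.
    return "completed", proc.stdout.strip() or proc.stderr.strip()


status, output = try_parse()
print(status, output)
```

On an affected interpreter I would expect "timed out"; on one with the fix it should report "completed" along with whatever the parser returned.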
My PR makes the parser leave "=?utf-8?q?somevalue?=aa" undecoded, as the literal
text "=?utf-8?q?somevalue?=aa". However, the existing test cases make sure it parses
"=?utf-8?q?somevalue?=nowhitespace" as "somevaluenowhitespace". I'm not too
familiar with RFC 2047, but why are "aa" and "nowhitespace" treated
differently? Should they be?
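To make the question concrete, here is a small sketch comparing the two inputs. It rests on two assumptions of mine: that the check at the linked line asks whether the two characters after "?=" are hex digits, and that a simplified regex is good enough to find the encoded word. The names ENCODED_WORD, split_encoded_word and looks_like_escape are made up for this illustration and are not in the email package.

```python
import re
from string import hexdigits

# Simplified encoded-word pattern: =?charset?encoding?encoded-text?=
# (Assumption: ignores the RFC 2047 length limit and exact token characters.)
ENCODED_WORD = re.compile(r"=\?([^?\s]+)\?([bBqQ])\?([^?\s]*)\?=")


def split_encoded_word(value):
    """Return (encoded_word, trailing_text, delimited) or None if no match.

    'delimited' is True when the encoded word is followed by whitespace or
    the end of the string, as RFC 2047 requires in unstructured text.
    """
    match = ENCODED_WORD.match(value)
    if match is None:
        return None
    trailing = value[match.end():]
    delimited = trailing == "" or trailing[0] in " \t\r\n"
    return match.group(0), trailing, delimited


def looks_like_escape(trailing):
    """Assumed form of the parser's check: two hex digits after '?='."""
    return (
        len(trailing) > 1
        and trailing[0] in hexdigits
        and trailing[1] in hexdigits
    )


for sample in (
    "=?utf-8?q?somevalue?=aa",
    "=?utf-8?q?somevalue?=nowhitespace",
    "=?utf-8?q?somevalue?= aa",
):
    word, trailing, delimited = split_encoded_word(sample)
    print(repr(trailing), "delimited:", delimited,
          "looks like escape:", looks_like_escape(trailing))
```

If those assumptions hold, neither "aa" nor "nowhitespace" is separated from the encoded word by whitespace, so as far as the RFC 2047 delimiting rule goes they are the same case. The only difference the sketch finds is that "aa" happens to consist of hex digits and "nowhitespace" does not start with any.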
----------
_______________________________________
Python tracker <[email protected]>
<https://bugs.python.org/issue37764>
_______________________________________