[issue1079] decode_header does not follow RFC 2047

2012-06-03 Thread Roundup Robot
Roundup Robot added the comment: New changeset 0808cb8c60fd by R David Murray in branch 'default': #2658: Add test for issue fixed by fix for #1079. http://hg.python.org/cpython/rev/0808cb8c60fd

[issue1079] decode_header does not follow RFC 2047

2012-06-03 Thread R. David Murray
R. David Murray added the comment: OK, I'm closing this, then, and will close the related issues as well. Thanks again for the patch, Ralf. -- resolution: -> fixed stage: patch review -> committed/rejected status: open -> closed versions: -Python 2.7, Python 3.2

[issue1079] decode_header does not follow RFC 2047

2012-06-03 Thread Barry A. Warsaw
Barry A. Warsaw added the comment: On Jun 02, 2012, at 09:59 PM, R. David Murray wrote: >I've applied this to 3.3. Because the preservation of spaces around the >ascii parts is a visible behavior change that could cause working programs to >break, I don't think I can backport it. I'm going to

[issue1079] decode_header does not follow RFC 2047

2012-06-02 Thread R. David Murray
R. David Murray added the comment: I've applied this to 3.3. Because the preservation of spaces around the ascii parts is a visible behavior change that could cause working programs to break, I don't think I can backport it. I'm going to leave this open until I can consult with Barry to see

[issue1079] decode_header does not follow RFC 2047

2012-06-02 Thread Roundup Robot
Roundup Robot added the comment: New changeset 8c03fe231877 by R David Murray in branch 'default': #1079: Fix parsing of encoded words. http://hg.python.org/cpython/rev/8c03fe231877 -- nosy: +python-dev

[issue1079] decode_header does not follow RFC 2047

2012-05-29 Thread Ralf Schlatterbeck
Ralf Schlatterbeck added the comment: On Mon, May 28, 2012 at 08:15:05PM +, R. David Murray wrote: > > R. David Murray added the comment: > > Ralf, thanks very much for this patch. I'm considering applying it. > Given that the current code breaks on parsing various legitimate > construct

[issue1079] decode_header does not follow RFC 2047

2012-05-28 Thread R. David Murray
R. David Murray added the comment: Ralf, thanks very much for this patch. I'm considering applying it. Given that the current code breaks on parsing various legitimate constructs, it seems like the behavior change (preserving whitespace in the non-EW parts...which IMO is correct) should be

[issue1079] decode_header does not follow RFC 2047

2012-04-20 Thread Patrick Hahn
Changes by Patrick Hahn: -- nosy: +phahn

[issue1079] decode_header does not follow RFC 2047

2012-01-03 Thread Ralf Schlatterbeck
Ralf Schlatterbeck added the comment: Attached please find a patch that - keeps all spaces between non-encoded and encoded parts - doesn't create spaces between non-encoded and encoded parts in case these are already there or not needed (because they are non-ctext characters of RFC822 like '

[issue1079] decode_header does not follow RFC 2047

2012-01-03 Thread R. David Murray
R. David Murray added the comment: Gah, that's what I get for not reading carefully (or looking at the patch first). Your test change is fine, of course.

[issue1079] decode_header does not follow RFC 2047

2012-01-03 Thread R. David Murray
R. David Murray added the comment: Well, a caution that tweaking the regex can have unexpected consequences as past issues have proven (but by all means go for it), and a note that the parsing strategy is going to change completely in email6 (see http://pypi.python.org/email and http://hg.pyt

[issue1079] decode_header does not follow RFC 2047

2012-01-03 Thread Ralf Schlatterbeck
Ralf Schlatterbeck added the comment: Enclosed please find a fixed patch: decode_header consolidates multiple encoded strings with the same encoding into a single entry in the returned parts.
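A minimal sketch of the consolidation described in this comment, assuming Python 3.3 or later (the version the fix was applied to). The header value is made up for illustration and is not taken from the patch; only `email.header.decode_header` is a real API here.

```python
from email.header import decode_header

# Two adjacent encoded words with the same charset, separated by whitespace.
# This sample header is an assumption chosen for illustration.
header = "=?ISO-8859-1?B?9g==?= =?ISO-8859-1?B?5Q==?="

# Each returned item is a (value, charset) pair.
parts = decode_header(header)
print(parts)
```

On a fixed interpreter the two encoded words should come back as one (bytes, charset) pair rather than two, since the whitespace between adjacent encoded words is dropped and runs with the same charset are merged.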

[issue1079] decode_header does not follow RFC 2047

2012-01-03 Thread Ralf Schlatterbeck
Ralf Schlatterbeck added the comment: Fine, I see what you mean, this involves very careful reading of the RFC and could have been a little more verbose ... Right. Should have been a ')' > Adding the RFC tests would be great (patches gladly accepted). Fixes > for ones we fail would be great,

[issue1079] decode_header does not follow RFC 2047

2012-01-02 Thread R. David Murray
R. David Murray added the comment: The RFC isn't at all vague about encoded words not separated by white space. That isn't allowed by the BNF. As you say, though, they occur in the wild and should be parsed correctly. In your other point I think you mean "immediately followed by a )", right

[issue1079] decode_header does not follow RFC 2047

2012-01-02 Thread Ralf Schlatterbeck
Ralf Schlatterbeck added the comment: Maybe it would be a good start to include the examples at the end of RFC2047 into the regression tests? These examples at least support the case that a '?' may immediately follow an encoded string: encoded form | displayed as

[issue1079] decode_header does not follow RFC 2047

2011-03-13 Thread R. David Murray
Changes by R. David Murray: -- versions: +Python 3.3

[issue1079] decode_header does not follow RFC 2047

2010-11-30 Thread R. David Murray
Changes by R. David Murray: -- assignee: -> r.david.murray

[issue1079] decode_header does not follow RFC 2047

2010-09-26 Thread Tokio Kikuchi
Tokio Kikuchi added the comment: Hi, all. I am against applying these patches because they will insert space separations in the re-composed header (with the str() function). Sm=?ISO-8859-1?B?9g==?=rg=?ISO-8859-1?B?5Q==?=sbord -> [('Sm', None), ('\xf6', 'iso-8859-1'), ('rg', None), ('\xe5', 'iso-8859-
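The concern in this comment can be sketched as follows, assuming Python 3.3 or later. The header string is the one quoted in the comment; `join_parts` is a hypothetical helper written for this illustration, not part of the email package. It rejoins the decoded parts by hand with no separator, which sidesteps the question of what `str()` on a recomposed header inserts.

```python
from email.header import decode_header

# Header quoted in the comment: encoded words glued to plain text.
header = "Sm=?ISO-8859-1?B?9g==?=rg=?ISO-8859-1?B?5Q==?=sbord"


def join_parts(parts):
    """Concatenate decoded parts with no separator (hypothetical helper)."""
    out = []
    for value, charset in parts:
        # decode_header may return bytes or str values depending on input.
        if isinstance(value, bytes):
            value = value.decode(charset or "ascii")
        out.append(value)
    return "".join(out)


text = join_parts(decode_header(header))
print(text)
```

Joining manually is a design choice for the sketch: it shows that the decoded pieces carry enough information to rebuild the word without extra spaces, whatever the header recomposition code does.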

[issue1079] decode_header does not follow RFC 2047

2010-09-18 Thread Mark Lawrence
Changes by Mark Lawrence: -- stage: needs patch -> patch review versions: +Python 3.2 -Python 2.6, Python 3.0

[issue1079] decode_header does not follow RFC 2047

2010-04-10 Thread Oliver Martin
Oliver Martin added the comment: I got bitten by this too. In addition to not decoding encoded words without whitespace after them, it throws an exception if there is a valid encoded word later in the string and the first encoded word is followed by something that isn't a hex number: >>> dec

[issue1079] decode_header does not follow RFC 2047

2009-04-08 Thread Atsuo Ishimoto
Atsuo Ishimoto added the comment: +1 for Tony's patch. This patch reverts the fix for Issue1582282 filed by tkikuchi. I cannot understand the rationale for the solution proposed in Issue1582282. How does the fix make it easier to read mails from Entourage? -- nosy: +ishimoto, tkikuchi

[issue1079] decode_header does not follow RFC 2047

2009-04-04 Thread Tony Nelson
Tony Nelson added the comment: The email package does not follow the RFCs in anything to do with header parsing or decoding. This is a known deficiency. So no, I am not thinking of atoms at all -- and neither is email.header.decode_header()! :-( Until email.header actually parses headers into

[issue1079] decode_header does not follow RFC 2047

2009-04-04 Thread R. David Murray
R. David Murray added the comment: Tony, I don't think I agree with your reading of the RFC. IMO, your inversion of test_rfc2047_without_whitespace is not correct. '=' is not a 'special' in RFC[2]822 terms, so the atom does not end at the apparent end of the encoded word. I say apparent becau

[issue1079] decode_header does not follow RFC 2047

2009-04-03 Thread Tony Nelson
Tony Nelson added the comment: I think the problem is best viewed as headers are not being parsed according to RFC2822 and decoded after that, so the recognition of encoded words should be looser, and not require whitespace around them, as it is not required in all contexts. Patch and test, tes

[issue1079] decode_header does not follow RFC 2047

2009-02-04 Thread Gabriel Genellina
Changes by Gabriel Genellina: -- nosy: +gagenellina

[issue1079] decode_header does not follow RFC 2047

2009-02-03 Thread Tom Lynn
Tom Lynn added the comment: The only difference between the two regexps is that the email/header.py version looks for:: (?=[ \t]|$) # whitespace or the end of the string at the end (with re.MULTILINE, so $ also matches '\n'). To expand on "There is nothing about that thing in RFC
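The effect of that trailing lookahead can be sketched with a simplified pattern. This is an illustration under stated assumptions: `ENCODED_WORD`, `strict`, and `loose` are names invented here, and the pattern is a reduced stand-in rather than the exact source of either regexp being compared.

```python
import re

# Simplified encoded-word pattern (an assumption for illustration only).
ENCODED_WORD = (
    r"=\?(?P<charset>[^?]*?)\?(?P<encoding>[qQbB])\?(?P<encoded>.*?)\?="
)

# With the lookahead: the encoded word must be followed by whitespace
# or the end of a line.
strict = re.compile(ENCODED_WORD + r"(?=[ \t]|$)", re.MULTILINE)
# Without the lookahead: anything may follow the closing "?=".
loose = re.compile(ENCODED_WORD)

glued = "Sm=?ISO-8859-1?B?9g==?=rg"     # no whitespace around the word
spaced = "Sm =?ISO-8859-1?B?9g==?= rg"  # whitespace on both sides

strict_glued = strict.search(glued)     # lookahead rejects this input
loose_glued = loose.search(glued)       # matches the encoded word
strict_spaced = strict.search(spaced)   # whitespace satisfies the lookahead

print(strict_glued, loose_glued.group("encoded"), bool(strict_spaced))
```

The sketch only demonstrates the lookahead; it says nothing about which behavior RFC 2047 requires, which is the point under discussion in the thread.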

[issue1079] decode_header does not follow RFC 2047

2007-09-16 Thread Sean Reifschneider
Sean Reifschneider added the comment: Can you provide an example of an address that triggers this? Preferably in a code sample that can be used to reproduce it? Uber-ideally, a patch to the email module test suite would be great. -- nosy: +jafo priority: -> normal

[issue1079] decode_header does not follow RFC 2047

2007-09-01 Thread Mickaël Guérin
New submission from Mickaël Guérin: email.header.decode_header expects a space or end of line after the end of an encoded word ("?="). There is nothing about that thing in RFC 2047. Python 2.5.1 ChangeLog seems to indicate that this bug has been solved. Unfortunately, the function still don't wor