#36119: Attaching email file to email fails if the attachment is using 8bit
Content-Transfer-Encoding
-----------------------------+------------------------------------
     Reporter:  Trenton H    |                    Owner:  (none)
         Type:  Bug          |                   Status:  new
    Component:  Core (Mail)  |                  Version:  5.1
     Severity:  Normal       |               Resolution:
     Keywords:  compat32     |             Triage Stage:  Accepted
    Has patch:  0            |      Needs documentation:  0
  Needs tests:  0            |  Patch needs improvement:  0
Easy pickings:  0            |                    UI/UX:  0
-----------------------------+------------------------------------
Changes (by Mike Edmunds):

 * keywords:   => compat32
 * stage:  Unreviewed => Accepted

Comment:

 This came up
 [https://forum.djangoproject.com/t/using-emailmessage-with-an-attached-email-file-crashes-due-to-non-ascii/37981 in the forum].
 Here's a minimal case to reproduce:

 {{{#!python
 from django.core.mail import EmailMessage

 # Content of message to attach, using 8bit CTE with raw utf-8:
 att = """\
 Subject: attachment
 Content-Type: text/plain; charset=utf-8
 Content-Transfer-Encoding: 8bit

 ¡8-bit content!
 """.encode()

 email = EmailMessage(to=["[email protected]"])
 email.attach("attachment.eml", att, "message/rfc822")

 email.message().as_bytes()
 # ...python3.12/email/generator.py", line 409, in write
 #    self._fp.write(s.encode('ascii', 'surrogateescape'))
 #                   ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
 # UnicodeEncodeError: 'ascii' codec can't encode character '\xa1' in
 # position 0: ordinal not in range(128)
 }}}

 The problem is that
 [https://github.com/django/django/blob/5244ecbd2259365ecd6bbf96747285a673b2ee69/django/core/mail/message.py#L399-L402 EmailMessage._create_mime_attachment()]
 uses
 `message_from_string(force_str(content))` to convert the attachment
 content to a Python Message object. But message_from_string() doesn't
 properly handle Unicode characters. (See
 [https://github.com/python/cpython/issues/83565#issuecomment-1093851962
 python/cpython#83565]. This dates back to Python 2 when strings couldn't
 include Unicode.)
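
The traceback above comes down to how the `surrogateescape` error handler treats the two inputs. A small stdlib-only sketch of that behavior (not Django code; the variable names are illustrative):

```python
# Raw UTF-8 bytes for the inverted exclamation mark, as found in an 8bit body.
raw = b"\xc2\xa1"

# A bytes-based parse decodes as ASCII with surrogateescape, so each non-ASCII
# byte becomes a lone surrogate that can be turned back into the same byte.
escaped = raw.decode("ascii", "surrogateescape")
round_tripped = escaped.encode("ascii", "surrogateescape")

# A str that already holds the real character has no surrogates to unescape,
# so encoding it as ASCII fails, as in the traceback.
try:
    raw.decode("utf-8").encode("ascii", "surrogateescape")
    str_path_failed = False
except UnicodeEncodeError:
    str_path_failed = True
```

Only the escaped form survives the ASCII encode step, which is why the bytes-based parser avoids the error.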

 A working alternative is message_from_bytes():

 {{{#!python
 from email import message_from_string, message_from_bytes

 message_from_string(att.decode()).as_bytes()
 # UnicodeEncodeError: 'ascii' codec can't encode character '\xa1' in
 # position 0: ordinal not in range(128)

 message_from_bytes(att).as_bytes()
 # b'Subject: attachment\nContent-Type: text/plain; charset=utf-8\nContent-Transfer-Encoding: 8bit\n\n\xc2\xa18-bit content!\n'
 }}}

 So the simplest fix is probably to change Django to use
 `message_from_bytes(force_bytes(content))`.
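
As a rough illustration of that idea outside Django, here is a stdlib-only sketch. `parse_attached_message` is a hypothetical helper, not an existing Django function, and the UTF-8 encoding step stands in for `force_bytes()` under the assumption that its default encoding is used:

```python
from email import message_from_bytes
from email.message import Message


def parse_attached_message(content) -> Message:
    """Hypothetical helper: parse str or bytes content via the bytes parser."""
    if isinstance(content, str):
        # Assumption: text input is encoded as UTF-8, mirroring force_bytes().
        content = content.encode("utf-8")
    return message_from_bytes(content)


att = (
    "Subject: attachment\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: 8bit\n"
    "\n"
    "\u00a18-bit content!\n"
)

# Both str and bytes input now serialize without UnicodeEncodeError.
from_str = parse_attached_message(att).as_bytes()
from_bytes = parse_attached_message(att.encode("utf-8")).as_bytes()
```

Routing str input through bytes first means both input types take the same parsing path.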

 (This would also be fixed by upgrading Django to use Python's modern email
 APIs, #35581.)
-- 
Ticket URL: <https://code.djangoproject.com/ticket/36119#comment:4>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
