#36119: Attaching email file to email fails if the attachment is using 8bit
Content-Transfer-Encoding
-----------------------------+------------------------------------
Reporter: Trenton H | Owner: (none)
Type: Bug | Status: new
Component: Core (Mail) | Version: 5.1
Severity: Normal | Resolution:
Keywords: compat32 | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-----------------------------+------------------------------------
Changes (by Mike Edmunds):
* keywords: => compat32
* stage: Unreviewed => Accepted
Comment:
This came up [https://forum.djangoproject.com/t/using-emailmessage-with-an-attached-email-file-crashes-due-to-non-ascii/37981 in the forum].
Here's a minimal case to reproduce:
{{{#!python
from django.core.mail import EmailMessage
# Content of message to attach, using 8bit CTE with raw utf-8:
att = """\
Subject: attachment
Content-Type: text/plain; charset=utf-8
Content-Transfer-Encoding: 8bit

¡8-bit content!
""".encode()
email = EmailMessage(to=["to@example.com"])  # placeholder recipient address
email.attach("attachment.eml", att, "message/rfc822")
email.message().as_bytes()
# ...python3.12/email/generator.py", line 409, in write
# self._fp.write(s.encode('ascii', 'surrogateescape'))
# ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
# UnicodeEncodeError: 'ascii' codec can't encode character '\xa1' in
# position 0: ordinal not in range(128)
}}}
The problem is that
[https://github.com/django/django/blob/5244ecbd2259365ecd6bbf96747285a673b2ee69/django/core/mail/message.py#L399-L402
EmailMessage._create_mime_attachment()] uses
`message_from_string(force_str(content))` to convert the attachment
content to a Python Message object. But message_from_string() doesn't
properly handle non-ASCII characters: it keeps them in the parsed payload,
and as_bytes() later fails when it tries to encode that payload as ASCII.
(See
[https://github.com/python/cpython/issues/83565#issuecomment-1093851962
python/cpython#83565]. This dates back to Python 2, when strings couldn't
include Unicode.)
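To illustrate the mechanism, here is a small standard-library sketch of the encode call shown in the traceback. It assumes (as far as I can tell from the email package's behavior) that the bytes parser decodes its input as ASCII with `surrogateescape`, so non-ASCII bytes arrive as lone surrogates that the generator can turn back into the original bytes. The variable names are only for illustration.

```python
# The character as message_from_string() would leave it in the payload.
text = "\u00a1"

# Same call as in generator.py: a real non-ASCII character is not a
# surrogate escape, so encoding to ASCII fails.
try:
    text.encode("ascii", "surrogateescape")
    str_payload_encodes = True
except UnicodeEncodeError:
    str_payload_encodes = False

# Assumed bytes-parser path: raw UTF-8 bytes decoded with surrogateescape
# become lone surrogates, which encode back to the original bytes.
raw = text.encode("utf-8")
escaped = raw.decode("ascii", "surrogateescape")
round_tripped = escaped.encode("ascii", "surrogateescape")

print(str_payload_encodes, repr(escaped), round_tripped)
```

The str form fails to encode, while the surrogate-escaped form round-trips unchanged.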
A working alternative is message_from_bytes():
{{{#!python
from email import message_from_string, message_from_bytes
message_from_string(att.decode()).as_bytes()
# UnicodeEncodeError: 'ascii' codec can't encode character '\xa1' in
# position 0: ordinal not in range(128)
message_from_bytes(att).as_bytes()
# b'Subject: attachment\nContent-Type: text/plain; charset=utf-8\nContent-Transfer-Encoding: 8bit\n\n\xc2\xa18-bit content!\n'
}}}
So the simplest fix is probably to change Django to use
`message_from_bytes(force_bytes(content))`.
(This would also be fixed by upgrading Django to use Python's modern email
APIs, #35581.)
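A rough sketch of that suggested change, written outside Django so it runs with the standard library alone. `parse_attached_message` is a hypothetical helper name, not Django code, and the `str.encode("utf-8")` step only approximates what `force_bytes()` would do for str input; the actual patch may look different.

```python
from email import message_from_bytes


def parse_attached_message(content):
    """Hypothetical helper: parse message/rfc822 attachment content.

    Accepts str or bytes and always goes through message_from_bytes(),
    so 8bit payloads survive serialization.
    """
    if isinstance(content, str):
        # Stand-in for Django's force_bytes(); assumes UTF-8 text.
        content = content.encode("utf-8")
    return message_from_bytes(content)


att = (
    "Subject: attachment\n"
    "Content-Type: text/plain; charset=utf-8\n"
    "Content-Transfer-Encoding: 8bit\n"
    "\n"
    "\u00a18-bit content!\n"
).encode("utf-8")

# Both input types serialize without raising UnicodeEncodeError.
from_bytes_input = parse_attached_message(att).as_bytes()
from_str_input = parse_attached_message(att.decode("utf-8")).as_bytes()
```

With this approach the str and bytes inputs take the same parsing path, which is why both serialize to the same bytes.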
--
Ticket URL: <https://code.djangoproject.com/ticket/36119#comment:4>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.