#33969: Improve django.core.mail.messages EAI processing
------------------------------+--------------------------------------
Reporter: j-bernard | Owner: nobody
Type: New feature | Status: closed
Component: Core (Mail) | Version: dev
Severity: Normal | Resolution: needsinfo
Keywords: EAI IDNA RFC | Triage Stage: Unreviewed
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
------------------------------+--------------------------------------
Comment (by Mike Edmunds):
I looked into this earlier today as part of ticket #35581, and was
surprised to find that IDNA 2003 is probably still the correct choice for
sending email. Documenting my findings here.
The problem is for domains containing one of the
[https://www.unicode.org/reports/tr46/#Deviations "deviation characters"]
where the two IDNA versions differ. For instance:
IDNA 2003: otto@faß.example → otto@fass.example
IDNA 2008: otto@faß.example → otto@xn--fa-hia.example
If those two domains are owned by different people, and Django uses a
different version of IDNA than Otto expects, Otto's email could go to the
wrong person. Big problem.
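To make the difference concrete, here is a minimal sketch using only the Python standard library. The built-in `idna` codec implements IDNA 2003. The second helper (`encode_raw_punycode`, a hypothetical name) is only a stand-in for IDNA 2008: it Punycode-encodes each non-ASCII label without the validation and normalization a real IDNA 2008 implementation performs.

```python
def encode_idna2003(domain: str) -> str:
    # Python's built-in "idna" codec follows IDNA 2003 (RFC 3490),
    # whose nameprep step maps "ß" to "ss" before encoding.
    return domain.encode("idna").decode("ascii")


def encode_raw_punycode(domain: str) -> str:
    # Assumption: approximates the IDNA 2008 result for this example only.
    # It skips IDNA 2008 validation, case mapping, and normalization.
    labels = []
    for label in domain.split("."):
        if label.isascii():
            labels.append(label)
        else:
            labels.append("xn--" + label.encode("punycode").decode("ascii"))
    return ".".join(labels)


print(encode_idna2003("faß.example"))      # fass.example
print(encode_raw_punycode("faß.example"))  # xn--fa-hia.example
```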
So the question is, what version of IDNA does Otto expect? Browsers have
all updated to IDNA 2008: if you enter http://faß.example, you will end up
at http://xn--fa-hia.example, not http://fass.example. (You can try this
with the .de equivalents to those domains, which are currently parked at
different registrars.)
I had assumed email should match the browsers, and be using IDNA 2008 by
now. (And I was thinking that Django's ''not'' using it for email
addresses was a serious security issue.) I was wrong.
In testing earlier today, I found both Gmail and Outlook.com are still
using IDNA 2003 for domains in address headers: both treat
otto@faß.example as otto@fass.example. (They might be using IDNA 2008,
but with UTS #46 "transitional processing" enabled, which retains the
IDNA 2003 encoding for the deviation characters.)
Bottom line: we wouldn't want to switch Django's sanitize_address() to use
IDNA 2008 encoding (at least not without transitional processing), because
''that'' would actually introduce a security issue, by sending Otto's
email to an unexpected domain.
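As a rough illustration of which addresses are affected, the sketch below flags domains containing any of the four UTS #46 deviation characters (ß, final sigma ς, ZERO WIDTH NON-JOINER, ZERO WIDTH JOINER). The helper name `has_deviation_characters` is hypothetical and is not part of Django. It checks only for the literal characters and does not apply case mapping or normalization first.

```python
# The four UTS #46 deviation characters, where IDNA 2003 and
# IDNA 2008 produce different ASCII forms.
DEVIATION_CHARACTERS = frozenset({
    "\u00df",  # ß  LATIN SMALL LETTER SHARP S
    "\u03c2",  # ς  GREEK SMALL LETTER FINAL SIGMA
    "\u200c",  # ZERO WIDTH NON-JOINER
    "\u200d",  # ZERO WIDTH JOINER
})


def has_deviation_characters(address: str) -> bool:
    # Only the domain matters here; take everything after the last "@".
    domain = address.rsplit("@", 1)[-1]
    return any(ch in DEVIATION_CHARACTERS for ch in domain)


print(has_deviation_characters("otto@faß.example"))   # True
print(has_deviation_characters("otto@fass.example"))  # False
```

A check like this could be used to log or review addresses whose delivery would change if the IDNA version changed.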
Also, if I'm understanding correctly, part of the request here is to be
able to get Django's EmailMessage.message().as_string() to generate a
message that hasn't had ''any'' encoding applied to the addresses, for use
with SMTPUTF8. (That is, `To: jörg@faß.example` should stay just like
that, not turn into `To: [email protected]`.) I'm hoping
to address that as part of #35581, if
[https://docs.python.org/3/library/email.policy.html#email.policy.SMTPUTF8
`email.policy.SMTPUTF8`] is used for EmailMessage.message().
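For reference, this is roughly what the SMTPUTF8 policy does with Python's standard-library `email.message.EmailMessage` (not Django's `EmailMessage`, whose behavior under #35581 is still to be decided). The helper name `render_message` is hypothetical; the sketch only shows that address headers are left as unencoded UTF-8.

```python
from email import policy
from email.message import EmailMessage


def render_message(to_addr: str, msg_policy) -> str:
    # Build a tiny message with Python's own EmailMessage and serialize it
    # using the given policy.
    msg = EmailMessage(policy=msg_policy)
    msg["To"] = to_addr
    msg["Subject"] = "test"
    msg.set_content("hello")
    return msg.as_string()


# With policy.SMTPUTF8 (utf8=True), headers are not RFC 2047 encoded and
# the domain is not converted to an ASCII form.
output = render_message("jörg@faß.example", policy.SMTPUTF8)
print(output)
```

The resulting message is only deliverable over a connection where the server advertises the SMTPUTF8 extension.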
Note that Django's SMTP EmailBackend doesn't currently support SMTPUTF8.
That's probably best handled as a separate new feature request. (Or could
also be implemented by a third-party custom EmailBackend.)
--
Ticket URL: <https://code.djangoproject.com/ticket/33969#comment:4>