On Tue, Dec 10, 2024 at 02:52:05PM +0100, Gioele Barabucci wrote:
> NFC has been mentioned in a broader discussion on PRECIS/RFC8264/RFC8265.
> 
> The IdentifierClass of RFC 8264 explicitly disallows all these "security
> land mines": https://www.rfc-editor.org/rfc/rfc8264.html#section-4.2.3
> 
> The "Security considerations" section is quite extensive (5 pages long):
> https://www.rfc-editor.org/rfc/rfc8264.html#section-12

Oh, good.  I was just getting worried when discussion on the list
seemed to be treating NFC as a silver bullet, and people were
suggesting that the canonicalization should be done both by readers
and writers of /etc/passwd --- which would imply linking libunicode
into setuid programs like sudo and login, with the (in my view)
inevitable results of hilarity ensuing.
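
To make the "silver bullet" worry concrete, here is a small sketch
using Python's standard unicodedata module.  The usernames are made up
for illustration.  It only shows what NFC does and does not unify; it
is not a proposal for how any passwd reader or writer should work.

```python
import unicodedata

def nfc(s):
    """Return s in Unicode Normalization Form C."""
    return unicodedata.normalize("NFC", s)

# Canonically equivalent spellings do collapse to one form under NFC.
composed = "jos\u00e9"       # 'e with acute' as a single code point
decomposed = "jose\u0301"    # 'e' followed by COMBINING ACUTE ACCENT
print(nfc(composed) == nfc(decomposed))

# Look-alike letters from different scripts do not.
latin = "paul"               # all Latin letters
mixed = "p\u0430ul"          # U+0430 CYRILLIC SMALL LETTER A replaces 'a'
print(nfc(latin) == nfc(mixed))
```

So NFC handles the composed/decomposed case, but two names that render
identically can still be distinct after normalization.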

As I look at RFC 8264, I note that it does not take a position about
which version of Unicode should be considered canonical, and in fact
talks about one of the features (tm) of RFC 8264 being that it is
agile with respect to newer versions of Unicode.

However, it should be noted that RFC 8264 also states that code points
which are not defined in whatever version of Unicode is supported by
"the application" shall be disallowed.  From Debian's perspective,
though, if we are going to apply this rule to "the application(s)"
that read and write /etc/passwd, we *will* need to take a position on
what version of Unicode should be supported, and therefore, what set
of characters will be disallowed.
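
As a rough illustration of why the version matters, the sketch below
checks a made-up username against two Unicode databases that ship with
Python: the frozen 3.2.0 tables (unicodedata.ucd_3_2_0) and whatever
version the interpreter was built with.  The helper name is
hypothetical, and checking for the "unassigned" category (Cn) is only
one small piece of what RFC 8264 requires.

```python
import unicodedata

def unassigned_code_points(name, ucd=unicodedata):
    """Return the characters of `name` that are unassigned (general
    category Cn) in the Unicode database `ucd`."""
    return [ch for ch in name if ucd.category(ch) == "Cn"]

# Hypothetical username containing U+1F600, which was added in Unicode 6.1.
name = "ted\U0001F600"

# Against the Unicode 3.2.0 tables, U+1F600 is unassigned and is reported.
print(unassigned_code_points(name, unicodedata.ucd_3_2_0))

# Against the interpreter's own tables (see unicodedata.unidata_version),
# the same code point is assigned, so nothing is reported.
print(unicodedata.unidata_version, unassigned_code_points(name))
```

The same string is acceptable under one version and disallowed under
the other, which is exactly the situation two hosts with different
Unicode tables would be in.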

It also means that we need to be careful about what happens when we
want to upgrade to newer versions of Unicode in future versions of
Debian.  If the system administrator wants to support more than one
version of Debian, then it would be advisable if the Unicode version
is something which is configurable, especially if the passwd entries
are being supplied via some kind of network protocol such as LDAP or
Hesiod (for those people who remember MIT Project Athena :-P).

There is also the question (admittedly, only an edge case) of what to
do if a newer version of Unicode disallows or removes characters.
This rarely
happens, but it has in the past (in particular in the case of various
security disasters, or in the case of characters getting deprecated in
favor of newer characters, many of which are mentioned in RFC 8264).
So we can probably just ignore this case and hope that the Unicode
Consortium will be more careful in the future, but I thought I'd
just mention it.

The bottom line is that while I am sympathetic to the desire to
support Unicode --- heck, I was one of the primary drivers of
libunicode into the kernel so we could support case folding for more
than just the ASCII character set --- the meme of "One does not simply
walk into Mordor" also applies to "adopting Unicode".

And I am reminded of one of my IETF mentors, who was an
Internationalization expert, telling me two decades ago that, late at
night, in the bar after a standards meeting, one of the things that
I18N folks would say, just amongst themselves, was, "It would be
easier just to teach everyone English" --- and this was with I18N
experts who understood everything that was involved in doing full I18N
support.  No doubt this was only half-joking, but I think the point is
valid.

So if we're going to do this, let's do it right.  :-)

                                          - Ted
