On 03/12/24 17:59, Marc Haber wrote:
in preparation for a PRECIS future, couldn't adduser pass the usernames
through NFC instead of doing no normalization?
RFC 8264, section 5.2.4 (Normalization Rule), states:

    In accordance with [RFC5198], Normalization Form C (NFC) is
    RECOMMENDED.
that would solve the étienne and étienne issue (where the two characters
are just different renderings of the same character), but not the
Ohm-against-Omega issue, right?
NFC would solve both of these "problems":
* Both U+00E9 (é) and U+0065, U+0301 are NFC-normalized to U+00E9,
* Both U+2126 (Ohm sign) and U+03A9 (omega) are NFC-normalized to U+03A9
(omega).
What NFC alone will not solve are homograph collisions: a (U+0061 Latin
small letter a) and а (U+0430 Cyrillic small letter a) are
NFC-normalized to different codepoints.
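
As a minimal sketch (this is not adduser or useradd code), the three cases
above can be checked with Python's standard unicodedata module. The helper
name nfc() is only illustrative:

```python
import unicodedata


def nfc(name: str) -> str:
    """Return the NFC form of a candidate username (illustrative helper)."""
    return unicodedata.normalize("NFC", name)


# Canonical equivalence: precomposed vs. decomposed e-acute.
precomposed = "\u00e9tienne"   # U+00E9
decomposed = "e\u0301tienne"   # U+0065 followed by U+0301
assert precomposed != decomposed
assert nfc(precomposed) == nfc(decomposed) == "\u00e9tienne"

# Singleton decomposition: the Ohm sign maps to Greek capital omega.
ohm_sign = "\u2126"            # OHM SIGN
greek_omega = "\u03a9"         # GREEK CAPITAL LETTER OMEGA
assert nfc(ohm_sign) == nfc(greek_omega) == greek_omega

# Homographs: Latin a and Cyrillic a stay distinct under NFC.
latin_a = "\u0061"
cyrillic_a = "\u0430"
assert nfc(latin_a) != nfc(cyrillic_a)
```

The first two pairs collapse to one form, so a lookup on the normalized name
would treat them as the same user; the last pair does not, which is why
homographs need a separate check.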
But these are two different scenarios: the former problem may (and does)
arise without any wrongdoing from the user's side (a different OS, or a
different string manipulation library, or a screen keyboard may produce
a different é), the latter is an attack. The former is an
interoperability issue, the latter is a security issue.
While this seems the right thing to do, I think this should be done in
useradd (pkg:shadow), in the respective upstream project, so that all
Linux distributions get the same behavior.
That's probably the best approach.
Thanks for taking the time to delve into this issue,
--
Gioele Barabucci