#28628: Audit for and abolish all use of '\d' in regexes
-------------------------------------+-------------------------------------
Reporter: James Bennett | Owner: Ad Timmering
Type: Cleanup/optimization | Status: assigned
Component: Core (Other) | Version: dev
Severity: Normal | Resolution:
Keywords: | Triage Stage: Accepted
Has patch: 1 | Needs documentation: 0
Needs tests: 1 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Ad Timmering):
I went through each of the use cases in the Django code and found little
reason or benefit to update most of them.
Important background: for every Unicode character that `\d` matches,
`int(x)` correctly converts the value to a normal `int`. In most cases where
a decimal number is expected and extracted, it is then cast with `int(x)`,
so the problem goes away, and accepting such input might even be beneficial
to users (e.g. I live in Japan, where people frequently type full-width
digits, such as ０１２３４５ instead of 012345). For example:
{{{
>>> int('\uABF9') # MEETEI MAYEK DIGIT NINE
9
}}}
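As a minimal sketch of the behaviour described above (the sample strings are illustrative only, not taken from the Django code base): on a `str` pattern, `\d` matches any Unicode decimal digit, `int()` turns the match into a plain `int`, and the standard `re.ASCII` flag restricts `\d` to `[0-9]`.

```python
import re

# On str patterns, \d matches any Unicode decimal digit (category Nd).
fullwidth = "\uff10\uff11\uff12\uff13\uff14\uff15"  # full-width 012345
match = re.fullmatch(r"\d+", fullwidth)
assert match is not None

# int() accepts the same digits and yields a plain int.
value = int(match.group())
print(value)  # 12345

# With re.ASCII, \d is restricted to [0-9], so the same input is rejected,
# while the ASCII spelling still matches.
rejected = re.fullmatch(r"\d+", fullwidth, re.ASCII)
accepted = re.fullmatch(r"\d+", "012345", re.ASCII)
print(rejected is None, accepted is not None)  # True True
```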
Most use cases seem to me to fall into one of the categories below:
a) We're processing user input which ''might'' actually be (inadvertently)
entered in non-ASCII digits, so the result is likely desired, and at the very
least changing it could be a breaking change for users. ==> DON'T CHANGE
b) Changing to a more restrictive regex seems harmless enough, but also
doesn't add much value, e.g. when parsing a version number like "1.2.3"
with something like `(\d)\.(\d)\.(\d)`.
c) I found only one case in the Django code where a change might be
beneficial: the processing of dates/times in HTTP headers in
`django.utils.http`, where the spec clearly requires ASCII digits.
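A rough sketch of what case c) amounts to, contrasting a Unicode-aware and an ASCII-only digit match on an HTTP-style date fragment. The patterns and the `parse_day_year` helper are simplified, hypothetical stand-ins and not the actual `django.utils.http` code.

```python
import re

# Simplified, hypothetical pattern for the "06 Nov 1994" part of an HTTP date.
# [0-9] accepts ASCII digits only.
ASCII_DATE = re.compile(
    r"(?P<day>[0-9]{2}) (?P<mon>[A-Za-z]{3}) (?P<year>[0-9]{4})"
)
# The same pattern with \d also accepts non-ASCII decimal digits on str input.
UNICODE_DATE = re.compile(
    r"(?P<day>\d{2}) (?P<mon>[A-Za-z]{3}) (?P<year>\d{4})"
)


def parse_day_year(pattern, text):
    """Return (day, year) as ints, or None if the text does not match."""
    m = pattern.fullmatch(text)
    if m is None:
        return None
    return int(m["day"]), int(m["year"])


ascii_input = "06 Nov 1994"
# Same date written with full-width digits (U+FF10..U+FF19).
fullwidth_input = "\uff10\uff16 Nov \uff11\uff19\uff19\uff14"

print(parse_day_year(ASCII_DATE, ascii_input))        # (6, 1994)
print(parse_day_year(UNICODE_DATE, fullwidth_input))  # (6, 1994)
print(parse_day_year(ASCII_DATE, fullwidth_input))    # None
```

Writing `[0-9]` explicitly and passing `re.ASCII` with `\d` should behave the same for this purpose; which one reads better is a style choice.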
An inventory of the use cases, with my thoughts, is
[https://docs.google.com/document/d/1nc1uwTIghm-eIhiIlssNAH72KNFHoHAsL0gGlr9dRlg/edit# in this Google doc].
Curious to hear what others think.
--
Ticket URL: <https://code.djangoproject.com/ticket/28628#comment:15>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
--
To view this discussion on the web visit
https://groups.google.com/d/msgid/django-updates/069.a06d63f8d0c2f953c275ba41be24c875%40djangoproject.com.