Vajrasky Kok added the comment:
In Lib/re.py, starting from line 77 (Python 3.4):
    \w       Matches any alphanumeric character; equivalent to [a-zA-Z0-9_]
             in bytes patterns or string patterns with the ASCII flag.
             In string patterns without the ASCII flag, it will match the
             range of Unicode alphanumeric characters (letters plus digits
             plus underscore).
             With LOCALE, it will match the set [0-9_] plus characters defined
             as letters for the current locale.
The prelude is "Matches any alphanumeric character;".
Yet in every case (bytes patterns, string patterns with the ASCII flag, string
patterns without the ASCII flag, and patterns with the LOCALE flag), the
underscore is always included.
So why don't we change the prelude to "Matches any alphanumeric character and
the underscore character;"? The description then explains that the alphanumeric
set is either [A-Za-z0-9] or wider than that, depending on whether the pattern
is Unicode or not.
The description itself is fine, but the prelude misleads readers.
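
A minimal sketch of the point above, checking that \w accepts the underscore in each mode. The sample inputs and variable names are made up for illustration. It assumes a recent Python 3, where the LOCALE flag is only accepted with bytes patterns (in 3.4 it could also be combined with string patterns).

```python
import re

# Try to match a lone underscore against \w in each mode (illustrative inputs).
cases = {
    "bytes pattern": re.fullmatch(rb"\w", b"_"),
    "str pattern, ASCII flag": re.fullmatch(r"\w", "_", re.ASCII),
    "str pattern, Unicode (default)": re.fullmatch(r"\w", "_"),
    # Assumption: recent Python 3, where LOCALE requires a bytes pattern.
    "bytes pattern, LOCALE flag": re.fullmatch(rb"\w", b"_", re.LOCALE),
}

# True for every mode in which the underscore matched.
underscore_matches = {name: m is not None for name, m in cases.items()}

# Only the set of letters varies between modes: U+00E9 (e with acute accent)
# matches in Unicode mode but not when the ASCII flag is set.
e_acute_unicode = re.fullmatch(r"\w", "\u00e9") is not None
e_acute_ascii = re.fullmatch(r"\w", "\u00e9", re.ASCII) is not None

for name, matched in underscore_matches.items():
    print(name, "->", matched)
print("U+00E9 in Unicode mode ->", e_acute_unicode)
print("U+00E9 with ASCII flag ->", e_acute_ascii)
```

The underscore matches in all four modes, while the accented letter matches only in Unicode mode. That is consistent with the underscore being a fixed part of the \w set rather than something that depends on the flags.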
----------
_______________________________________
Python tracker <[email protected]>
<http://bugs.python.org/issue18779>
_______________________________________