Hi Bruno,
Bruno Haible writes:
> But that's not the same thing: The previous code made sure that
> module names with non-ASCII characters use the MD5 hash code.
> The new code does not:
>
>>>> re.match(re.compile(r'^[abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789_]*$'),'abcäöü')
>
>>>> re.match(re.compile(r'^\w*$'),'abcäöü')
> <re.Match object; span=(0, 6), match='abcäöü'>
>
> If someone ever uses module names with non-ASCII characters, we do
> *not* want to pass these characters into shell variable names.

Oops, yes, good point; I overlooked that. It looks like re.ASCII should
do the job [1].

The documentation for \w states [2]:

    Matches [a-zA-Z0-9_] if the ASCII flag is used.

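Here is a quick check of that behavior. This is only a minimal sketch,
not the actual patch: the helper name has_only_ascii_word_chars is made
up for illustration, and the pattern is the \w-based one from your
example with the re.ASCII flag added.

```python
import re

# Same pattern as in the example above, but compiled with re.ASCII so
# that \w only matches [a-zA-Z0-9_].
ASCII_WORD_RE = re.compile(r'^\w*$', re.ASCII)

def has_only_ascii_word_chars(name):
    """Hypothetical helper: True if name consists only of [a-zA-Z0-9_]."""
    return ASCII_WORD_RE.match(name) is not None

# Without the flag, \w is Unicode-aware for str patterns.
unicode_result = re.match(r'^\w*$', 'abcäöü') is not None

print(has_only_ascii_word_chars('abc_123'))  # True
print(has_only_ascii_word_chars('abcäöü'))   # False
print(unicode_result)                        # True
```

With the flag set, a name like 'abcäöü' no longer matches, so it would
presumably fall through to the MD5 hash path as before.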
Collin
[1] https://docs.python.org/3/library/re.html#re.ASCII
[2] https://docs.python.org/3/library/re.html#regular-expression-syntax