On Sat, Nov 5, 2016 at 10:55 AM, Jovan Trujillo
<[email protected]> wrote:
> Hi Aaron,
> In perlre I read that \w
> "
> \w [3] Match a "word" character (alphanumeric plus "_", plus other
> connector punctuation chars plus Unicode marks)
> "
>
> So since I didn't know what these 'other' connector punctuation chars are, I
> avoided it. Unicode makes things more complicated for me. Do you know?
>
To exclude Unicode and match ASCII only, use the /a modifier,
e.g. /\w+/a
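
Here is a minimal sketch of the difference, assuming Perl 5.14 or later
(where /a and /u were introduced). The helper names is_word_unicode and
is_word_ascii and the sample characters are my own, made up for
illustration. As far as I know, the "connector punctuation" that perlre
mentions is the Unicode general category Pc; "_" is its only ASCII member,
and U+203F UNDERTIE is one of the non-ASCII ones.

```perl
use 5.014;    # the /a and /u regex modifiers need Perl 5.14 or later
use strict;
use warnings;

# Hypothetical helper: true if the single character $ch matches \w
# under full Unicode rules (/u).
sub is_word_unicode {
    my ($ch) = @_;
    return $ch =~ /\A\w\z/u ? 1 : 0;
}

# Hypothetical helper: true if $ch matches \w when \w is restricted
# to the ASCII range (/a), i.e. [A-Za-z0-9_].
sub is_word_ascii {
    my ($ch) = @_;
    return $ch =~ /\A\w\z/a ? 1 : 0;
}

# Sample characters, written as escapes so the source file stays ASCII.
my @samples = (
    [ "a",        "LATIN SMALL LETTER A" ],               # /u: 1  /a: 1
    [ "_",        "LOW LINE (Pc)" ],                      # /u: 1  /a: 1
    [ "\x{E9}",   "LATIN SMALL LETTER E WITH ACUTE" ],    # /u: 1  /a: 0
    [ "\x{203F}", "UNDERTIE (Pc)" ],                      # /u: 1  /a: 0
    [ "\x{0301}", "COMBINING ACUTE ACCENT (Mark)" ],      # /u: 1  /a: 0
    [ "-",        "HYPHEN-MINUS" ],                       # /u: 0  /a: 0
);

for my $sample (@samples) {
    my ($ch, $name) = @$sample;
    # Print the code point rather than the character itself, to avoid
    # "Wide character" warnings on STDOUT.
    printf "U+%04X %-32s /u:%d /a:%d\n",
        ord($ch), $name, is_word_unicode($ch), is_word_ascii($ch);
}
```

So if you only ever want [A-Za-z0-9_], adding /a lets you keep writing \w
without worrying about what else Unicode counts as a word character.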
From perlre:
/a
is the same as "/u", except that "\d", "\s", "\w", and the Posix
character classes are restricted to matching in the ASCII range only.
That is, with this modifier, "\d" always means precisely the digits "0"
to "9"; "\s" means the five characters "[ \f\n\r\t]"; "\w" means the 63
characters "[A-Za-z0-9_]"; and likewise, all the Posix classes such as
"[[:print:]]" match only the appropriate ASCII-range characters.