If you want a regular expression that does match the example string, you
can use the \p{property} notation; the L property matches any Unicode
letter. For example:
> (regexp-match? #px"^\\p{L}+$" "héllo")
#t
The "Regexp Syntax" docs have a grammar for regular expressions with links
to examples.
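
For comparison with the Python side of the question, here is a minimal
sketch of the same distinction in Python. The helper names
is_ascii_letters and is_unicode_letters are made up for illustration.
Python's built-in re module has no \p{L}, so the Unicode-aware check
looks at each character's general category instead, which should agree
with what str.isalpha() reports.

```python
import re
import unicodedata

# ASCII-only character class, the same idea as #px"^[a-zA-Z]+$".
ASCII_LETTERS = re.compile(r"[a-zA-Z]+")


def is_ascii_letters(text: str) -> bool:
    """True when text is non-empty and every character is a-z or A-Z."""
    return ASCII_LETTERS.fullmatch(text) is not None


def is_unicode_letters(text: str) -> bool:
    """True when text is non-empty and every character has a general
    category starting with "L" (Lu, Ll, Lt, Lm, Lo), roughly what
    \\p{L} tests for in a regular expression."""
    return bool(text) and all(
        unicodedata.category(ch).startswith("L") for ch in text
    )


if __name__ == "__main__":
    # "h\u00e9llo" is "héllo" with a precomposed e-acute.
    for sample in ("hello", "h1llo", "h\u00e9llo"):
        print(
            sample,
            is_ascii_letters(sample),
            is_unicode_letters(sample),
            sample.isalpha(),
        )
```

The ASCII class rejects the accented string, while the category-based
check and isalpha() both accept it.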
Ryan
On Thu, Jul 9, 2020 at 4:32 PM Sorawee Porncharoenwase <
[email protected]> wrote:
> The Racket REPL doesn’t handle Unicode well. If you try (regexp-match?
> #px"^[a-zA-Z]+$" "héllo") in DrRacket, or write it as a program in a file
> and run it, you will find that it does evaluate to #f.
>
> On Thu, Jul 9, 2020 at 7:19 AM Peter W A Wood <[email protected]>
> wrote:
>
>> I was experimenting with regular expressions to try to emulate the Python
>> isalpha() String method. Using a simple [a-zA-Z] character class worked for
>> the English alphabet (ASCII characters):
>>
>> > (regexp-match? #px"^[a-zA-Z]+$" "hello")
>> #t
>> > (regexp-match? #px"^[a-zA-Z]+$" "h1llo")
>> #f
>>
>> It then dawned on me that the Python isalpha() method was Unicode aware:
>>
>> >>> "é".isalpha()
>> True
>>
>> I started scratching my head as to how to achieve the equivalent using a
>> regular expression in Racket. I tried the same regular expression with a
>> non-English character in the string. To my surprise, the regular expression
>> recognised the non-ASCII characters.
>>
>> > (regexp-match? #px"^[a-zA-Z]+$" "h\U+FFC3\U+FFA9llo")
>> #t
>>
>> Are Racket regular expression character classes Unicode aware or is there
>> some other explanation why this regular expression matches?
>>
>> Peter
>>
>> --
>> You received this message because you are subscribed to the Google Groups
>> "Racket Users" group.
>> To unsubscribe from this group and stop receiving emails from it, send an
>> email to [email protected].
>> To view this discussion on the web visit
>> https://groups.google.com/d/msgid/racket-users/2197C34F-165D-4D97-97AD-F158153316F5%40gmail.com
>> .
>>