On Thu, Jul 9, 2020 at 10:32 AM Sorawee Porncharoenwase <
[email protected]> wrote:

> Racket REPL doesn’t handle unicode well. If you try (regexp-match?
> #px"^[a-zA-Z]+$" "héllo") in DrRacket, or write it as a program in a file
> and run it, you will find that it does evaluate to #f.
>
See this issue for workarounds, including installing the `readline-gpl`
package: https://github.com/racket/racket/issues/3223

But you may have some other issues: for me, `(regexp-match?
#px"^[a-zA-Z]+$" "h\U+FFC3\U+FFA9llo")` gives an error saying "read-syntax:
no hex digit following `\U`"
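
As far as I can tell, the error comes from the `+`: the reader expects hex digits directly after the escape, as in `\u00E9`. Written that way, the string is pure ASCII and can be pasted into a terminal REPL that mangles non-ASCII input. A minimal sketch, assuming the intended character is é (U+00E9); the `\p{L}` pattern is my own addition for comparison, not something proposed earlier in the thread:

```racket
#lang racket/base

;; "h\u00E9llo" is the same five-character string as "héllo",
;; spelled with an ASCII-only escape.
(define s "h\u00E9llo")

;; é lies outside the ASCII ranges a-z and A-Z, so this is #f.
(define ascii-only? (regexp-match? #px"^[a-zA-Z]+$" s))

;; \p{L} matches a character in any Unicode letter category.
(define any-letter? (regexp-match? #px"^\\p{L}+$" s))
```

So the `#f` result for the original pattern is expected; whether `\p{L}` is the right class depends on which definition of "letter" you want.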

For the original question:


> On Thu, Jul 9, 2020 at 7:19 AM Peter W A Wood <[email protected]>
> wrote:
>
>> I was experimenting with regular expressions to try to emulate the Python
>> isalpha() String method.
>>
>
You'd want to benchmark, but, for this purpose, I have a hunch you might
get better performance by using `in-string` with a `for/and` loop (which
can use unsafe operations internally), probably especially so if you were
content to just test `char-alphabetic?`, which follows Unicode's definition
of "alphabetic" rather than Python's idiosyncratic one. Here's an example
that follows Python's definition:

#lang racket

(module+ test
  (require rackunit))

(define (char-letter? ch)
  ;; not the same as `char-alphabetic?`: see
  ;; https://docs.python.org/3/library/stdtypes.html#str.isalpha
  (case (char-general-category ch)
    [(lm lt lu ll lo) #t]
    [else #f]))

(define (string-is-alpha? str)
  (for/and ([ch (in-string str)])
    (char-letter? ch)))

(module+ test
  (check-true (string-is-alpha? "hello"))
  (check-false (string-is-alpha? "h1llo"))
  (check-true (string-is-alpha? "héllo")))
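
One caveat on `string-is-alpha?`: `for/and` with zero iterations returns `#t`, so it accepts the empty string, whereas Python documents `str.isalpha()` as requiring at least one character. If that difference matters, a length guard should be enough. The name `string-is-alpha/nonempty?` below is hypothetical, and `char-letter?` is repeated only so that the block runs on its own:

```racket
#lang racket/base

;; Same category test as above: Unicode general categories Lm, Lt, Lu, Ll, Lo.
(define (char-letter? ch)
  (case (char-general-category ch)
    [(lm lt lu ll lo) #t]
    [else #f]))

;; Hypothetical variant: reject "" first, then check every character.
(define (string-is-alpha/nonempty? str)
  (and (not (zero? (string-length str)))
       (for/and ([ch (in-string str)])
         (char-letter? ch))))
```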
