On Thu, Jul 9, 2020 at 10:32 AM Sorawee Porncharoenwase <[email protected]> wrote:
> Racket REPL doesn’t handle unicode well. If you try (regexp-match?
> #px"^[a-zA-Z]+$" "héllo") in DrRacket, or write it as a program in a file
> and run it, you will find that it does evaluate to #f.

See this issue for workarounds, including installing the `readline-gpl` package: https://github.com/racket/racket/issues/3223

But you may have some other issues: for me, `(regexp-match? #px"^[a-zA-Z]+$" "h\U+FFC3\U+FFA9llo")` gives an error saying "read-syntax: no hex digit following `\U`".

For the original question:

> On Thu, Jul 9, 2020 at 7:19 AM Peter W A Wood <[email protected]> wrote:
>
>> I was experimenting with regular expressions to try to emulate the Python
>> isalpha() String method.

You'd want to benchmark, but for this purpose I have a hunch you might get better performance by using `in-string` with a `for/and` loop (which can use unsafe operations internally). That is probably especially so if you were content to just test `char-alphabetic?`, which follows Unicode's definition of "alphabetic" rather than Python's idiosyncratic one. Here's an example:

    #lang racket

    (module+ test
      (require rackunit))

    (define (char-letter? ch)
      ;; not the same as `char-alphabetic?`: see
      ;; https://docs.python.org/3/library/stdtypes.html#str.isalpha
      (case (char-general-category ch)
        [(lm lt lu ll lo) #t]
        [else #f]))

    (define (string-is-alpha? str)
      (for/and ([ch (in-string str)])
        (char-letter? ch)))

    (module+ test
      (check-true (string-is-alpha? "hello"))
      (check-false (string-is-alpha? "h1llo"))
      (check-true (string-is-alpha? "héllo")))
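
For comparison, here is a minimal sketch of the simpler `char-alphabetic?` variant mentioned above. The name `string-all-alphabetic?` is hypothetical, and the non-empty check is an assumption added here: Python's `str.isalpha()` returns False for the empty string, whereas a bare `for/and` over an empty string yields #t. Treat it as an illustration, not a benchmarked implementation.

```racket
#lang racket/base

;; Hypothetical helper (not from the thread): relies on the built-in
;; `char-alphabetic?`, i.e. Unicode's "Alphabetic" property, instead of
;; checking general categories by hand.
(define (string-all-alphabetic? str)
  ;; Assumption: mirror Python's isalpha(), which is false for "".
  (and (positive? (string-length str))
       (for/and ([ch (in-string str)])
         (char-alphabetic? ch))))

(module+ test
  (require rackunit)
  (check-true (string-all-alphabetic? "héllo"))
  (check-false (string-all-alphabetic? "h1llo"))
  (check-false (string-all-alphabetic? "")))
```

Running `raco test` on a file containing this module should exercise the three checks. Because `char-alphabetic?` and the general-category test do not agree on every character, the two versions can give different answers on some inputs.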

