Re: [PHP] Cannot even come up with the beginning of a regex

Dotan Cohen Thu, 28 Feb 2008 01:15:45 -0800

On 28/02/2008, Casey <[EMAIL PROTECTED]> wrote:
>  >  Thank you very much, Zoltan. Is there a known UTF-8 limitation?
>  >  Because it works fine for me with English letters (well, the opposite
>  >  of what I needed, but I was able to work with it, since which pole I
>  >  start with was arbitrary), but not with Hebrew letters. For instance,
>  >  this works as expected:
>  >
>  >  $test="aabacada aa a f";
>  >  $test=preg_replace('/\b([^\s]+)a\b.*/U', '$1A', $test);
>  >  print $test; // PRINTS aabacadA aA a f
>  >
>  >  However, this does not:
>  >
>  >  $test="אאבאגאדא אא א ";
>  >  $test=preg_replace('/\b([^\s]+)ע\b.*/U', '$1א', $test);
>  >  print $test; // PRINTS אאבאגאדא אא א
>  >
>  >  Am I misunderstanding something, or is there a UTF-8 problem, or
>  >  something else? Thank you for your assistance, it is much appreciated
>  >  and I'm learning what I can.
>
> The "a" character (97) is different from the "א" character (1488).
>
>  $a = html_entity_decode('&#1488;');
>
> $test=preg_replace('/\b([^\s]+)' . $a . '\b.*/U', '$1A', $test);
>
>
> Will this work?


No, it doesn't. I've been playing around a bit and learning, and it
looks like it really should work. With English letters it does, but
not with Hebrew. You can see the result and the exact code used
here:
http://gibberish.co.il/test.html

I appreciate the assistance. I'm certain that we're missing only some
small detail here.
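
One candidate for that small detail, offered as a sketch rather than a
confirmed fix: without the "u" pattern modifier, PCRE treats both the
pattern and the subject as single bytes, so a two-byte UTF-8 letter such
as "א" is not seen as one character and \b is evaluated between bytes.
The version below assumes the script file and the string are saved as
UTF-8 and that PCRE was built with Unicode property support. The \p{L}
lookarounds and the replacement letter "ע" are my own choices for
illustration, not something taken from the earlier messages.

```php
<?php
// Assumption: this file and the subject string are both encoded as UTF-8.
$test = "אאבאגאדא אא א ";

// The "u" modifier makes PCRE read pattern and subject as UTF-8, so "א" is
// one character instead of two bytes. \p{L} matches any Unicode letter; the
// lookarounds stand in for \b so the sketch does not rely on \b itself
// being Unicode-aware.
$pattern = '/(?<!\p{L})(\p{L}+)א(?!\p{L})/u';

// Replace a word-final "א" with "ע" when at least one letter precedes it.
$result = preg_replace($pattern, '$1ע', $test);

print $result;
```

With this input, the first two words should end in "ע" and the lone "א"
should be left alone, which mirrors what the English example does with
"a" and "A".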

Dotan Cohen

http://what-is-what.com
http://gibberish.co.il
א-ב-ג-ד-ה-ו-ז-ח-ט-י-ך-כ-ל-ם-מ-ן-נ-ס-ע-ף-פ-ץ-צ-ק-ר-ש-ת

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
