I finally solved the problem, after ditching the /b argument in
preg_replace It seems that Hebrew letters are not part of a word, so
far as /b is concerned. Here is the code for converting Hebrew final
letters in the middle of a word to normal letters, if anyone ever
needs it:

$text="עברית מבולגנת";

function hebrewNotWordEndSwitch ($from, $to, $text) {
   $text=preg_replace('/'.$from.'([א-ת])/u','$2'.$to.'$1',$text);
   return $text;
}

do {
   $text_before=$text;
   $text=hebrewNotWordEndSwitch("ך","כ",$text);
   $text=hebrewNotWordEndSwitch("ם","מ",$text);
   $text=hebrewNotWordEndSwitch("ן","נ",$text);
   $text=hebrewNotWordEndSwitch("ף","פ",$text);
   $text=hebrewNotWordEndSwitch("ץ","צ",$text);
}   while ( $text_before!=$text );

print $text; // עברית מסודרת!

The do-while is necessary for multiple instances of letters, such as
"אנני" which would start off as "אןןי". Note that there's still the
problem of acronyms with gershiim but that's not a difficult one to
solve.

Dotan Cohen

http://what-is-what.com
http://gibberish.co.il
א-ב-ג-ד-ה-ו-ז-ח-ט-י-ך-כ-ל-ם-מ-ן-נ-ס-ע-ף-פ-ץ-צ-ק-ר-ש-ת

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?

Reply via email to