2008/5/12 Yannick Warnier <[EMAIL PROTECTED]>:
> Hello,
>
> I've been trying to find something nice to transform an accented
> string into a non-accented string. Obviously, I'm mostly working
> with European languages, but any method that could transform
> Arabic or Asian characters to plain non-accented characters would be
> perfect.
>
> I have found a number of solutions, ranging from str_replace() for every
> known accented character, to strtr(), to converting the string to HTML
> entities and then using preg_replace() to remove the "&" and the
> modifier name (acute, grave, circ, ...).
>
> I must say the last one seems to work better because it's less affected
> by charset changes, but it still seems awfully slow to me and I would
> like to know if there is any function that exists that could do that for
> me?
>
> Yannick
>

Why are you removing the accents? Why not store and process the data as
UTF-8, which supports all the accents in all the languages, and even
non-Latin scripts? You mention Arabic, which does not use accented
Latin characters (maybe you are thinking of Turkish, Uzbek or Tajik).
UTF-8 supports Arabic, Russian, Greek, Latin including modified
accented letters, and everything else in Unicode, CJK included.

What is your end goal? Why are you removing the accents?
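
If the accents really do have to go, the entity-based approach described
in the quoted message could look roughly like the sketch below. It
assumes the input is valid UTF-8 and that the source file itself is saved
as UTF-8; strip_accents() is a hypothetical name, and the list of
modifier suffixes is my own guess at a reasonable set, not something
taken from the original post.

```php
<?php
// Sketch of the "convert to HTML entities, then strip the modifier" idea.
// Assumption: $text is valid UTF-8. strip_accents() is a hypothetical helper.
function strip_accents(string $text): string
{
    // "é" becomes "&eacute;", "ü" becomes "&uuml;", "Æ" becomes "&AElig;".
    // A literal "&" in the input becomes "&amp;", so it cannot be mistaken
    // for the start of an accented-letter entity in the next step.
    $entities = htmlentities($text, ENT_NOQUOTES, 'UTF-8');

    // Keep the one or two base letters and drop the "&", the modifier
    // name, and the ";". Note that "&szlig;" (ß) ends up as "sz".
    $plain = preg_replace(
        '/&([A-Za-z]{1,2})(?:acute|grave|circ|uml|tilde|cedil|ring|slash|caron|lig);/',
        '$1',
        $entities
    );

    // Turn the entities that were left alone ("&amp;", "&euro;", ...)
    // back into ordinary characters.
    return html_entity_decode($plain, ENT_NOQUOTES, 'UTF-8');
}

echo strip_accents("naïve façade"), "\n";
```

Letters that have no named entity in PHP's default HTML 4.01 table
(for example "ő" or "ł", and anything in Arabic script) pass through
unchanged, so this only covers part of what you asked for.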

Dotan Cohen

http://what-is-what.com
http://gibberish.co.il
א-ב-ג-ד-ה-ו-ז-ח-ט-י-ך-כ-ל-ם-מ-ן-נ-ס-ע-ף-פ-ץ-צ-ק-ר-ש-ת

A: Because it messes up the order in which people normally read text.
Q: Why is top-posting such a bad thing?
