On Fri, 28 May 2010 11:13:35 -0400, tedd wrote:
> Bob wrote:
>
>>>The real question is whether unicode is even relevant now that the UTF
>>>series is available.
>
> Ashley answered:
>
>>Bob, UTF is unicode (Unicode Transformation Format)
Or more precisely, UTF-{8,16,32} are different ways to
serialize Unicode code points into sequences of octets,
which makes it possible to store and transmit Unicode
data.
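
To make the serialization point concrete, here is a small sketch in
Python (my choice of language, nothing in this thread depends on it).
The helper name octets() is made up for illustration. It prints the
octets that each encoding form produces for the same code points:

```python
# Show how one code point becomes different octet sequences
# depending on the encoding form used to serialize it.

def octets(ch: str, encoding: str) -> str:
    """Return the encoded bytes of ch as space-separated hex pairs."""
    return ch.encode(encoding).hex(" ")

# The "-be" codec variants are big-endian and emit no byte order mark,
# so the output contains only the octets for the character itself.
for ch in ("A", "\u00e9", "\u20ac", "\U0001F600"):
    print(
        f"U+{ord(ch):04X}",
        "| UTF-8:", octets(ch, "utf-8"),
        "| UTF-16:", octets(ch, "utf-16-be"),
        "| UTF-32:", octets(ch, "utf-32-be"),
    )
```

The code point is the same in every column; only the number and
value of the octets differ.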
> Yes, Ashley is correct. UTF-8 is Unicode, as are UTF-16 and UTF-32,
> which all use a different number of bytes for each code point. Both
> UTF-8 and UTF-16 are variable length whereas UTF-32 is a fixed length
> of four bytes per code point.
>
> As is my understanding, UTF-8 will accommodate all the languages
> (glyphs) of the world and then some. It will be a while before we
> need UTF-16 or UTF-32 but those are just larger supersets.
*blink*
They are all capable of representing the full Unicode
range, which is restricted to U+0000 - U+10ffff.
The theoretical limits are:
UTF-8 [0 - 7fffffff]
UTF-16 [0 - 10ffff]
UTF-32 [0 - ffffffff]
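
For anyone wondering where those numbers come from, here is a rough
sketch in Python. The variable names are mine, and the UTF-8 figure
assumes the original design with sequences of up to six bytes, not
the four-byte form that current standards allow:

```python
# UTF-8, original design: the longest sequence is 6 bytes.
# The lead byte 1111110x carries 1 payload bit and each of the
# 5 continuation bytes (10xxxxxx) carries 6 payload bits.
utf8_max = (1 << (1 + 5 * 6)) - 1

# UTF-16: a surrogate pair carries 10 + 10 payload bits, and the
# resulting value is offset by 0x10000 (the end of the 16-bit range).
utf16_max = 0x10000 + (1 << 20) - 1

# UTF-32: a single 32-bit code unit.
utf32_max = (1 << 32) - 1

print(f"UTF-8  [0 - {utf8_max:x}]")
print(f"UTF-16 [0 - {utf16_max:x}]")
print(f"UTF-32 [0 - {utf32_max:x}]")

# Unicode itself stops at U+10FFFF, so Python refuses anything above it.
last = chr(0x10FFFF)
try:
    chr(0x110000)
    beyond_range_rejected = False
except ValueError:
    beyond_range_rejected = True
```

The Unicode limit of U+10ffff matches what UTF-16 can reach, which is
why the extra room in the other two forms goes unused.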
Also, there are many, many, *many* more glyphs than
characters (code points) in the world. As an example,
www.fonts.com lists 165,125 fonts. Every one has a
*different* glyph for the character "A"...
/Nisse
--
PHP General Mailing List (http://www.php.net/)