I'm not quite following this, and I'm not sure exactly what you're having a
problem with, but I'll try to explain a few things that look like they will
help.

UTF-8 is a variable-length, multi-byte encoding of Unicode. Each character
is represented by 1 to 4 bytes. Unicode character number 0x108 (264), the C
circumflex, happens to require 2 bytes in UTF-8.
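
As a rough illustration of the byte counts (a minimal Python sketch, not
anything from your code; the helper name utf8_hex and the sample characters
other than 0x108 are my own picks):

```python
# Show how many bytes UTF-8 needs for a few sample code points.
# utf8_hex is a hypothetical helper written for this example.
def utf8_hex(ch):
    """Return the UTF-8 bytes of one character as a list of hex strings."""
    return [f"{b:02x}" for b in ch.encode("utf-8")]

samples = ["A", "\u0108", "\u20ac", "\U0001F600"]  # A, C circumflex, euro sign, an emoji

for ch in samples:
    encoded = utf8_hex(ch)
    print(f"U+{ord(ch):04X} -> {' '.join(encoded)} ({len(encoded)} bytes)")
```

The C circumflex (0x108) comes out as the two bytes c4 88.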

So, if anything tries to read UTF-8 text with the rules for a single-byte
character set, you will see the C circumflex as two separate characters.
UTF-8 should display fine on a web page if you set the charset (as the
previous emails instructed you to) and the user's OS supports UTF-8
(Windows 2000 and XP do, and probably the recent flavors of Linux as
well).
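
To make the "two separate characters" effect concrete, here is a small
Python sketch (the variable names are mine, and cp1252 is included only as
an assumption about what a Windows reader might fall back to):

```python
# Decode the UTF-8 bytes of the C circumflex with the wrong character set.
utf8_bytes = "\u0108".encode("utf-8")        # the two bytes 0xC4 0x88

as_utf8 = utf8_bytes.decode("utf-8")         # correct: one character
as_latin1 = utf8_bytes.decode("iso-8859-1")  # wrong: one character per byte
as_cp1252 = utf8_bytes.decode("cp1252")      # wrong in the same way

for label, text in [("utf-8", as_utf8),
                    ("iso-8859-1", as_latin1),
                    ("cp1252", as_cp1252)]:
    code_points = [f"U+{ord(c):04X}" for c in text]
    print(f"{label}: {len(text)} character(s) {code_points}")
```

Read as ISO-8859-1, the same two bytes turn into an A with diaeresis
followed by an invisible control character.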

ISO-8859-1 (which is the most prevalent character set on the internet) is
single-byte. UTF-8 and ISO-8859-1 do share characters: every character from
0x00 (0) to 0x7f (127) in ISO-8859-1 is a single-byte character in UTF-8
with the same number. So if you display one of these characters in the
other format, it will appear normal, while every character at 0x80 (128)
and above is encoded differently in the two and will not display correctly.
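
The overlap can be checked mechanically. This is only a sketch in Python,
and same_bytes is a made-up helper name:

```python
# Compare the UTF-8 and ISO-8859-1 encodings of every character that
# ISO-8859-1 can represent (code points 0x00 through 0xFF).
def same_bytes(ch):
    """True if the character has identical bytes in both encodings."""
    return ch.encode("utf-8") == ch.encode("iso-8859-1")

# 0x00 to 0x7f: expected to match in both encodings.
low_range_matches = all(same_bytes(chr(cp)) for cp in range(0x00, 0x80))

# 0x80 to 0xff: one byte in ISO-8859-1, two bytes in UTF-8.
high_range_matches = any(same_bytes(chr(cp)) for cp in range(0x80, 0x100))

print("0x00-0x7f identical:", low_range_matches)
print("any of 0x80-0xff identical:", high_range_matches)
print("e-acute:", "\u00e9".encode("iso-8859-1"), "vs", "\u00e9".encode("utf-8"))
```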

Are you just trying to store data to be able to retrieve it later?

A block of text submitted from a web form keeps the character set of the
web page that submitted it (at least in my experience; there may be some
exceptions, since this depends on the browser). So you shouldn't have any
problems displaying the data if the submitting page and the displaying page
use the same character set.

If a character isn't part of a particular character set, there is no way to
store it in that character set. UTF-8 is a good catch-all because it can
handle just about any character out there.
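
Here is what "no way to store it" looks like in practice, again as a hedged
Python sketch (try_encode is a hypothetical helper, not a library function):

```python
# Try to store the C circumflex in two different character sets.
def try_encode(text, charset):
    """Return the encoded bytes, or None if the charset cannot hold the text."""
    try:
        return text.encode(charset)
    except UnicodeEncodeError:
        return None

c_circumflex = "\u0108"

in_latin1 = try_encode(c_circumflex, "iso-8859-1")  # not in ISO-8859-1
in_utf8 = try_encode(c_circumflex, "utf-8")         # fits in UTF-8

# Forcing the conversion loses the character: it becomes a question mark.
lossy = c_circumflex.encode("iso-8859-1", errors="replace")

print("iso-8859-1:", in_latin1)
print("utf-8:", in_utf8)
print("forced iso-8859-1:", lossy)
```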


Plenty of information about Unicode can be found here:
http://www.unicode.org/

Chris

-----Original Message-----
From: Louie Miranda [mailto:[EMAIL PROTECTED]
Sent: Tuesday, December 02, 2003 9:21 PM
To: [EMAIL PROTECTED]
Subject: Re: [PHP] Re: Unicode translation


Yes, I just learned that Windows uses a decimal code and that there is a hex
value too.
Well, since some Unicode characters don't have a decimal value, it's really
getting harder to solve this kind of problem. :(


-- -
Louie Miranda
http://www.axishift.com


----- Original Message -----
"Luke" <[EMAIL PROTECTED]> wrote in message
news:[EMAIL PROTECTED]
> no, but on windows, you should be able to access most of them (depending
> on the font) by using ALT+(number pad code)
> eg ALT+(num pad)0169 gives you -> ©
>
> Luke
>
> --
> Luke

--
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
