On 22 March 2017 at 08:52, John Emmas <j...@creativepost.co.uk> wrote:
> Forgive my ignorance - this'll probably be obvious to some of you...
>
> Suppose I've got a simple character string, like this:-
>
> const char* my_str = "Hello World";
>
> I can assign it to a Glib::ustring very easily:-
>
> Glib::ustring ustr = my_str;
>
> BUT... instead of pointing to a 'normal' string (simple ASCII characters),
> let's suppose that 'my_str' was already pointing to a string in utf8
> format. Will the same assignment still work - or is there some better way
> of assigning a utf8 string to a Glib::ustring? Thanks,
>
> John
>

UTF-8 is backwards compatible with ASCII. If bit 7 of any given byte in a string is 0, then that byte is treated as ASCII. Only if bit 7 is 1 do UTF-8-compatible tools start interpreting the lower bits and the following bytes differently.

In the same way, to Glib::ustring, any char* is just a block of bytes for it to interpret as ASCII or as the extended set of characters supported by UTF-8. (This typically manifests as different behaviour when getting the string length, indexing, etc.: there is no longer a 1:1 correspondence between size in bytes and length in characters when UTF-8 encoding is in play.)

IOW, the answer to the question is yes, the same assignment will/must work, and no, there is no better way: construct the Glib::ustring from the char* and let it handle the rest.
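To make the bytes-versus-characters point concrete, here is a minimal sketch that uses only the C++ standard library, not glibmm. The helper names is_ascii_byte, count_utf8_chars and demo are made up for illustration, and the counting helper assumes its input is already valid UTF-8; it is not how Glib::ustring is implemented internally.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical helper (not part of glibmm): a byte is plain ASCII
// when bit 7 is clear.
inline bool is_ascii_byte(unsigned char byte) {
    return (byte & 0x80) == 0;
}

// Hypothetical helper: counts characters (code points) in a NUL-terminated
// string that is ASSUMED to be valid UTF-8. Continuation bytes have the
// bit pattern 10xxxxxx, so every byte that is not a continuation byte
// starts a new character.
inline std::size_t count_utf8_chars(const char* str) {
    std::size_t count = 0;
    const unsigned char* p = reinterpret_cast<const unsigned char*>(str);
    for (; *p != 0; ++p) {
        if ((*p & 0xC0) != 0x80) {
            ++count;
        }
    }
    return count;
}

// Small self-check illustrating the difference.
inline void demo() {
    const char* ascii = "Hello World";
    const char* utf8 = "caf\xC3\xA9";  // "café": U+00E9 is encoded as C3 A9

    // Pure ASCII: bytes and characters agree.
    assert(std::strlen(ascii) == 11);
    assert(count_utf8_chars(ascii) == 11);

    // Non-ASCII UTF-8: 5 bytes, but only 4 characters.
    assert(std::strlen(utf8) == 5);
    assert(count_utf8_chars(utf8) == 4);
}
```

With glibmm itself, the same distinction should show up as the difference between ustr.bytes() (size in bytes) and ustr.size() or ustr.length() (length in characters) on a Glib::ustring constructed from the same char*.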