On 24 June 2017 at 19:12, Chris Vine <vine35792...@gmail.com> wrote:
> On Sat, 24 Jun 2017 19:08:36 +0100
> Chris Vine <vine35792...@gmail.com> wrote:
>
> > It is because UTF-8 is a multibyte encoding, and any one character may
> > require between 1 and 4 bytes to represent it. If you were allowed to
> > change a byte at will, you would be able to introduce invalid encoding
> > sequences. As to the absence of documentation, maybe it is because
> > this was thought to be self-evident, dunno.
>
> And I should perhaps also make the point that these operators return a
> 32-bit Unicode character, not a byte, which is consequent on the same
> point. If you allowed mutation, the length of the string (in bytes)
> might change.
Right, of course. It does seem very obvious now. It had completely slipped my mind that we're dealing with characters of arbitrary width, not a fixed-width encoding. :( Thanks for the comprehensive answer to a stupid question!
_______________________________________________
gtkmm-list mailing list
gtkmm-list@gnome.org
https://mail.gnome.org/mailman/listinfo/gtkmm-list