Victor Stinner <victor.stin...@gmail.com> wrote:
> > 'c' -> UCS1
> > 'u' -> UCS2
> > 'w' -> UCS4
>
> A Unicode string is an array of code point. Another approach is to
> expose such string as an array of uint8/uint16/uint32 integers. I
> don't know if you expect to get a character / a substring when you
> read the buffer of a string object. Using Python 3.2, I get:
>
> >>> memoryview(b"abc")[0]
> b'a'
>
> ... but using Python 3.3 I get a number :-)

Yes, that's changed because officially (see struct module) the format
is unsigned bytes, which are integers in struct module syntax:

>>> unsigned_bytes = memoryview(b"abc")
>>> unsigned_bytes.format
'B'
>>> char_array = unsigned_bytes.cast('c')
>>> char_array.format
'c'
>>> char_array[0]
b'a'

Possibly the uint8/uint16/uint32 integer approach that you mention
would make more sense.


Stefan Krah


_______________________________________________
Python-Dev mailing list
Python-Dev@python.org
http://mail.python.org/mailman/listinfo/python-dev
Unsubscribe: http://mail.python.org/mailman/options/python-dev/archive%40mail-archive.com
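
For illustration only, here is a rough sketch of what the uint8/uint16/uint32 approach could look like from Python 3.3 onwards. The helper name code_units is hypothetical and not part of any proposal in this thread. It assumes the native 'H' and 'I' struct formats are 2 and 4 bytes wide, which holds on common platforms. The 2-byte case yields UTF-16 code units, so it matches UCS2 only for text within the Basic Multilingual Plane.

```python
import sys


def code_units(text, width):
    """Return a memoryview exposing `text` as unsigned ints of `width` bytes.

    Hypothetical helper: it encodes the string in native byte order and
    casts the resulting bytes to the matching struct format.
    """
    suffix = "le" if sys.byteorder == "little" else "be"
    codec, fmt = {
        1: ("latin-1", "B"),           # uint8, code points up to U+00FF
        2: ("utf-16-" + suffix, "H"),  # uint16, UTF-16 code units
        4: ("utf-32-" + suffix, "I"),  # uint32, one item per code point
    }[width]
    return memoryview(text.encode(codec)).cast(fmt)


ucs1 = code_units("abc", 1)
ucs4 = code_units("a\u20ac\U0001f600", 4)

print(ucs1.format, ucs1[0])        # indexing yields an int, not a bytes object
print(ucs4.format, ucs4.tolist())  # one integer per code point
```

With this layout, indexing always returns an integer, consistent with how the 'B' format behaves in 3.3, and a character can be recovered with chr() when needed.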