how to get size of unicode string/string in bytes ?
Hello,

How can I get the number of bytes in a string in Python? len(string) does not give the size in bytes when the string is a unicode string; it only gives the length in characters. (The two only match for ASCII/Latin-1.) In my data structure I have to store unicode strings for many languages, and I must know exactly how many bytes each stored string takes so that I can read it back later.

Many thanks for any suggestions.

Cheers!
pattreeya.
--
http://mail.python.org/mailman/listinfo/python-list
Re: how to get size of unicode string/string in bytes ?
For example, I use UTF-8 for encoding/decoding:
s = "ทดสอบ"
u = s.decode("utf-8")
How can I get the size of u in bytes?
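
A minimal sketch of one way to answer this, assuming the size that matters is the size of the encoded form: a unicode string has no byte size of its own until an encoding is chosen, so the string is encoded and the result is measured. The helper name byte_size is hypothetical, and the Thai word is written with escapes so the snippet does not depend on the source file's encoding.

```python
# The test word from the post ("ทดสอบ"), spelled with \u escapes.
u = u"\u0e17\u0e14\u0e2a\u0e2d\u0e1a"


def byte_size(text, encoding="utf-8"):
    """Return how many bytes `text` occupies once encoded (hypothetical helper)."""
    return len(text.encode(encoding))


print(len(u))                     # number of code points: 5
print(byte_size(u))               # bytes in UTF-8: 15
print(byte_size(u, "utf-16-le"))  # bytes in UTF-16 without a BOM: 10
```

The same text gives different byte counts under different encodings, so the count is only meaningful together with the encoding used for storage.
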
[EMAIL PROTECTED] wrote:
> Hello,
>
> How can I get the number of bytes in a string in Python?
> len(string) does not give the size in bytes when the string is a
> unicode string; it only gives the length in characters. (The two only
> match for ASCII/Latin-1.) In my data structure I have to store unicode
> strings for many languages, and I must know exactly how many bytes each
> stored string takes so that I can read it back later.
>
> Many thanks for any suggestion.
>
> cheers!
> pattreeya.
Re: how to get size of unicode string/string in bytes ?
I found the answer. What I needed was simple, but I could not see it at
that moment: encode the unicode string and take the length of the result.
Thanks for any suggestions!
>>> f = open("test.csv", "rb")
>>> t1 = f.readline()
>>> t2 = t1.decode("iso-8859-9") # test with turkish
>>> t2
u'Dur-kalk trafi\u011fi, t\u0131kan\u0131kl\u0131k tehlikesi\n'
>>> print t2
Dur-kalk trafiği, tıkanıklık tehlikesi
>>> len(t2)
39
>>> u1 = t2.encode("utf-8")
>>> u1
'Dur-kalk trafi\xc4\x9fi, t\xc4\xb1kan\xc4\xb1kl\xc4\xb1k tehlikesi\n'
>>> print u1
Dur-kalk trafiği, tıkanıklık tehlikesi
>>> len(u1)
43
>>>
Thanks!
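
For the original goal of storing strings so they can be read back later, here is a rough sketch of one common approach, a length-prefixed record. It is an illustration under assumptions, not what the poster actually used: the function names write_record and read_record, the 4-byte big-endian length header, and the choice of UTF-8 are all hypothetical.

```python
import io
import struct


def write_record(stream, text, encoding="utf-8"):
    """Write the encoded byte count as a 4-byte header, then the bytes."""
    data = text.encode(encoding)
    stream.write(struct.pack(">I", len(data)))  # length in bytes, not characters
    stream.write(data)


def read_record(stream, encoding="utf-8"):
    """Read one record back; return None when the stream is exhausted."""
    header = stream.read(4)
    if len(header) < 4:
        return None
    (size,) = struct.unpack(">I", header)
    return stream.read(size).decode(encoding)


# Round trip using the Turkish line from the session and the Thai test word.
line = u"Dur-kalk trafi\u011fi, t\u0131kan\u0131kl\u0131k tehlikesi\n"
word = u"\u0e17\u0e14\u0e2a\u0e2d\u0e1a"

buf = io.BytesIO()
write_record(buf, line)
write_record(buf, word)

buf.seek(0)
first = read_record(buf)
second = read_record(buf)
print(first == line, second == word)
```

The header stores len() of the encoded bytes (43 for the Turkish line in UTF-8), not len() of the unicode string (39), which is the distinction the session above demonstrates.
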
