Re: UTF-8 / German, Scandinavian letters - is it really this difficult?? Linux & Windows XP

Fuzzyman Tue, 22 Feb 2005 06:52:03 -0800

Max M wrote:
> Fuzzyman wrote:
> > Mike Dee wrote:
>
> >>#!/usr/bin/env python
> >># -*- coding: UTF-8 -*-
>
> > This will mean string literals in your source code will be encoded as
> > UTF8 - if you handle them with normal string operations you might get
> > funny results.
>
> It means that you don't have to explicitly set the encoding on strings.
>
> If your coding isn't set you must write:
>
> ust = 'æøå'.decode('utf-8')
>


Which is now deprecated, isn't it? (That is, including encoded string
literals in source without declaring an encoding.)

> If it is set, you can just write:
>
> ust = u'æøå'
>
> And this string will automatically be utf-8 encoded:
>
> st = 'æøå'
>
> So you should be able to convert it to unicode without giving an encoding:
>
> ust = unicode(st)
>

So all your non-unicode string literals will be UTF-8 encoded byte
strings. Implicit conversions, such as unicode(st) or mixing str with
unicode, will decode them with the default encoding, which is usually
ASCII rather than UTF-8, so they fail on the first non-ASCII byte. That
is a likely source of confusion unless you handle everything as unicode.
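
To make the byte-string pitfall concrete, here is a minimal sketch. It is
written for Python 3, where bytes plays the role that the plain str type
plays in the Python discussed in this thread, so it approximates the
behaviour rather than reproducing it exactly. The sample characters and
all variable names are my own assumptions, chosen only for illustration.

```python
# Sketch for Python 3: bytes stands in for the old byte-string str type.
# The sample text is a hypothetical example, not taken from a real program.
text = "\u00e6\u00f8\u00e5"      # three characters: ae-ligature, o-slash, a-ring
raw = text.encode("utf-8")       # the bytes a UTF-8 encoded literal would hold

char_count = len(text)           # counts characters
byte_count = len(raw)            # counts bytes; each character here needs two

# Slicing the byte string can cut a character in half.
try:
    raw[:1].decode("utf-8")
    half_ok = True
except UnicodeDecodeError:
    half_ok = False

# Decoding as ASCII (the usual default encoding) fails on non-ASCII bytes.
try:
    raw.decode("ascii")
    ascii_ok = True
except UnicodeDecodeError:
    ascii_ok = False

# Naming the encoding explicitly round-trips cleanly.
restored = raw.decode("utf-8")
```

The length mismatch and the failed ASCII decode are the "funny results"
mentioned earlier: the byte string knows nothing about characters until
you decode it with the right encoding.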

But then I suppose if you have any non-ascii characters in your source
code you *must* be explicit about what encoding they are in, or you are
asking for trouble.
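
As a rough illustration of being explicit, the standard library can report
which encoding a source file declares. This sketch uses
tokenize.detect_encoding from Python 3 (not the interpreter version this
thread is about); the helper name declared_encoding and the sample source
snippets are hypothetical.

```python
import io
import tokenize


def declared_encoding(source_bytes):
    """Return the encoding that tokenize detects for some source bytes."""
    readline = io.BytesIO(source_bytes).readline
    encoding, _lines_read = tokenize.detect_encoding(readline)
    return encoding


# Hypothetical source files, given as raw bytes.
with_cookie = b"#!/usr/bin/env python\n# -*- coding: UTF-8 -*-\nst = 'abc'\n"
latin_cookie = b"# -*- coding: latin-1 -*-\nst = 'abc'\n"
no_cookie = b"st = 'abc'\n"

utf8_name = declared_encoding(with_cookie)     # normalised to 'utf-8'
latin_name = declared_encoding(latin_cookie)   # normalised to 'iso-8859-1'
default_name = declared_encoding(no_cookie)    # Python 3 falls back to 'utf-8'
```

Note that the fallback shown for the file without a declaration is the
Python 3 rule; it should not be read as what older interpreters assume.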

Regards,


Fuzzy
http://www.voidspace.org.uk/python/index.shtml

> --
>
> hilsen/regards Max M, Denmark
> 
> http://www.mxm.dk/
> IT's Mad Science
