python 3.1 unicode question

2009-09-15 Thread jeffunit

I wrote a program that diffs files and prints out matching file names.
I will be executing the output with sh, to delete select files.

Most of the file names are plain ASCII, but about 10% of them have Unicode
characters in them. When I try to print the string containing the name, I get
an exception:

'ascii' codec can't encode character '\udce9'
in position 37: ordinal not in range(128)

The string is:

'./Julio_Iglesias-Un_Hombre_Solo-05-Qu\udce9_no_se_rompa_la_noche.mp3'

This is on a Windows XP system, using Python 3.1, which I compiled with the
Cygwin Linux compatibility layer.

Can you tell me what encoding I need to print \udce9 and how to set python to
that encoding mode?

thanks,
jeff

--
http://mail.python.org/mailman/listinfo/python-list


Re: python 3.1 unicode question

2009-09-15 Thread jeffunit

At 09:25 PM 9/15/2009, Mark Tolonen wrote:

That looks like a "surrogate escape" (see PEP 383,
http://www.python.org/dev/peps/pep-0383/). It indicates the wrong
encoding was used to decode the filename.


That seems likely. How do I set the correct encoding to decode the filename?
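
One way to see what happened to the name, and to undo it, is to round-trip it through the PEP 383 "surrogateescape" error handler (available since Python 3.1). Under that handler an undecodable byte 0xNN becomes the lone surrogate U+DCNN, so '\udce9' stands for the raw byte 0xE9. The sketch below assumes the name on disk is Latin-1 (or cp1252), where 0xE9 is "é"; that encoding is a guess based on the Spanish title, not something the thread confirms.

```python
# Raw bytes of the file name as stored on disk. Assumption: the name is
# Latin-1 (or cp1252) encoded, where byte 0xE9 is the letter e-acute.
raw = b"Qu\xe9_no_se_rompa_la_noche.mp3"

# Decoding with a codec that cannot handle 0xE9, plus the PEP 383 handler,
# turns that byte into the lone surrogate U+DCE9 seen in the error message.
name = raw.decode("ascii", "surrogateescape")

# Encoding with the same codec and handler restores the original bytes.
restored = name.encode("ascii", "surrogateescape")

# Decoding those bytes with the assumed real encoding gives readable text.
text = restored.decode("latin-1")

# ascii() is used so this line works even when stdout is ASCII-only.
print(ascii(name), ascii(text))
```

If the guess about Latin-1 is wrong, only the last decode step changes; the surrogateescape round trip recovers the original bytes either way.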


Clearly Windows knows how to display it. I suspect that since I compiled
Python with Cygwin, it is using a POSIX standard rather than a
Windows-specific one. Ideally, I would like my code to work on Linux as well
as Windows, as I back up all of my data to a Linux machine with Samba.

thanks,
jeff
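
Since the output is meant to be fed to sh rather than read by a person, one possible approach is to skip text encoding on output altogether and write the file names as the same bytes the file system returned. The sketch below is an illustration under assumptions, not a tested fix for the Cygwin build: write_name is a hypothetical helper, and it assumes a POSIX-style Python where file names were decoded with the file system encoding and the surrogateescape handler.

```python
import io
import sys


def write_name(name, out, encoding=None):
    """Write a file name to a binary stream as raw bytes plus a newline.

    `name` may contain PEP 383 lone surrogates; the surrogateescape
    handler turns each one back into the byte it was created from.
    write_name is a hypothetical helper, not part of any library.
    """
    if encoding is None:
        # Assumption: names were decoded with the file system encoding.
        encoding = sys.getfilesystemencoding()
    out.write(name.encode(encoding, "surrogateescape") + b"\n")


# Demonstration against an in-memory buffer. In the real script `out`
# would be sys.stdout.buffer, the binary layer underneath sys.stdout.
buf = io.BytesIO()
write_name("./Qu\udce9_no_se_rompa_la_noche.mp3", buf, encoding="utf-8")
print(buf.getvalue())
```

Because the bytes pass through unchanged, the same code should behave identically on Linux. It does not quote the names for the shell, so names containing spaces or quote characters would still need separate handling before being passed to sh.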
