Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Alexey Borzenkov
On 6/11/07, Alexey Borzenkov <[EMAIL PROTECTED]> wrote:
> The problem is that I don't know if anything actually supports bit 11
> at the time and can't even tell if I did this correctly or not. :(
I downloaded the latest WinZip and can confirm that it parses utf-8 filenames correctly (although it

Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Alexey Borzenkov
On 6/11/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> For compatibility, I would propose to use UTF-8 only if the file
> name is not ASCII. Even though the OEM code pages vary, they
> are (mostly) ASCII supersets. So if the string can be encoded
> in ASCII, there is no need to set the UTF-8 fl
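The rule proposed above (plain ASCII where possible, UTF-8 plus the flag otherwise) could be sketched roughly as follows. This is an illustration, not the patch submitted in this thread: `encode_zip_filename` and `UTF8_FLAG` are hypothetical names, and 0x800 is simply bit 11 of the flag bits discussed here.

```python
# Hypothetical helper, not the actual zipfile patch from this thread.
UTF8_FLAG = 0x800  # bit 11 of the general purpose flag bits (1 << 11)


def encode_zip_filename(name, flag_bits=0):
    """Return (encoded_name, flag_bits) for a text filename.

    ASCII-only names are stored unchanged and the flag is left alone,
    so readers that know nothing about bit 11 still see them correctly.
    Any other name is stored as UTF-8 with bit 11 set.
    """
    try:
        return name.encode("ascii"), flag_bits
    except UnicodeEncodeError:
        return name.encode("utf-8"), flag_bits | UTF8_FLAG


if __name__ == "__main__":
    print(encode_zip_filename("readme.txt"))
    print(encode_zip_filename("файл.txt"))
```

Keeping ASCII names unflagged means archives that contain only such names come out byte-for-byte the same as before, which is the compatibility argument made in the message.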

Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Martin v. Löwis
> But this is only on Windows! I have no clue what's the common
> situation on other OSes and don't even know how to sanely get OEM
> codepage on Windows (the obvious way with ctypes.kernel32.GetOEMCP()
> doesn't seem good to me).
>
> So I guess that's bad idea anyway, maybe conforming to language
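For reference, the ctypes route mentioned in the quoted text might look something like the sketch below. It assumes the Win32 function `GetOEMCP` is reached through `ctypes.windll.kernel32` (the quoted message abbreviates this to `ctypes.kernel32`); `oem_codepage_name` is a hypothetical helper, and returning `None` on other platforms is only a placeholder, since the thread leaves that case open.

```python
import sys


def oem_codepage_name():
    """Return the OEM code page as a codec-style name such as 'cp866'.

    Windows only. On every other platform this returns None, because
    there is no equivalent notion to query.
    """
    if sys.platform != "win32":
        return None
    import ctypes  # imported lazily; windll exists only on Windows
    return "cp%d" % ctypes.windll.kernel32.GetOEMCP()


if __name__ == "__main__":
    print(oem_codepage_name())
```

Whether the resulting name is usable as a Python codec depends on the code page; that check is left out of this sketch.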

Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Alexey Borzenkov
On 6/10/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > So the general idea is that at least directory filename has some sort
> > of convention of using oem (dos, console) encoding on Windows, cp866
> > in my case. Header filenames have different encodings, and seem to be
> > ignored.
> Ok, th

Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Martin v. Löwis
> So the general idea is that at least directory filename has some sort
> of convention of using oem (dos, console) encoding on Windows, cp866
> in my case. Header filenames have different encodings, and seem to be
> ignored.
Ok, then this is what the zipfile module should implement.
>> That woul
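A reader following this convention might decode stored directory names along these lines. This is a rough sketch only: `decode_zip_filename` is a hypothetical helper, the UTF-8 branch assumes the bit 11 flag discussed elsewhere in this thread, and cp866 is just the OEM code page of the poster's machine, not a universal default.

```python
# Hypothetical helper illustrating the convention described above.
UTF8_FLAG = 0x800  # bit 11 of the flag bits


def decode_zip_filename(raw, flag_bits, oem_encoding="cp866"):
    """Decode a raw directory filename from a zip archive.

    If bit 11 is set the name is taken to be UTF-8; otherwise it is
    assumed to be in the given OEM (DOS/console) code page.
    """
    if flag_bits & UTF8_FLAG:
        return raw.decode("utf-8")
    return raw.decode(oem_encoding)


if __name__ == "__main__":
    stored = "файл.txt".encode("cp866")
    print(decode_zip_filename(stored, 0))
```

The OEM code page has to be supplied by the caller because the archive itself does not record which one was used.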

Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Alexey Borzenkov
On 6/10/07, "Martin v. Löwis" <[EMAIL PROTECTED]> wrote:
> > I don't think always encoding them to utf-8 (and using bit 11 of
> > flag_bits) is a good idea, since there's a chance to create archives
> > that won't be correctly readable by programs not supporting this bit
> > (it's no secret that cu

Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Martin v. Löwis
> I don't think always encoding them to utf-8 (and using bit 11 of
> flag_bits) is a good idea, since there's a chance to create archives
> that won't be correctly readable by programs not supporting this bit
> (it's no secret that currently some programs just assume that
> filenames are encoded us

Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Alexey Borzenkov
> Current zipfile seems to officially support ascii filenames only
> anyway, so the patch can be as simple as this:
Submitted patch and test case as http://python.org/sf/1734346

Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Alexey Borzenkov
> > Also note that I'm trying to ask if zipfile should be improved, how it
> > should be improved, and this possible improvement is not even for me
> > (because now I know how zipfile behaves and I will work correctly with
> > it, but someone else might stumble upon this very unexpectedly).
> If yo

Re: [Python-Dev] zipfile and unicode filenames

2007-06-10 Thread Martin v. Löwis
> sys.setdefaultencoding()
> exists for a reason, wouldn't it be better if stdlib could cope with
> that at least with zipfile?
sys.setdefaultencoding just does not work. Many more things break when you call it. It only exists because people like you insisted that it exists.
> Also note that I'm

Re: [Python-Dev] zipfile and unicode filenames

2007-06-09 Thread Martin v. Löwis
> Today I've stumbled upon a bug in my program that wasn't very
> straightforward to understand.
Unfortunately, it isn't straight-forward to understand your description of it, either.
> The problem is that I was passing
> unicode filenames to zipfile.ZipFile.write and I had
> sys.setdefaultencod

[Python-Dev] zipfile and unicode filenames

2007-06-09 Thread Alexey Borzenkov
Hi everyone, Today I've stumbled upon a bug in my program that wasn't very straightforward to understand. The problem is that I was passing unicode filenames to zipfile.ZipFile.write and I had sys.setdefaultencoding() in effect, which resulted in a situation where most of the bytes generated in zi