Re: git archive --format zip utf-8 issues

2012-09-24 Thread Junio C Hamano
René Scharfe writes: > "git 1" is the patch "archive-zip: support UTF-8 paths" added, which > lets archive-zip make use of the UTF-8 flag. "git 2" is "git 1" plus > the patch "archive-zip: declare creator to be Unix for UTF-8 > paths". Both have been posted before. "git 3" is "git 1" plus the

Re: git archive --format zip utf-8 issues

2012-09-24 Thread René Scharfe
Hi, I found a way to make unzip respect the UTF-8 flag in ZIP files: Apparently (from looking at the source) an extended field needs to be present in order for it to even look at general purpose flag 11. I sent a patch to add an extended timestamp field that fits the bill. Here are new numb
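For readers unfamiliar with ZIP extra fields, here is a minimal sketch of what an extended timestamp field can look like on the wire. It assumes the field meant above is Info-ZIP's "UT" record (header ID 0x5455) carrying only a modification time; the helper names `build_ut_field`, `parse_ut_mtime` and `zip_with_ut` are made up for this illustration and are not taken from the patch.

```python
import io
import struct
import zipfile

UT_HEADER_ID = 0x5455  # reads as b"UT" in little-endian byte order
MTIME_PRESENT = 0x01   # flags bit 0: a modification time follows


def build_ut_field(mtime):
    """Pack header ID, data size (1 flag byte + 4 time bytes), flags, mtime."""
    return struct.pack("<HHBi", UT_HEADER_ID, 5, MTIME_PRESENT, mtime)


def parse_ut_mtime(extra):
    """Walk the extra-field records and return the UT mtime, or None."""
    pos = 0
    while pos + 4 <= len(extra):
        header_id, size = struct.unpack_from("<HH", extra, pos)
        body = extra[pos + 4:pos + 4 + size]
        if header_id == UT_HEADER_ID and len(body) >= 5 and body[0] & MTIME_PRESENT:
            return struct.unpack_from("<i", body, 1)[0]
        pos += 4 + size
    return None


def zip_with_ut(name, mtime):
    """Write a one-member archive whose entry carries the UT extra field."""
    info = zipfile.ZipInfo(name)
    info.extra = build_ut_field(mtime)
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        zf.writestr(info, b"")
    return buf.getvalue()
```

Python's `zipfile` is used here only as a convenient writer; whether a given extractor honors flag 11 once such a field is present is exactly what the measurements in this thread are about.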

Re: git archive --format zip utf-8 issues

2012-09-20 Thread René Scharfe
On 18.09.2012 at 23:12, Junio C Hamano wrote: René Scharfe writes: Windows Info-ZIP unzip 7-Zip PeaZip builtin Linux msysgit Windows 7-Zip 9.20 0 0 4626 43 43 PeaZip 4.7.1 win6

Re: git archive --format zip utf-8 issues

2012-09-18 Thread Junio C Hamano
René Scharfe writes: > Windows Info-ZIP unzip > 7-Zip PeaZip builtin Linux msysgit Windows > 7-Zip 9.20 0 0 4626 43 43 > PeaZip 4.7.1 win64 0 0 4626

Re: git archive --format zip utf-8 issues

2012-09-18 Thread René Scharfe
Hello again, so two weeks have passed, and I've moved at a glacial pace towards a method for measuring the compatibility of our generated ZIP files. Sorry, I just keep getting distracted. Anyway, the idea is to have a bunch of files with names using different scripts, zip them with several pac

Re: git archive --format zip utf-8 issues

2012-09-05 Thread René Scharfe
On 04.09.2012 at 23:03, Junio C Hamano wrote: René Scharfe writes: + if (has_non_ascii(path)) { Do we want to treat \033 as "ascii" in this codepath? The function primarily is used by the log formatter to see if we need 8-bit CTE when writing out in the e-mail format. Argh, yes, I'd t

Re: git archive --format zip utf-8 issues

2012-09-04 Thread Junio C Hamano
René Scharfe writes: > But now for the patch, which is a bit confusing as well. I'm curious to > hear about results for more platforms, extractors and character classes. > Based on that we can see if we need to generate the extra fields instead > of relying on the new flag. Thanks for keeping t

Re: git archive --format zip utf-8 issues

2012-09-04 Thread René Scharfe
On 31.08.2012 at 00:26, Jeff King wrote: > Ping on this stalled discussion. Sorry, I got distracted by other stuff again. I did some experiments, though, and here's a preliminary result. > It seems like there are two separate issues here: > >1. Knowing the encoding of pathnames in the reposi

Re: git archive --format zip utf-8 issues

2012-08-30 Thread Jeff King
On Sat, Aug 11, 2012 at 11:37:05PM +0200, Sven Strickroth wrote: > On 11.08.2012 at 22:53, René Scharfe wrote: > > The standard says we need to convert to CP437, or to UTF-8, or provide > > both versions. A more interesting question is: What's supported by which > > programs? > > > > The ZIP func

Re: git archive --format zip utf-8 issues

2012-08-11 Thread Junio C Hamano
René Scharfe writes: > ... A more interesting question is: What's supported by > which programs? Yes, that is the most interesting question. >> Of course, "git archive --format=zip --path-reencode=utf8-to-latin1" >> would be the most generic way to do this. > > I really hope we can make do with

Re: git archive --format zip utf-8 issues

2012-08-11 Thread Junio C Hamano
René Scharfe writes: >> PKZIP APPNOTE seems to be the zip standard and it specifies a utf-8 >> flag: http://www.pkware.com/documents/casestudies/APPNOTE.TXT >>> A. Local file header: >>> general purpose bit flag: (2 bytes) >>> Bit 11: Language encoding flag (EFS). If this bit is >>> set, the fi
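The quoted bit 11 can be inspected from a script. The following is a small sketch, not part of git: it uses Python's standard `zipfile` module, which sets that bit when it has to store a non-ASCII member name as UTF-8, to build an archive in memory and report the flag per member. The helper names `make_zip` and `utf8_flags` are invented for this example.

```python
import io
import zipfile

UTF8_FLAG = 1 << 11  # general purpose bit 11, the language encoding flag (EFS)


def make_zip(names):
    """Build an in-memory archive with one empty member per name."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for name in names:
            zf.writestr(name, b"")
    return buf.getvalue()


def utf8_flags(zip_bytes):
    """Map each member name to whether general purpose bit 11 is set."""
    with zipfile.ZipFile(io.BytesIO(zip_bytes)) as zf:
        return {info.filename: bool(info.flag_bits & UTF8_FLAG)
                for info in zf.infolist()}


flags = utf8_flags(make_zip(["plain.txt", "öäü.txt"]))
```

Pointing `utf8_flags` at the bytes of an archive produced by `git archive --format zip` would show whether a given git build sets the flag; the result depends on the git version and is not asserted here.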

Re: git archive --format zip utf-8 issues

2012-08-11 Thread Sven Strickroth
On 11.08.2012 at 22:53, René Scharfe wrote: > The standard says we need to convert to CP437, or to UTF-8, or provide > both versions. A more interesting question is: What's supported by which > programs? > > The ZIP functionality built into Windows 7 doesn't seem to work with > UTF-8 encoded file

Re: git archive --format zip utf-8 issues

2012-08-11 Thread René Scharfe
On 11.08.2012 at 01:53, Sven Strickroth wrote: On 11.08.2012 at 00:47, Junio C Hamano wrote: Do you know in what encoding the pathnames are _expected_ to be stored in zip archives? Re-encoding to latin1 does not always work and may break double-byte encodings completely (e.g. Chinese or Japanese). PKZIP APPN

Re: git archive --format zip utf-8 issues

2012-08-11 Thread René Scharfe
On 11.08.2012 at 00:47, Junio C Hamano wrote: Sven Strickroth writes: when I create a git repository, add a file containing utf-8 characters or umlauts (like öäü.txt), commit and then export the HEAD revision to a zip archive using "git archive --format zip -o 1.zip HEAD", the zip file contains

Re: git archive --format zip utf-8 issues

2012-08-10 Thread Sven Strickroth
On 11.08.2012 at 00:47, Junio C Hamano wrote: > Do you know in what encoding the pathnames are _expected_ to be > stored in zip archives? Re-encoding to latin1 does not always work and may break double-byte encodings completely (e.g. Chinese or Japanese). PKZIP APPNOTE seems to be the zip standard and it specif

Re: git archive --format zip utf-8 issues

2012-08-10 Thread Junio C Hamano
Sven Strickroth writes: > when I create a git repository, add a file containing utf-8 characters > or umlauts (like öäü.txt), commit and then export the HEAD revision to a > zip archive using "git archive --format zip -o 1.zip HEAD", the zip file > contains incorrect filenames: My reading of arc

git archive --format zip utf-8 issues

2012-08-10 Thread Sven Strickroth
Hi, when I create a git repository, add a file containing utf-8 characters or umlauts (like öäü.txt), commit and then export the HEAD revision to a zip archive using "git archive --format zip -o 1.zip HEAD", the zip file contains incorrect filenames: $ unzip -l 1.zip Archive: 1.zip 4490a6dab1df5