Re: [PATCH/RFC] Gitweb: Convert UTF-8 encoded file names

2014-05-27 Thread Jakub Narębski
On 2014-05-16 19:05, Junio C Hamano wrote: > Jakub Narębski writes: > >>> Correct, but is "where does it appear" the question we are >>> primarily interested in, wrt this breakage and its fix? >> >> That of course depends on how we want to test gitweb output. >> The simplest solution, compari

Re: [PATCH/RFC] Gitweb: Convert UTF-8 encoded file names

2014-05-16 Thread Junio C Hamano
Jakub Narębski writes: >> Correct, but is "where does it appear" the question we are >> primarily interested in, wrt this breakage and its fix? > > That of course depends on how we want to test gitweb output. > The simplest solution, comparing with known output with perhaps > fragile / variable e

Re: [PATCH/RFC] Gitweb: Convert UTF-8 encoded file names

2014-05-16 Thread Junio C Hamano
(sorry if you receive a dup; pobox.com seems to be constipated right now) Jakub Narębski writes: >> Correct, but is "where does it appear" the question we are >> primarily interested in, wrt this breakage and its fix? > > That of course depends on how we want to test gitweb output. > The simples

Re: [PATCH/RFC] Gitweb: Convert UTF-8 encoded file names

2014-05-16 Thread Jakub Narębski
On Fri, May 16, 2014 at 3:26 AM, Junio C Hamano wrote: > Jakub Narębski writes: >> On Thu, May 15, 2014 at 9:38 PM, Junio C Hamano wrote: >>> Jakub Narębski writes: >>> Writing test for this would not be easy, and require some HTML parser (WWW::Mechanize, Web::Scraper, HTML::Query, pQ

Re: [PATCH/RFC] Gitweb: Convert UTF-8 encoded file names

2014-05-15 Thread Junio C Hamano
Jakub Narębski writes: > On Thu, May 15, 2014 at 9:38 PM, Junio C Hamano wrote: >> Jakub Narębski writes: >> >>> Writing test for this would not be easy, and require some HTML >>> parser (WWW::Mechanize, Web::Scraper, HTML::Query, pQuery, >>> ... or low level HTML::TreeBuilder, or other low lev

Re: [PATCH/RFC] Gitweb: Convert UTF-8 encoded file names

2014-05-15 Thread Jakub Narębski
On Thu, May 15, 2014 at 9:38 PM, Junio C Hamano wrote: > Jakub Narębski writes: > >> Writing test for this would not be easy, and require some HTML >> parser (WWW::Mechanize, Web::Scraper, HTML::Query, pQuery, >> ... or low level HTML::TreeBuilder, or other low level parser). > > Hmph. Is it mor

Re: [PATCH/RFC] Gitweb: Convert UTF-8 encoded file names

2014-05-15 Thread Junio C Hamano
Jakub Narębski writes: > Writing test for this would not be easy, and require some HTML > parser (WWW::Mechanize, Web::Scraper, HTML::Query, pQuery, > ... or low level HTML::TreeBuilder, or other low level parser). Hmph. Is it more than just looking for a specific run of %xx we would expect to
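
The "look for a specific run of %xx" idea can be sketched without any HTML parser. The following is only an illustration in Python, not gitweb's Perl and not the test that was eventually proposed; the helper name `link_is_singly_encoded` and the sample `href` are hypothetical. It checks that the page contains the singly encoded escape sequence and not the doubly encoded one.

```python
from urllib.parse import quote


def link_is_singly_encoded(html: str, filename: str) -> bool:
    """Return True if html contains the correct %xx run for filename.

    Hypothetical helper: a plain substring check, no HTML parsing.
    """
    utf8 = filename.encode("utf-8")
    good = quote(utf8)
    # What a double encoder would emit: the UTF-8 bytes misread as
    # iso-8859-1 and then UTF-8-encoded a second time.
    bad = quote(utf8.decode("iso-8859-1").encode("utf-8"))
    # For pure ASCII names both forms are identical, so only reject
    # the "bad" form when it actually differs from the good one.
    return good in html and (bad == good or bad not in html)


# Hypothetical page fragments, modeled on the query string in this thread.
fixed_page = '<a href="?p=notes.git;a=blob_plain;f=work/G%C3%BCtekriterien.txt;hb=HEAD">'
broken_page = '<a href="?p=notes.git;a=blob_plain;f=work/G%C3%83%C2%BCtekriterien.txt;hb=HEAD">'

print(link_is_singly_encoded(fixed_page, "work/G\u00fctekriterien.txt"))
print(link_is_singly_encoded(broken_page, "work/G\u00fctekriterien.txt"))
```

A substring check like this does not say where in the page the link appears, which is the trade-off discussed in the replies above.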

Re: [PATCH/RFC] Gitweb: Convert UTF-8 encoded file names

2014-05-15 Thread Jakub Narębski
On Thu, May 15, 2014 at 9:28 PM, Jakub Narębski wrote: > On Thu, May 15, 2014 at 8:48 PM, Michael Wagner wrote: [...] >> The subroutine "git tree" generates the tree view. It stores the output >> of "git ls-tree -z ..." in an array named "@entries". Printing the content >> of this array yields th

Re: [PATCH/RFC] Gitweb: Convert UTF-8 encoded file names

2014-05-15 Thread Jakub Narębski
On Thu, May 15, 2014 at 8:48 PM, Michael Wagner wrote: > On Thu, May 15, 2014 at 10:04:24AM +0100, Peter Krefting wrote: >> Michael Wagner: >> >>>Decoding the UTF-8 encoded file name (again with an additional print >>>statement): >>> >>>$ REQUEST_METHOD=GET >>>QUERY_STRING='p=notes.git;a=blob_pla

Re: [PATCH/RFC] Gitweb: Convert UTF-8 encoded file names

2014-05-15 Thread Michael Wagner
On Thu, May 15, 2014 at 10:04:24AM +0100, Peter Krefting wrote: > Michael Wagner: > > >Decoding the UTF-8 encoded file name (again with an additional print > >statement): > > > >$ REQUEST_METHOD=GET > >QUERY_STRING='p=notes.git;a=blob_plain;f=work/G%C3%83%C2%BCtekriterien.txt;hb=HEAD' > > ./gitwe

Re: [PATCH/RFC] Gitweb: Convert UTF-8 encoded file names

2014-05-15 Thread Junio C Hamano
Peter Krefting writes: > What is happening is that whatever is generating the URI is > UTF-8-encoding the string twice (i.e., it generates a string with the > proper C3 BC in it, and then interprets it as iso-8859-1 data and runs > that through a UTF-8 encoder again, yielding the C3 83 C2 BC sequ
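
The byte sequences named in this explanation can be reproduced in a few lines. This is a minimal sketch in Python rather than gitweb's Perl, and the function name `double_encode` is hypothetical; it only demonstrates the mechanism described (UTF-8 bytes misread as iso-8859-1 and encoded again), not the code path in gitweb that causes it.

```python
from urllib.parse import quote


def double_encode(text: str) -> bytes:
    """Encode text as UTF-8, misread the bytes as iso-8859-1, encode again."""
    once = text.encode("utf-8")
    return once.decode("iso-8859-1").encode("utf-8")


name = "work/G\u00fctekriterien.txt"  # the file name from this thread

# Correct single encoding: U+00FC becomes the bytes C3 BC.
print(quote(name.encode("utf-8")))  # work/G%C3%BCtekriterien.txt

# Double encoding: C3 and BC are each treated as a character and
# encoded again, giving C3 83 C2 BC.
print(quote(double_encode(name)))   # work/G%C3%83%C2%BCtekriterien.txt
```

The second line of output matches the `f=` parameter in the QUERY_STRING reported earlier in the thread.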

Re: [PATCH/RFC] Gitweb: Convert UTF-8 encoded file names

2014-05-15 Thread Jakub Narębski
On Thu, May 15, 2014 at 7:08 AM, Michael Wagner wrote: > On Thu, May 15, 2014 at 12:25:45AM +0200, Jakub Narębski wrote: >> On Wed, May 14, 2014 at 11:57 PM, Junio C Hamano wrote: >>> Michael Wagner writes: >>> Perl has an internal encoding used to store text strings. Currently, tryin

Re: [PATCH/RFC] Gitweb: Convert UTF-8 encoded file names

2014-05-15 Thread Peter Krefting
Michael Wagner: Decoding the UTF-8 encoded file name (again with an additional print statement): $ REQUEST_METHOD=GET QUERY_STRING='p=notes.git;a=blob_plain;f=work/G%C3%83%C2%BCtekriterien.txt;hb=HEAD' ./gitweb.cgi work/Gütekriterien.txt Content-disposition: inline; filename="work/Gütekriter

Re: [PATCH/RFC] Gitweb: Convert UTF-8 encoded file names

2014-05-14 Thread Michael Wagner
On Thu, May 15, 2014 at 12:25:45AM +0200, Jakub Narębski wrote: > On Wed, May 14, 2014 at 11:57 PM, Junio C Hamano wrote: > > Michael Wagner writes: > > > >> Perl has an internal encoding used to store text strings. Currently, > >> trying to > >> view files with UTF-8 encoded names results in an

Re: [PATCH/RFC] Gitweb: Convert UTF-8 encoded file names

2014-05-14 Thread Jakub Narębski
On Wed, May 14, 2014 at 11:57 PM, Junio C Hamano wrote: > Michael Wagner writes: > >> Perl has an internal encoding used to store text strings. Currently, trying >> to >> view files with UTF-8 encoded names results in an error (either "404 - Cannot >> find file" [blob_plain] or "XML Parsing Erro

Re: [PATCH/RFC] Gitweb: Convert UTF-8 encoded file names

2014-05-14 Thread Junio C Hamano
Michael Wagner writes: > Perl has an internal encoding used to store text strings. Currently, trying to > view files with UTF-8 encoded names results in an error (either "404 - Cannot > find file" [blob_plain] or "XML Parsing Error" [blob]). Converting these UTF-8 > encoded file names into Perl's

[PATCH/RFC] Gitweb: Convert UTF-8 encoded file names

2014-05-14 Thread Michael Wagner
Perl has an internal encoding used to store text strings. Currently, trying to view files with UTF-8 encoded names results in an error (either "404 - Cannot find file" [blob_plain] or "XML Parsing Error" [blob]). Converting these UTF-8 encoded file names into Perl's internal format resolves these e