tags 436566 + upstream
thanks

On Wednesday, 08.08.2007 at 10:57 +0200, Thomas de Grivel wrote:
Hello Thomas,

> UTF-8 multi-byte characters are not correctly encoded into URLs. These
> characters are for example vowels with accents, and thus appear very
> frequently in european languages like French (which is my own).
>
> Although UTF-8 encoded web pages are not widespread yet, I believe it is
> a good practice to encourage unicode. Here is an example website which
> fails with lftp :
>
> > $ lftp http://files.iai.heig-vd.ch/Enseignement/ <<EOF
> > cd Supports%20de%20cours/Acquisition\ de\ données\ \&\ CEM/
> > EOF
>
> Here is the output I get :
>
> > $ lftp http://files.iai.heig-vd.ch/Enseignement/Supports%20de%20cours/
> > cd ok, cwd=/Enseignement/Supports de cours
> > lftp files.iai.heig-vd.ch:/Enseignement/Supports de cours>
> > cd Acquisition\ de\ données\ \&\ CEM/
> > cd: Access failed: 404 Not Found (/Enseignement/Supports de
> > cours/Acquisition de données & CEM)
> > lftp files.iai.heig-vd.ch:/Enseignement/Supports de cours> exit
>
> I wrote a naïve patch to url-encode some of these characters and it
> seems to work for the example page, but it still misses most UTF-8
> characters. While I figure out how to do it correctly maybe you can
> point me to some relevant information or to upstream coders which
> would be interested ?

I forwarded your report, together with the patch, to the upstream mailing
list, which the lftp author reads frequently.

[EMAIL PROTECTED]

I set the Reply-To to this bug, the lftp mailing list and you, so you
should get the answer.

> Oh and thank you for maintaining this =)

:) Thanks for your contribution.

-- 
Noèl Köthe <noel debian.org>
Debian GNU/Linux, www.debian.org
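
For reference, the encoding the report asks for can be sketched in a few lines. This is only an illustration in Python, not lftp's code and not the submitted patch; the helper name `encode_path_segment` is made up here, and it assumes the server expects the path as percent-encoded UTF-8 bytes, which the report suggests but does not confirm.

```python
from urllib.parse import quote


def encode_path_segment(segment: str) -> str:
    """Percent-encode one URL path segment (hypothetical helper).

    The text is converted to UTF-8 and every byte outside the unreserved
    set (letters, digits, "-", ".", "_", "~") is written as %XX, so a
    multi-byte character such as "é" becomes two escapes.
    """
    # safe="" makes quote() escape "/" and "&" too, which is what we
    # want for a single directory name rather than a whole path.
    return quote(segment, safe="", encoding="utf-8")


# Directory name taken from the failing example in the report.
print(encode_path_segment("Acquisition de données & CEM"))
# prints: Acquisition%20de%20donn%C3%A9es%20%26%20CEM
```

Encoding each segment separately keeps the "/" separators of the full path intact while still escaping any "/" or "&" that appears inside a name.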