On Fri, 14 Jan 2005 21:02:21 +0100, Marc Lehmann writes:
>   At least one of your files (...) contains a character (.) which
>   is not allowed inside an IMG SRC tag. See the official URI syntax specs at:
>   http://http://www.ietf.org/rfc/rfc2396.txt
>   section 2.4.3. URIs may not contain delimiters such as <, >, #, %, " or
>   white space. iGal can rename all your files to suppress or replace these
>   characters.
>
>The author would be well-advised to actually read the rfc that is being
>referenced, as the very same rfc explains how to encode unsafe
>characters.

your snooty report offends. your holier-than-thou affectation *might* be
acceptable if you had provided a patch for the problem. i see no patch here.

>The message
>is confusing because a program limitation (igal cannot correctly encode uris)
>is misinterpreted as a principal limitation.

yes and no. for ascii-only urls, you're right: %-encoding things
like <, >, # etc. is safe. i'll add code for escaping these to igal.

but if your url was iso-8859-1 and outside ascii (eg. äöüß etc), then
you're wrong: there is a fundamental limitation that makes any character
encoding in urls a tricky endeavour, as urls don't transport their own
character encoding information.

http://www.w3.org/TR/html40/appendix/notes.html#h-B.2.1
*suggests* conversion to utf-8 followed by %-encoding for urls, and
mentions the older practice of using just iso-8859-1 and its %-encoding.

not all webservers distinguish properly between these two cases, and
the mechanism also depends on the web server's filesystem (whether it wants
to see iso-8859-1 filenames or whether unicode is expected).

>It is true that IMG SRC cannot contain spaces (For example), but this
>does in no way mean that image filenames were at fault.

this is silly. the image filename is unrepresentable -> the filename
poses the problem. igal confronts you with a problem report.

"blödian.jpg" can be represented as "bl%F6dian.jpg" or "bl%C3%B6dian.jpg".
both are legal, both are possible, both have been or are in use out there,
and either of them will work or fail in a specific situation. for example,
apache on my debian box likes the first and doesn't grok the latter.

so, which of the two evils do you want igal to choose?

i think that suggesting the safe course (ie. to avoid the charset
trouble) to the user is actually a reasonable approach.

having said that, i'll think about it a bit more and maybe add both common
encodings to the list of choices.

regards
az

-- 
+ Alexander Zangerl + DSA 42BD645D + (RSA 5B586291)
Rex is to Regina as Vax is to...
-- Vadim Vygonets
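
for reference, the two %-encodings of "blödian.jpg" discussed above can be
reproduced with a few lines of python. this is only a minimal sketch, not
igal's code: the helper name encode_both is made up for illustration, and it
assumes the filename is a single path segment (so "/" gets escaped as well).

```python
from urllib.parse import quote, unquote


def encode_both(filename):
    """Return (iso-8859-1, utf-8) %-encodings of one path segment.

    Hypothetical helper for illustration; safe="" also escapes "/".
    """
    latin1 = quote(filename, safe="", encoding="iso-8859-1")
    utf8 = quote(filename, safe="", encoding="utf-8")
    return latin1, utf8


if __name__ == "__main__":
    latin1, utf8 = encode_both("blödian.jpg")
    print(latin1)  # bl%F6dian.jpg
    print(utf8)    # bl%C3%B6dian.jpg

    # ascii-only delimiters are unambiguous: one encoding, always safe
    print(quote("my pic<1>#.jpg", safe=""))  # my%20pic%3C1%3E%23.jpg

    # the url carries no charset information, so a receiver that guesses
    # wrong decodes the utf-8 form into mojibake
    print(unquote(utf8, encoding="iso-8859-1"))
```

the last line shows the problem in miniature: nothing in the url itself tells
the server which of the two decodings was intended.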