-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Karl Berry wrote:
> | "C escapes" means to use the backslash character as escape character.
> | This is a particularly bad choice, because - as you know - on some
> | systems, backslashes are used as directory separator.
>
> That hardly seems an insurmountable objection to me, since
> (1) many such paths will not have any special characters (such as : or a
> control character), and therefore will not be quoted, and therefore the
> \'s will just appear as-is, and
> (2) for those paths which are quoted, the \'s just get doubled. Big deal.
>
> Using \ is soooooo conventional in these situations. I'd find it very
> strange for the coding standards to recommend uri-style % escapes. Of
> course it will suffice, any quoting mechanism will suffice, what is
> *natural* for compiler-like error messages in our world? \. IMHO.

Heartily agreed.

Bruno Haible:
> The thing that you want to quote are filenames and URLs. Filenames and
> URLs are special cases of URIs. The syntax of URIs is defined in
> RFC 3986 [1]. It uses the percent character as escape character.

I don't believe it's accurate to claim that _filenames_ are special
cases of URIs (RFC 3986 §2.5 refers to filenames as "local names", which
may need transformation in order to be represented as URIs). Certainly,
pathnames aren't, as URIs always use forward slashes to separate URI
components (in which case we'd have no need to worry about backslashes
any longer).

> The use of URI syntax rather than backslash-escaping is also more
> understandable to humans, because all users who use a browser
> eventually see a percent-escaped URL in the main text field. Whereas
> backslash-escaping is known only to programmers, a small minority
> among the users.

I couldn't agree less. I believe there are many more people who will
grok something like "\"My\"\ File" than people who can make sense of
"%22My%22%20File", especially if they have some idea of what the naked
filename looked like.

C escapes have the distinct advantage over URI percent-encoding that
most escaped characters still give a direct expression of what the
original character was: " -> \". The only characters that end up
hex-encoded or octal-encoded are ones that wouldn't have been readable
in the first place (and even then, not all of those: NL -> \n, FF -> \f).

There's also the fact that URI percent-encoding is context-sensitive; in
particular, %2F and / do not have the same meaning (the first would be
used for a literal / _within_ a path component; the latter to separate
path components); and a question mark within the path would need to be
encoded, whereas the same character delimiting, or appearing within, a
query string should not be.

All that being said, I think representing URLs and other URIs in
anything other than percent-encoding is begging for major confusion.
That _is_ the accepted mechanism for encoding characters within URIs.
So routines that know that their arguments are URIs ought to
percent-encode them (but then, of course, they should already _be_
properly encoded, else they're not valid URIs). Routines that know only
that it's a "filename", or that know nothing at all, are much better off
IMO using some other quoting mechanism, and C-style quoting seems an
awfully good choice (in no small part because, as Karl says, it's
"soooooo conventional.").

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/

-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHtN6B7M8hyUobTrERAltpAJ4uste1DmgUw8GnJC0aIk8T5+w/SwCfX0uV
CyYXtt/F4JUq/Bye3ZvCflM=
=sptW
-----END PGP SIGNATURE-----
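
The contrast between C-style quoting and percent-encoding discussed in
this message can be sketched in a few lines of Python. This is only an
illustration under stated assumptions: c_quote and its _NAMED table are
hypothetical names invented for this sketch (they are not gnulib's or
any other project's API), and the set of characters given a named
escape is likewise an assumption. The percent-encoding side uses the
standard library's urllib.parse.quote.

```python
from urllib.parse import quote

# Hypothetical table for this sketch: characters that get a readable,
# named C escape instead of a numeric one.
_NAMED = {
    "\\": "\\\\",
    '"': '\\"',
    "\n": "\\n",
    "\t": "\\t",
    "\f": "\\f",
    "\r": "\\r",
}


def c_quote(name: str) -> str:
    """Return `name` in double quotes with C-style backslash escapes.

    Assumption: the name is escaped byte by byte as UTF-8, so anything
    outside printable ASCII becomes a three-digit octal escape.
    """
    out = []
    for b in name.encode("utf-8"):
        ch = chr(b)
        if ch in _NAMED:
            out.append(_NAMED[ch])
        elif 0x20 <= b < 0x7F:
            out.append(ch)  # printable ASCII stays readable as-is
        else:
            out.append("\\%03o" % b)  # e.g. byte 0x01 -> \001
    return '"' + "".join(out) + '"'


# The filename from the message, quoted both ways.
print(c_quote('"My" File'))  # prints "\"My\" File"
print(quote('"My" File'))    # prints %22My%22%20File

# Backslashes in a quoted path simply get doubled.
print(c_quote("C:\\tmp\\x"))  # prints "C:\\tmp\\x"

# Percent-encoding is context-sensitive: a literal slash inside one
# path component must become %2F, while separators stay as "/".
segments = ["dir", "a/b"]
print("/".join(quote(seg, safe="") for seg in segments))  # prints dir/a%2Fb

# A question mark inside a path has to be encoded as well.
print(quote("what?.txt"))  # prints what%3F.txt
```

Note that c_quote leaves the space alone, unlike the "\"My\"\ File"
example above, which escapes it too; under either convention the
original characters remain visible in the quoted form.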