-----BEGIN PGP SIGNED MESSAGE-----
Hash: SHA1

Bruno Haible wrote:
> Any volunteer wants to write a 'mbsfnmatch' function that works like fnmatch
> but supports invalid byte sequences?

(I've removed bug-tar from the Cc list but left everyone else; I hope
that's as it should be.)

Wget is in need of such a facility as well:

  http://article.gmane.org/gmane.comp.web.wget.patches/2233

Or, possibly, a "c-fnmatch" would suit our needs more. Wget is currently
locale/character set unaware; while we'd like to change that in the
future, in the meantime we need things to "work" :) ... in any case, it
could be a challenge to figure out the encoding used for remote
filenames on an FTP server.

What would be involved in writing such a facility? I might be interested
in doing so, but need a clearer picture of what it would be.

It appears, from looking at the current code, that the current
mbs-handling fnmatch() simply converts the strings to wcs format, and
then passes them to internal_fnwmatch().

One dead-simple approach would be that whenever an unrecognized byte is
found, it is simply expanded to its wide-character version. This would
end up doing the right thing if the locale is UTF-8 but the input string
is in ISO-8859-1. It would be less functional for other encodings,
including the other ISO-8859-* ones: character classification would be
munged.

OTOH, perhaps it's better not to let such characters be mapped to real
wide characters at all, so that they'll work fine for * and ?, but fail
all character-classification tests (or perhaps succeed at one specific
one we've chosen for such cases). Perhaps the WEOF value (where
available) could be used for this purpose (but care might be needed to
ensure we don't pass it on to standard library functions).

- --
Micah J. Cowan
Programmer, musician, typesetting enthusiast, gamer...
http://micah.cowan.name/
-----BEGIN PGP SIGNATURE-----
Version: GnuPG v1.4.7 (GNU/Linux)
Comment: Using GnuPG with Mozilla - http://enigmail.mozdev.org

iD8DBQFHq6Np7M8hyUobTrERAhgjAJ9jhieW0x0UccmRYLNK6LZfW37EcACeO/gN
mN8ASZrkx37VS6A2N8jHsLg=
=RZnm
-----END PGP SIGNATURE-----
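
To make the "dead-simple approach" a bit more concrete, here is a rough sketch of the conversion step only. It assumes the standard mbrtowc() interface; the name lenient_mbstowcs() is made up for illustration and is not an existing gnulib or glibc function. Any byte that mbrtowc() rejects (or that starts a truncated sequence) is widened as-is instead of aborting the conversion.

```c
#include <stddef.h>
#include <string.h>
#include <wchar.h>

/* Hypothetical helper (not part of gnulib or glibc): convert SRC to wide
   characters in the current locale, storing at most DSTLEN - 1 of them
   plus a terminating L'\0' in DST.  Bytes that do not form a valid
   multibyte character are expanded to their wide-character value rather
   than causing a failure.  Returns the number of wide characters stored,
   not counting the terminator.  */
size_t
lenient_mbstowcs (wchar_t *dst, size_t dstlen, const char *src)
{
  mbstate_t state;
  size_t remaining = strlen (src);
  size_t n = 0;

  if (dstlen == 0)
    return 0;
  memset (&state, 0, sizeof state);

  while (remaining > 0 && n + 1 < dstlen)
    {
      wchar_t wc;
      size_t len = mbrtowc (&wc, src, remaining, &state);

      if (len == (size_t) -1 || len == (size_t) -2)
        {
          /* Invalid or incomplete sequence: pass the single byte through
             and reset the conversion state, which is unspecified after
             an error.  */
          wc = (wchar_t) (unsigned char) *src;
          len = 1;
          memset (&state, 0, sizeof state);
        }
      else if (len == 0)
        break;                  /* decoded a null wide character */

      dst[n++] = wc;
      src += len;
      remaining -= len;
    }

  dst[n] = L'\0';
  return n;
}
```

The resulting wide strings could then be handed to a wide-character matcher in place of the strict conversion. For the second idea (keeping such bytes away from real wide characters), the fallback branch would store a reserved marker value instead of the widened byte, and the matcher would have to special-case that marker in its character-class tests.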