Charles Plessy <ple...@debian.org> writes:

> The problem here is that I have no comprehensive information on how
> softwares use the mime.types files.  I can not rule out that some use
> case sensitivity for their own good reasons, so if no other bug arise, I
> would like to continue to stick to the information provided by the IANA.
>
>> Those entries will behave differently for "see" and Python's
>> "mimetypes.guess_type()".  For instance "see" will consider "foo.sar"
>> as application/vnd.sar, but "mimetypes.guess_type()" will not.
>
> I can not tell which approach is wiser...  The mime.types file is not
> comprehensive and is usually lagging.  What if there is another file
> format around that uses the lowercase `sar` extension?

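For reference, the Python half of the "foo.sar" example quoted above can
be reproduced with a small sketch.  It assumes a one-line stand-in for
/etc/mime.types that lists the extension in uppercase only, and that
Python's built-in defaults have no entry for ".sar"; the names SAMPLE and
db are just for this illustration.

```python
import io
import mimetypes

# One-line stand-in for /etc/mime.types: the extension appears in
# uppercase only.
SAMPLE = "application/vnd.sar SAR\n"

# A private MimeTypes instance, loaded with the sample line on top of
# whatever defaults the instance starts with (assumption: none of those
# defaults maps ".sar").
db = mimetypes.MimeTypes()
db.readfp(io.StringIO(SAMPLE))

# guess_type() returns a (type, encoding) pair.
sar_type, _ = db.guess_type("foo.sar")
print("foo.sar ->", sar_type)
```

The lowercase name is not matched, because the table only holds the
uppercase ".SAR" key.
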
I've looked at four other implementations to see whether they behave
differently.

Emacs assumes that /etc/mime.types contains only lowercase extensions.
When (mailcap-extension-to-mime ext) is called, ext is downcased
before being compared to the extensions in /etc/mime.types.  So it
cannot match extensions like SAR that are listed only in uppercase in
/etc/mime.types.

  (mailcap-extension-to-mime "JPG") => "image/jpeg"
  (mailcap-extension-to-mime "jpg") => "image/jpeg"
  (mailcap-extension-to-mime "SAR") => nil
  (mailcap-extension-to-mime "sar") => nil

Apache 2.4's mod_mime converts extensions to lowercase when reading
them from /etc/mime.types and before looking them up.  So it has the
same case-insensitive behavior as the "see" command mentioned in my
previous mail.

Go (https://golang.org/src/mime/type.go) keeps two maps: one where
extensions are stored as they appear in /etc/mime.types, and one where
they are lowercased.  Lookup is done case-sensitively first, then
case-insensitively.  That's a new behavior compared to the other tools.

Mutt does a case-insensitive comparison of each extension in
/etc/mime.types against the end of the filename being checked.  So
regarding case sensitivity it behaves like Python and see.  However,
contrary to Python and see, Mutt is able to deal with extensions that
contain a dot in mime.types.  (The only one listed in /etc/mime.types
is "pcf.Z"; it seems to work in Python and see only because these
implementations have some hard-coded handling of .Z, .gz, and similar
extensions, so they actually do a lookup for "pcf", which has the same
MIME type in /etc/mime.types.)  In case of multiple matches, Mutt
keeps the longest one.

>> It would be nice to clarify the semantics in the comments at the top of
>> mime.types.
>
> Definitely!  I hope to do so or write a proper man page after I dig the
> history of that file.

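As input for such a clarification, the four lookup rules described
above can be summarised in a short sketch.  This is my reading of the
behaviors, not code taken from any of the tools: the table is a toy
stand-in for /etc/mime.types (the type given for pcf is a placeholder),
all function names are made up for the illustration, and the hard-coded
.Z/.gz handling of Python and see is left out.

```python
# Toy stand-in for /etc/mime.types: "SAR" is listed only in uppercase,
# and "pcf.Z" is an extension that itself contains a dot.
MIME_TYPES = """\
application/vnd.sar     SAR
image/jpeg              jpeg jpg jpe
application/x-font-pcf  pcf pcf.Z
"""

def parse(text):
    """Return {extension: type}, keeping extensions exactly as written."""
    table = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()
        for ext in fields[1:]:
            table[ext] = fields[0]
    return table

TABLE = parse(MIME_TYPES)
LOWERED = {ext.lower(): mtype for ext, mtype in TABLE.items()}

def last_ext(filename):
    """Text after the last dot, or '' if the name has no dot."""
    return filename.rpartition(".")[2] if "." in filename else ""

def emacs_like(filename):
    # Only the query is downcased; the table is used as written.
    return TABLE.get(last_ext(filename).lower())

def apache_like(filename):
    # Both the table and the query are lowercased.
    return LOWERED.get(last_ext(filename).lower())

def go_like(filename):
    # Exact match first, then a second try in the lowercased map.
    ext = last_ext(filename)
    return TABLE.get(ext) or LOWERED.get(ext.lower())

def mutt_like(filename):
    # Case-insensitive suffix match against every entry; longest wins.
    name = filename.lower()
    hits = [ext for ext in TABLE if name.endswith("." + ext.lower())]
    return TABLE[max(hits, key=len)] if hits else None

if __name__ == "__main__":
    for name in ("foo.JPG", "foo.SAR", "foo.sar", "font.pcf.Z"):
        print(name, emacs_like(name), apache_like(name),
              go_like(name), mutt_like(name))
```

Only the suffix-matching variant can find "pcf.Z", since the other three
look at nothing but the text after the last dot.
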
On that topic, the comment at the top of the file:

  # Users can add their own types if they wish by creating a ".mime.types"
  # file in their home directory.  Definitions included there will take
  # precedence over those listed here.

should probably be rephrased to suggest that this is how applications
are expected to work, but that not all of them do.  For instance
Python and Go won't look at ~/.mime.types.

-- 
Alexandre Duret-Lutz