Charles Plessy <ple...@debian.org> writes:

> The problem here is that I have no comprehensive information on how
> softwares use the mime.types files.  I can not rule out that some use
> case sensitivity for their own good reasons, so if no other bug arise, I
> would like to continue to stick to the information provided by the IANA.
>
>> Those entries will behave differently for "see" and Python's
>> "mimetypes.guess_type()".   For instance "see" will consider "foo.sar"
>> as application/vnd.sar, but "mimetypes.guess_type()" will not.
>
> I can not tell which approach is wiser...  The mime.types file is not
> comprehensive and is usually lagging.  What if there is another file
> format around that uses the lowercase `sar` extension?


I've looked at four other implementations to see whether any of them
behaves differently.


Emacs assumes that /etc/mime.types contains only lowercase extensions.
When (mailcap-extension-to-mime ext) is called, ext is downcased before
being compared to the extensions in /etc/mime.types.  As a result, it
cannot match extensions such as SAR that are listed only in uppercase
in /etc/mime.types.

(mailcap-extension-to-mime "JPG") => "image/jpeg"
(mailcap-extension-to-mime "jpg") => "image/jpeg"
(mailcap-extension-to-mime "SAR") => nil
(mailcap-extension-to-mime "sar") => nil
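
To make the mechanism concrete, here is a rough Python sketch of that
lookup strategy.  It is not Emacs's code: the two-line SAMPLE excerpt
and the function names are made up for illustration.  Only the query
is lowercased, so an uppercase-only key can never match.

```python
# Illustrative excerpt in mime.types syntax (not the real file).
SAMPLE = """\
image/jpeg jpeg jpg jpe
application/vnd.sar SAR
"""

def parse_mime_types(text):
    """Map each extension, exactly as written, to its MIME type."""
    table = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments
        if len(fields) < 2:
            continue  # blank line, or a type with no extension
        mime_type, *extensions = fields
        for ext in extensions:
            table.setdefault(ext, mime_type)  # first entry wins (assumption)
    return table

def lookup_downcase_query(table, ext):
    # Only the query is lowercased; the table keys keep their case.
    return table.get(ext.lower())

TABLE = parse_mime_types(SAMPLE)
for ext in ("JPG", "jpg", "SAR", "sar"):
    print(ext, "=>", lookup_downcase_query(TABLE, ext))
```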


Apache 2.4's mod_mime converts extensions to lowercase when reading them
from /etc/mime.types and again before looking them up.  So it has the
same case-insensitive behavior as the "see" command mentioned in my
previous mail.
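
A minimal Python sketch of this "lowercase on both sides" strategy
(not mod_mime's code; the sample data and function names are
hypothetical, and how duplicates are resolved is my own arbitrary
choice):

```python
# Illustrative excerpt in mime.types syntax (not the real file).
SAMPLE = """\
image/jpeg jpeg jpg jpe
application/vnd.sar SAR
"""

def load_lowercased(text):
    """Build a table whose keys are all lowercase."""
    table = {}
    for line in text.splitlines():
        fields = line.split("#", 1)[0].split()  # drop comments
        if len(fields) < 2:
            continue
        for ext in fields[1:]:
            # Later entries overwrite earlier ones: an assumption of
            # this sketch, not a documented rule.
            table[ext.lower()] = fields[0]
    return table

def lookup_case_insensitive(table, ext):
    # The query is lowercased too, so case never matters.
    return table.get(ext.lower())

TABLE = load_lowercased(SAMPLE)
for ext in ("SAR", "sar", "Jpg"):
    print(ext, "=>", lookup_case_insensitive(TABLE, ext))
```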


Go (https://golang.org/src/mime/type.go) keeps two maps, one where all
extensions are stored as they are in /etc/mime.types, one where they
are lowercased.  Lookup is done case-sensitive first, then
case-insensitive.  That's a new behavior, compared to other tools.
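
The two-map idea can be sketched in Python as follows.  This only
mimics the strategy described above and is not a translation of the
Go source; the class name and the "application/x-hypothetical-sar"
type are invented for the example.

```python
class TwoLevelTable:
    """Exact-case lookup first, lowercase fallback second."""

    def __init__(self):
        self.exact = {}  # extensions as written in the file
        self.lower = {}  # the same extensions, lowercased

    def add(self, ext, mime_type):
        self.exact[ext] = mime_type
        self.lower[ext.lower()] = mime_type

    def lookup(self, ext):
        if ext in self.exact:  # case-sensitive match wins
            return self.exact[ext]
        return self.lower.get(ext.lower())  # case-insensitive fallback

# With only the uppercase entry, "sar" is found through the fallback.
only_upper = TwoLevelTable()
only_upper.add("SAR", "application/vnd.sar")
print(only_upper.lookup("sar"))

# If a lowercase "sar" format were registered too (hypothetical type),
# both spellings would resolve to their own entries.
both = TwoLevelTable()
both.add("SAR", "application/vnd.sar")
both.add("sar", "application/x-hypothetical-sar")
print(both.lookup("SAR"), both.lookup("sar"))
```

This is the only one of the strategies that would let a distinct
lowercase `sar` format coexist with the uppercase entry.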


Mutt does a case-insensitive comparison of each extension in
/etc/mime.types against the end of the filename to check.  So regarding
case sensitivity it behaves like Python and see.

However, contrary to Python and see, Mutt is able to deal with
extensions that contain a dot in mime.types.  (The only one
listed in /etc/mime.types is "pcf.Z", and it seems to work in Python
and see only because these implementations have some hard-coded
handling of .Z, .gz, and other similar extensions, so they actually
do a lookup for "pcf", which has the same mime type in /etc/mime.types.)
In case of multiple matches, Mutt keeps the longest one.
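
Here is a rough Python sketch of suffix matching with a
longest-match rule.  It is not Mutt's code: the entries and their
"application/x-example-*" types are hypothetical (chosen to differ so
the tie-break is visible), and requiring a "." before the extension
is my assumption.

```python
# (extension, type) pairs; the types are made up for this example.
ENTRIES = [
    ("Z", "application/x-example-compressed"),
    ("pcf.Z", "application/x-example-font"),
    ("jpg", "image/jpeg"),
]

def guess_by_suffix(entries, filename):
    """Return the type of the longest extension matching the filename's end."""
    name = filename.lower()
    best_ext, best_type = "", None
    for ext, mime_type in entries:
        # Case-insensitive comparison against the end of the filename.
        if name.endswith("." + ext.lower()) and len(ext) > len(best_ext):
            best_ext, best_type = ext, mime_type
    return best_type

for filename in ("fixed.PCF.z", "archive.Z", "photo.JPG", "notes.txt"):
    print(filename, "=>", guess_by_suffix(ENTRIES, filename))
```

Because the whole suffix is compared, "pcf.Z" needs no special
handling of compression extensions.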

>> It would be nice to clarify the semantics in the comments at the top of
>> mime.types.
>
> Definitely! I hope to do so or write a proper man page after I dig the
> history of that file.

On that topic, the comment at the top of the file:

#  Users can add their own types if they wish by creating a ".mime.types"
#  file in their home directory.  Definitions included there will take
#  precedence over those listed here.

should probably be rephrased to suggest that this is how applications
are expected to work, but that not all of them do.  For instance
Python and Go won't look at this file.
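
An application that wants the documented precedence has to implement
it itself.  As a sketch only, a Python program could layer a user
file over the system one with the standard mimetypes.MimeTypes class;
the helper name and the in-memory sample data below are hypothetical
stand-ins for /etc/mime.types and ~/.mime.types.

```python
import io
import mimetypes

def load_with_user_overrides(system_fp, user_fp=None):
    """Read the system file first, then the user file on top of it."""
    db = mimetypes.MimeTypes()
    db.readfp(system_fp)
    if user_fp is not None:
        # Entries read later replace earlier ones for the same
        # extension, so the user's definitions take precedence.
        db.readfp(user_fp)
    return db

# In-memory stand-ins for the two files (made-up types).
system_file = io.StringIO("text/x-example-system foo\n")
user_file = io.StringIO("text/x-example-user foo\n")

db = load_with_user_overrides(system_file, user_file)
print(db.guess_type("report.foo")[0])
```

In real use the two arguments would be opened files, with the user
file skipped when it does not exist.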

-- 
Alexandre Duret-Lutz
