On Wed, Aug 18, 2010 at 11:38:42AM +0200, Brian White wrote:
> > I will systematically prove that (depending on semantics) either this
> > is outright false, or that the term "encoding" has been
> > misappropriated and, in the context of MIME, excluding these on the
> > basis that they are "encodings" is completely inappropriate.
>
> At one point (the point at which they were removed from this file),
> including them would break Apache.

That's probably an Apache bug, but I think you're confusing the issue.
MIME standards apply to e-mail, not HTTP, and as such Apache is
irrelevant to this discussion. HTTP has its own set of standards
which are MIME-like, but are separate and different. There have been
issues caused by content encodings configured *in Apache's own config
file*, like this one:

  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=565626

However, this is for HTTP, not MIME (i.e. e-mail). The MIME standards
have no such concept, and many e-mail clients have no means to deal
with this because there is no *MIME* standard that describes it.

But I don't see why adding these MIME types -- which many other OSes
do include in their system-wide mime.types file, including many other
Linux distros -- should break Apache. AFAIK it does not use
mime.types, and relies on its own configuration files to determine
MIME type handling. So again, I think you're confusing the issue.

> > A file that has been gzipped contains data, and is a particular
> > type of file: a compressed archive.
>
> Gzip, unlike tar or zip, is not an archive. It is one file.

The fact that it contains only one file in no way makes it not an
archive. Historically, the primary purpose of data compression was to
make data smaller for storage: i.e. archival. Besides which, a
gzipped file need not contain only one file; see below.

> It has the same features as something that is "uuencoded".

False. Something uuencoded has been re-encoded in a different format
expressly for the purpose of *transferring* it via mechanisms which
require 7-bit ASCII (SMTP). That is completely untrue for gzipped
content, and in fact you cannot transfer gzipped files via SMTP until
the gzip archive has been "encoded" yet again to facilitate that
process using a transfer encoding defined by the MIME standards.

> In addition, gzip does not change the extension but merely adds a .gz
> extension.

You're making assumptions that are false.
That's the default behavior of the gzip program when run against a
single file (which even in that mode can be overridden). This is not
the only mode in which gzip works. One can run it like so, for
instance:

  tar cvf - . | gzip -c > my_archive.gz

What type of data is that? You need not use tar... any arbitrary
program can feed data to gzip in this manner, and there need not be a
specific file type associated with it. Thus again, you do not have
any way to determine what type of data is in the gzip archive. All
that MIME knows is that this file is a gzipped archive -- except in
Debian it doesn't, because you've removed the MIME types from the
system mime.types file. So it knows absolutely nothing.

> Thus, both the type and the encoding are available in the filename.

Clearly false from the above.

> There is a way to know; use content-type and content-encoding, just as HTTP
> does.

Please show me where in the MIME standards this is allowed.
Content-encoding is not a valid MIME header, as far as I'm aware.
Content-transfer-encoding is, but gzip is not a valid content transfer
encoding defined by the MIME standards, and in fact it cannot be one,
because attachments "encoded" in gzip cannot be sent as-is over SMTP.

> > it is only suitable for MIME handlers to treat such files as
> > application data, which can be processed by the gzip program, as
> > MIME was always intended to be used.
>
> Processing by the gzip program is equally useless on its own. You're just
> changing one stream of bytes for another.

This is essentially true, but providing a proper MIME type prevents
e-mail clients from mishandling gzip files, for example by attaching
them with an incorrect MIME type (or none at all) or attempting to
display them as plain text when their MIME type lookup fails.

> > Finally, the MIME standards dictate that files not marked with an
> > associated Content-Type (MIME type) be assumed to be plain text (RFC
> > 2045 sect. 5.2).
> > This clearly is unacceptable for gzipped data, as it
> > is quite certainly invalid plain text. As such, it *must* have a
> > suitable MIME type associated with it.
>
> It's the sender that sets the content-type of a MIME attachment. Adding a
> mapping to the receiving machine will not change this.

This decision often is automatic based on the contents of the system's
mime.types file. The user can override it (or even provide his own),
assuming he knows enough to do so. Many do not -- as the package
maintainer, you should do this for them so they don't have to know
about it.

> Despite the problems, mapping ".gz" to a gzip type gains nothing. Clicking
> on the file will not display the file because gzip will not display
> anything.

It gains something very important: it prevents e-mail clients from
mishandling gzipped files.

> However, other programs (like Apache) may be smart enough to
> properly deal with the encoding and pass this information to the
> clients and creating the entry for which you ask could break a great
> many web-servers around the world.

Again, Apache is irrelevant, because MIME standards do not apply to
HTTP. For purposes of MIME, gzip is NOT an encoding. MIME is not
HTTP, and HTTP is not MIME.

-- 
Derek D. Martin
http://www.pizzashack.org/
GPG Key ID: 0x81CFE75D
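
For anyone who wants to poke at the distinction argued above, here is
a minimal Python sketch, not a definitive implementation. It assumes
only the standard library. The payload bytes and the subject line are
made up, the file name my_archive.gz is borrowed from the tar example,
and the "application/gzip" label is an assumption chosen for
illustration (the thread does not name the exact type string). It
looks the name up in Python's built-in extension table, then attaches
the data to a message with an explicit type and checks which transfer
encoding the mail library applies.

```python
import gzip
import mimetypes
from email.message import EmailMessage

# Stand-in for "some_program | gzip -c > my_archive.gz": the bytes fed
# to gzip are arbitrary and have no file type of their own.
payload = gzip.compress(b"arbitrary bytes from any program\n")
filename = "my_archive.gz"

# Look the name up in Python's built-in extension table. This instance
# is seeded from the module defaults, not from a system mime.types file.
table = mimetypes.MimeTypes()
guessed_type, guessed_encoding = table.guess_type(filename)
print("filename lookup:", guessed_type, guessed_encoding)

# Build a message and attach the data with an explicit type.
# "application/gzip" is an assumed label used here for illustration.
msg = EmailMessage()
msg["Subject"] = "gzip attachment example"
msg.set_content("Archive attached.")
msg.add_attachment(
    payload,
    maintype="application",
    subtype="gzip",
    filename=filename,
)

# Inspect the attachment part: its Content-Type is what we declared,
# and the library adds a separate Content-Transfer-Encoding so the
# binary data can travel through mail transports.
part = next(msg.iter_attachments())
print("content type:", part.get_content_type())
print("transfer encoding:", part["Content-Transfer-Encoding"])
```

With Python's defaults, the lookup should report no type at all for
the name (only a "gzip" encoding), while the attached part should
carry the declared type plus a base64 transfer encoding. The gzip
data gets a MIME type like any other application data, and the
transfer encoding is a separate layer applied on top of it.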