On Wed, Aug 18, 2010 at 11:38:42AM +0200, Brian White wrote:
> > I will systematically prove that (depending on semantics) either this
> > is outright false, or that the term "encoding" has been
> > misappropriated and, in the context of MIME, excluding these on the
> > basis that they are "encodings" is completely inappropriate.
> 
> At one point (the point at which they were removed from this file),
> including them would break Apache.

That's probably an Apache bug, but I think you're confusing the issue.
MIME standards apply to e-mail, not HTTP, and as such Apache is
irrelevant to this discussion.  

HTTP has its own set of standards which are MIME-like, but are
separate and different.  There have been issues caused by content
encodings configured *in Apache's own config file*, like this one:

  http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=565626

However, this is for HTTP, not MIME (i.e. e-mail). The MIME standards
have no such concept, and many e-mail clients have no means to deal
with this because there is no *MIME* standard that describes it.  

But I don't see why adding these MIME types -- which many other OSes
do include in their system-wide mime.types file, including many other
Linux distros -- should break Apache.  AFAIK it does not use
mime.types, and relies on its own configuration files to determine
MIME type handling.  So again, I think you're confusing the issue.

> > A file that has been gzipped contains data, and is a particular
> > type of file: a compressed archive.
> 
> Gzip, unlike tar or zip, is not an archive.  It is one file.  

The fact that it contains only one file in no way makes it not an
archive.  Historically, the primary purpose of data compression was
to make data smaller for storage: i.e. archival.  Besides which, a
gzipped file need not contain only one file; see below.

> It has the same features as something that is "uuencoded".

False.  Something uuencoded has been re-encoded in a different format
expressly for the purpose of *transferring* it via mechanisms which
require 7-bit ASCII (SMTP).  That is completely untrue for gzipped
content, and in fact you can not transfer gzipped files via SMTP
until the gzip archive has been "encoded" yet again to facilitate
that process using a transfer encoding defined by the MIME standards.
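
To make the distinction concrete, here is a minimal Python sketch.  The
payload is made up for illustration, and base64 stands in for whatever
transfer encoding a mail client would pick.  It shows that raw gzip
output contains bytes outside the 7-bit range, and that a transfer
encoding is a separate layer applied on top of it:

```python
import base64
import gzip

# Hypothetical payload; any bytes would do.
payload = b"hello, world\n" * 10

compressed = gzip.compress(payload)

# Every gzip stream starts with the magic bytes 0x1f 0x8b, and 0x8b
# alone already falls outside the 7-bit ASCII range.
has_8bit = any(byte > 0x7F for byte in compressed)

# A transfer encoding such as base64 is applied on top of the gzip
# data to make it safe for a 7-bit transport.
wire_form = base64.b64encode(compressed)
is_7bit = all(byte <= 0x7F for byte in wire_form)

# The receiver undoes the transfer encoding first, then the compression.
restored = gzip.decompress(base64.b64decode(wire_form))

print(has_8bit, is_7bit, restored == payload)
```

The gzip step changes what the data *is*; the base64 step only changes
how it travels.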

> In addition, gzip does not change the extension but merely adds a .gz
> extension.  

You're making assumptions that are false.  That's the default behavior
of the gzip program when run against a single file (which even in that
mode can be overridden).  This is not the only mode in which gzip
works.  One can run it like so, for instance:

  tar cvf - . | gzip -c > my_archive.gz

What type of data is that?  You need not use tar... any arbitrary
program can feed data to gzip in this manner, and there need not be a
specific file type associated with it.  Thus again, you do not
have any way to determine what type of data is in the gzip archive.
All that MIME knows is that this file is a gzipped archive -- except
in Debian it doesn't, because you've removed the MIME types from the
system mime.types file.  So it knows absolutely nothing.

> Thus, both the type and the encoding are available in the filename.

Clearly false from the above.
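
As an illustration only, Python's standard mimetypes module uses the
same HTTP-style split between type and encoding when guessing from a
filename.  The file names below are hypothetical, and this is Python's
heuristic, not something the MIME standards define.  A rough sketch:

```python
import mimetypes

# A fresh MimeTypes() instance uses Python's built-in default tables
# rather than whatever mime.types files the local system provides.
db = mimetypes.MimeTypes()

# The name happens to preserve the original extension.
named_after_source = db.guess_type("report.txt.gz")

# The name says nothing about what was fed into gzip.
opaque_name = db.guess_type("my_archive.gz")

print(named_after_source)
print(opaque_name)
```

The guess only yields a type when the name happens to keep the
original extension.  For a name like my_archive.gz the type comes back
empty, and only the "encoding" is reported.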

> There is a way to know; use content-type and content-encoding, just as HTTP
> does.

Please show me where in the MIME standards this is allowed.
Content-encoding is not a valid MIME header, as far as I'm aware.
Content-transfer-encoding is, but gzip is not a valid content transfer
encoding defined by the MIME standards, and in fact it can not be one,
because attachments "encoded" in gzip can not be sent as-is over SMTP.
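
For what it's worth, here is a small sketch of what a MIME attachment
looks like when built with Python's standard email package.  The
message text and file name are made up, and the type name
application/gzip is an assumption on my part (application/x-gzip is
also widely seen):

```python
import gzip
from email.message import EmailMessage

# Hypothetical attachment body.
data = gzip.compress(b"example attachment body\n")

msg = EmailMessage()
msg["Subject"] = "gzip attachment"
msg.set_content("See the attached file.")

# The gzip data is labelled with a content *type*; the library picks
# a transfer encoding for the binary payload on its own.
msg.add_attachment(
    data,
    maintype="application",
    subtype="gzip",  # assumed type name, see the note above
    filename="my_archive.gz",
)

part = next(msg.iter_attachments())
print(part.get_content_type())
print(part["Content-Transfer-Encoding"])
print(part["Content-Encoding"])  # header is not set on the part
```

The part carries a Content-Type and a Content-Transfer-Encoding, and
no Content-Encoding header at all.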

> > it is only suitable for MIME handlers to treat such files as
> > application data, which can be processed by the gzip program, as
> > MIME was always intended to be used.
> 
> Processing by the gzip program is equally useless on its own.  You're just
> changing one stream of bytes for another.

This is essentially true, but providing a proper MIME type prevents
e-mail clients from mishandling gzip files, for example by attaching
them with an incorrect MIME type (or none at all) or attempting to
display them as plain text when their mime-type lookup fails.

> > Finally, the MIME standards dictate that files not marked with an
> > associated Content-Type (MIME type) be assumed to be plain text (RFC
> > 2045 sect. 5.2).  This clearly is unacceptable for gzipped data, as it
> > is quite certainly invalid plain text.  As such, it *must* have a
> > suitable MIME type associated with it.
> 
> It's the sender that sets the content-type of a MIME attachment.  Adding a
> mapping to the receiving machine will not change this.

The sender's mail client often makes this decision automatically,
based on the contents of the sending system's mime.types file.  The
user can override it (or even provide his own), assuming he knows
enough to do so.  Many do not -- as the package maintainer, you should
do this for them so they don't have to know about it.
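
A rough sketch of the kind of lookup a sending client might do.  The
parser, the function names and both sample tables are hypothetical,
and real clients differ in the details:

```python
import os


def parse_mime_types(text):
    """Parse mime.types-style text into an extension -> type mapping."""
    table = {}
    for line in text.splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line:
            continue
        mime_type, *extensions = line.split()
        for ext in extensions:
            table[ext.lower()] = mime_type
    return table


def content_type_for(filename, table):
    """Return the mapped type for the file's last extension, or None."""
    ext = os.path.splitext(filename)[1].lstrip(".").lower()
    return table.get(ext)


# Hypothetical table without a gzip entry.
WITHOUT_GZIP = """
# type              extensions
text/plain          txt asc
application/x-tar   tar
"""

# Same table with an assumed gzip entry added.
WITH_GZIP = WITHOUT_GZIP + "application/gzip    gz\n"

print(content_type_for("my_archive.gz", parse_mime_types(WITHOUT_GZIP)))
print(content_type_for("my_archive.gz", parse_mime_types(WITH_GZIP)))
```

Without the entry the lookup comes back empty, and the client is left
to guess what to put in Content-Type.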

> Despite the problems, mapping ".gz" to a gzip type gains nothing.  Clicking
> on the file will not display the file because gzip will not display
> anything.

It gains something very important: it prevents e-mail clients from
mishandling gzipped files.

> However, other programs (like Apache) may be smart enough to
> properly deal with the encoding and pass this information to the
> clients and creating the entry for which you ask could break a great
> many web-servers around the world.

Again, Apache is irrelevant, because MIME standards do not apply to
HTTP.  For purposes of MIME, gzip is NOT an encoding.  MIME is not
HTTP, and HTTP is not MIME.

-- 
Derek D. Martin
http://www.pizzashack.org/
GPG Key ID: 0x81CFE75D
