Dave <[EMAIL PROTECTED]> posted [EMAIL PROTECTED], excerpted below, on Tue, 10 Oct 2006 01:02:38 +0100:
> On Saturday 07 October 2006 11:45, Duncan wrote:
>> pan-attach and pan-attach-kd are scripts designed to allow posting
>> attachments with pan. URL below.
>
> Duncan, I had some comments from a couple of users that they are just
> seeing the raw encoded textual information when I post using your script.
> One is using KNode/0.10.4, the other Thunderbird (Mozilla 4.8 [en]
> (Windows NT 5.0; U))
>
> Agent and Pan (at least) are both ok with it.
>
> I got the same comments when I encoded manually and pasted the result in a
> while back too.
>
> Any ideas? I don't even know what I should be looking for :-(

AFAIK, neither Thunderbird nor KNode (nor OE for that matter) does yEnc.

yEnc (My Encoding, without the M and abbreviating encoding as enc; that's the proper capitalization, BTW) is the newest encoding, and the most popular in many groups despite the fact that a lot of clients don't understand it. It has only a few percent overhead (compared to the traditional 33% overhead, where four encoded bytes transmit three bytes of binary file), taking advantage of the fact that news is close to 8-bit clean. (Note that if news were entirely 8-bit clean, there'd be no need to encode at all -- you'd simply post the files and others would simply download them as they do with HTTP and FTP.)

The normally accepted rule is that the poster chooses how he'll post, since he's the one providing the content, while the downloaders get it if they can cope with it, and get ignored if they can't. Again, posters tend to choose yEnc because they can either post that much more in the same time or bandwidth, or use less time/bandwidth to post the same content. Downloaders who complain about gibberish where there should be attachments are told to get a decent news client. By now, if it can't do yEnc, it's not considered an acceptable news client for binaries, period.

I suspect that's what you were running into -- clients that don't do yEnc.
Both pan and agent do yEnc -- that is in fact one of the points in their favor and against any clients that don't.

About the others:

UUE is the oldest of the three encodings. As with yEnc, however, it hasn't gone thru the full Internet RFC standards process as MIME has. It sort of came to be without a standard or definition of any sort (people wanted a way to attach binaries to an otherwise text-only medium, and they experimented with various things until UUE came into being). UUE, BTW, stands for Unix-to-Unix Encoding (I believe I have that correct) -- it was developed in the Internet's formative years when nearly all the machines on the Internet were Unix machines, either US-DOD or university based, well before our current Internet mail or news standards were fully defined. As mentioned above, the encoding overhead is 33%: four bytes encoded is three bytes of file. However, UUE is mail as well as news safe and, given its age, nearly every client that handles attachments at all handles UUE.

While it isn't a choice here for reasons explained below, for completeness, I'll cover MIME as well. MIME, Multipurpose Internet Mail Extensions, is actually a broad set of formally defined standards, aspects of which are used in many other areas as well. Among other aspects, most Unix "file-type associations" are based on the MIME file-types from this standard. The same set of file-types is used to define HTTP/web server file-types as well, as I believe it was Apache that first borrowed them for that purpose.

Heading back to MIME as used in Internet Message standards, the framework actually defines two different types of 7-bit ASCII text encoding (as used in Internet Mail messages, the standards of which formed the basis for news as well). MIME/base64 is similar to UUE but uses a slightly different set of 64 characters as its encoding base. This is what is referred to when we talk of MIME encoding in the context of binary file attachments.
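To make the overhead figures above concrete, here is a minimal Python sketch that encodes the same bytes three ways and compares sizes. It is an illustration only, not how pan or pan-attach does it. The function names (uu_body, yenc_body, overhead) are made up for this example, and the yEnc part is a simplified body encoder under my own assumptions: it adds 42 to each byte and escapes NUL, LF, CR and "=", but leaves out the =ybegin/=yend lines, the CRC, and the optional extra escapes. Note that UUE measured this way comes out a bit above 33%, because each line also carries a length character and a newline.

```python
import base64
import binascii


def uu_body(data: bytes) -> bytes:
    """UUE body lines only (45 input bytes per line), no begin/end lines."""
    return b"".join(
        binascii.b2a_uu(data[i:i + 45]) for i in range(0, len(data), 45)
    )


def yenc_body(data: bytes, line_len: int = 128) -> bytes:
    """Simplified yEnc-style body (illustrative, not a full implementation)."""
    critical = {0x00, 0x0A, 0x0D, 0x3D}  # NUL, LF, CR, '='
    out = bytearray()
    col = 0
    for b in data:
        c = (b + 42) % 256
        if c in critical:
            # Escape: '=' followed by the value shifted by another 64.
            out.append(0x3D)
            out.append((c + 64) % 256)
            col += 2
        else:
            out.append(c)
            col += 1
        if col >= line_len:
            out += b"\r\n"
            col = 0
    return bytes(out)


def overhead(encoded: bytes, raw: bytes) -> float:
    """Extra size of the encoded form as a fraction of the raw size."""
    return (len(encoded) - len(raw)) / len(raw)


if __name__ == "__main__":
    # Every byte value equally often, as a stand-in for a binary file.
    sample = bytes(range(256)) * 18
    for name, encoded in [
        ("UUE", uu_body(sample)),
        ("base64", base64.b64encode(sample)),
        ("yEnc (simplified)", yenc_body(sample)),
    ]:
        print(f"{name:18s} {overhead(encoded, sample):6.1%}")
```

The yEnc figure depends on the data: bytes that map onto the four escaped values cost two characters instead of one, so a file full of those would do worse than the evenly spread sample used here.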
The other encoding is quoted-printable, which is very close to plain text and is designed to handle primarily text content as effectively as possible while still allowing the raw encoding to be for the most part human-readable. If you ever come across a message that has =3D for equals signs and similar =XX hexadecimal codes for certain other characters, that's very likely either raw MIME/quoted-printable or a message that started out as MIME/quoted-printable but got corrupted in some way such that the MIME headers aren't recognized, so the client treats it as regular 7-bit ASCII text instead of MIME. The two MIME encoding formats are designed to be convertible directly one into the other, but quoted-printable is most efficient with text, where it remains mostly human-readable, while base64 is most efficient with binary, at again the standard ratio of four bytes of encoded text for three bytes of binary file, a 33% overhead.

Because MIME is the only formally/officially defined and standardized attachment method for binaries, nearly all modern clients understand it. The only major exceptions are very old clients that were around pre-MIME standard and were never upgraded to comply with it. As with UUE, it's both mail and news safe.

The reason it isn't a choice for pan-attach(-kd) is due to the way the standards are implemented. It's a full framework, defining a specific header (MIME-Version: 1.0) that must appear in any MIME compliant message, with additional headers defining the number of parts and how they are laid out in the message, and each part containing its own set of part-headers. In order to properly do MIME, therefore, the MIME encoder must have control of the entire post, in order to define all the headers appropriately.
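For anyone who wants to see what that framework looks like on the wire, here is a small sketch using Python's standard email package as a stand-in. This is not what pan does, and build_post, the subject text and sample.bin are all hypothetical names for illustration. It shows the top-level MIME-Version and multipart Content-Type headers, the per-part headers, and (via the quopri module) the =3D escape mentioned above.

```python
import quopri
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText


def build_post(body_text: str, attachment: bytes, filename: str) -> MIMEMultipart:
    """Build a multipart/mixed message: one text part, one base64 attachment."""
    msg = MIMEMultipart()  # sets MIME-Version: 1.0 and multipart/mixed
    msg["Subject"] = "example post with attachment"  # placeholder subject
    msg.attach(MIMEText(body_text, "plain", "us-ascii"))
    part = MIMEApplication(attachment)  # base64-encoded by default
    part.add_header("Content-Disposition", "attachment", filename=filename)
    msg.attach(part)
    return msg


if __name__ == "__main__":
    post = build_post("See attached.\n", bytes(range(16)), "sample.bin")
    # The message headers name the boundary; each part carries its own headers.
    print(post.as_string())
    # Quoted-printable leaves plain text readable and escapes '=' as =3D.
    print(quopri.encodestring(b"x=1"))
```

The point to notice in the output is that the top-level headers and the body have to agree: the boundary string named in the message headers is the same one that separates the parts, which is why whatever does the encoding needs to own the whole post and not just the body.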
pan only forwards the message body to the defined external editor, and even if it forwarded the entire message at that point, there's no guarantee that further changes wouldn't be made after pan got the message back, therefore potentially invalidating some of the headers declared by the external editor. Put directly, the only way to properly do MIME is to have pan do it all. Since that's not possible at my skillset level, and therefore at the level that pan-attach(-kd) is implemented in, pan-attach can't properly do MIME and therefore doesn't have that choice.

I could go on in some detail about MIME as it's something I've studied in some depth (well, at least to the point of reading the main RFCs (Requests for Comments, the way these documents start) on the subject, as I had a reason to do so at one point and they /were/ rather fascinating, at least to me). However, I'll leave it there as the mail is really rather long as it is. I'd certainly encourage others interested in understanding the standards that form the basis of the Internet we all use to read up on these, however. They aren't nearly as dry and devoid of interest as wading thru EULAs is, for example, and the MIME RFCs are some of the more "mere human" accessible of the RFCs. Google MIME RFCs for a good start.

In the interest of completeness, I should mention the other "encoding" choice that pan-attach(-kd) does have, text/identity. Here "encoding" is in quotes, simply because there /is/ none -- it simply includes the file as-is. As such, it's neither news-safe nor mail-safe to use this for any binary format files at all. It WILL break things, either corrupting the message itself, or at minimum the attached file, if a binary file is attached in text/identity mode. In most cases, I'd not expect the post to even make it to the server successfully as it breaks all the rules. So what is it good for then?
Simply this: use this choice if you have a (7-bit ASCII normal) text file you want to include as-is. It avoids the encoding overhead entirely, and will remain readable as text. In effect, this is what pan already does with the sig file if you simply point it at a text file -- includes it as text.

What's all that come down to in simple form?

1) Choose yEnc encoding if you are posting binary files and care more about efficiency than download client compatibility (and if you are using old-pan, which can do so; new-pan can't).

2) Choose text/identity encoding for text files that you want displayed as part of the message itself, not as attachments.

3) Choose UUE if you are posting binaries and are either worried about compatibility or are using new-pan, which chokes on yEnc, leaving UUE the only binary posting choice. Also, where the message will be gated to mail, choose UUE, as yEnc WILL break with mail.

4) Expect a certain level of complaints if you choose yEnc, as some people continue to insist on using clients that don't handle yEnc and are therefore now inappropriate for binary groups (inappropriate as yEnc is now the most popular choice for posting binaries, and users using clients that can't grok it on binary newsgroups simply need to throw away what might as well be their manual typewriters and join the age of the computer and Internet).

There are two additional things to keep in mind as well.

1) The guy who originally defined yEnc specified certain conditions (having to do with ordering and requiring the keyword yEnc as part of the subject line) for the subject lines of posts containing yEnc encoded files. Of course, pan-attach(-kd) can't enforce this, but the user can manually, if desired. However, most clients that understand yEnc don't require the strict subject line formatting and will recognize it even without it, and in fact a lot of yEnc posts don't strictly observe the subject line requirements in any case.
There might be a few that won't work unless the requirements are strictly met, however. For the technical details of yEnc including the subject requirements, see the yEnc home page here: http://www.yenc.org/ (and note that the common short form without the www, simply yenc.org, doesn't work in this case).

2) As UUE was never formally standardized, there are occasional minor differences in implementation. In the vast majority of cases, these won't matter much and things are compatible, but one might come upon a case that's not. Since there's no formal spec to break, one can't properly say such implementations are "broken", only that they aren't 100% compatible with the way everybody else implements UUE.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

_______________________________________________
Pan-users mailing list
Pan-users@nongnu.org
http://lists.nongnu.org/mailman/listinfo/pan-users