On Thu, 11 Aug 2011 13:12:20 -0700, Mark S Bilk wrote:
> When a newsreader requests article headers from a news server with an
> OVER (formerly XOVER) command, the server sends the headers of each
> article combined into a single line, with the contents of the various
> headers separated by tabs, in this order (quoting from Section 8.3.2 of
> RFC 3977 http://tools.ietf.org/html/rfc3977#section-8.3):
>
>    "0" or article number
>    Subject header content
>    From header content
>    Date header content
>    Message-ID header content
>    References header content
>    :bytes metadata item
>    :lines metadata item
>
>    For all fields, the value is processed by first removing all CRLF
>    pairs (that is, undoing any folding and removing the terminating
>    CRLF) and then replacing each TAB with a single space.
>
> Since tabs separate the various header content fields of the line the
> server sends to the newsreader, obviously those fields must not contain
> any tabs. Otherwise the newsreader won't be able to count the tabs in
> order to distinguish and separate the fields, and thus reconstruct the
> headers. That's why RFC 3977 says that tabs in any of the header
> contents must be replaced by spaces before those contents are combined
> and sent to the newsreader.
>
> Unfortunately Astraweb doesn't replace those tabs with spaces.
>
> Tabs occur most frequently in the References header, when the article is
> two or more followup levels below the original post. The Pan newsreader
> is unable to retrieve, or sometimes even see,
> articles from Astraweb whose References header is folded using tabs.
> Most newsreaders don't post that way, but some do.
>
> That's why Pan couldn't download from Astraweb three of the five test
> articles that I posted. And why, from Astraweb,
> Pan can't see all the articles in deeply nested followup trees, and
> sometimes sees a tree broken into smaller ones.
>
> This was all determined by capturing the stream of alt.test headers,
> from Astraweb and from Blocknews, using the wireshark network packet
> analyzer. Blocknews formats the header records properly, removing tabs
> from the field values before combining them.
>
> I'll send another trouble-ticket to Astraweb, but after their previous
> performance I don't know if they'll care. It shouldn't be hard to
> insert code into Pan to interpret the incoming header stream defensively
> and determine when the References field ends,
> which is what KNode and other newsreaders must be doing, since they work
> OK with Astraweb.
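For anyone following along, the kind of defensive parsing described above could be sketched roughly as below. This is only an illustration in Python, not Pan's actual code (Pan is C++), and the names parse_over_line and Overview are made up for this example. It assumes that stray tabs only ever show up inside the References field, and that every extra piece of References looks like one or more <message-id> tokens, so the first piece that does not look like that must be the :bytes item.

```python
import re
from typing import List, NamedTuple


class Overview(NamedTuple):
    """One parsed OVER/XOVER response line (hypothetical container)."""
    number: int
    subject: str
    from_: str
    date: str
    message_id: str
    references: str
    bytes_: str
    lines: str
    extra: List[str]


# A piece made up only of <...> tokens separated by whitespace.
_MSGID_RUN = re.compile(r"^\s*(<[^<>\s]+>\s*)*$")


def parse_over_line(line: str) -> Overview:
    """Split an overview line, tolerating unescaped tabs in References.

    Assumption: only the References field can contain stray tabs.
    """
    fields = line.rstrip("\r\n").split("\t")
    if len(fields) < 8:
        raise ValueError("overview line has fewer than 8 fields")

    # The first five fields are taken at face value.
    number, subject, from_, date, message_id = fields[:5]
    rest = fields[5:]

    # rest[0] always belongs to References (it may be empty). Keep
    # absorbing following pieces while they are non-empty and consist
    # solely of message-ids; stop at the first piece that does not.
    end = 1
    while end < len(rest) and rest[end].strip() and _MSGID_RUN.match(rest[end]):
        end += 1

    # Rejoin the pieces with single spaces, as RFC 3977 asks servers to do.
    references = " ".join(p.strip() for p in rest[:end] if p.strip())

    tail = rest[end:]
    if len(tail) < 2:
        raise ValueError("missing :bytes/:lines after References")

    return Overview(int(number), subject, from_, date, message_id,
                    references, tail[0], tail[1], tail[2:])
```

Fed a line in which the References field arrives as two tab-separated message-ids, this puts both back into one field and still finds :bytes and :lines in the right place. A line from a server that follows the RFC goes through unchanged.
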

I already spotted this some time ago and modified Pan to handle it
correctly. You could compile my branch and try again.

Cheers.

-- 
roses are 0xff0000, violets are 0x0000ff,
all my base are belong to you...