Joe Zeff posted on Mon, 10 Sep 2012 13:46:08 -0700 as excerpted:

> On 09/10/2012 01:23 PM, Thufir wrote:
>> did you guys decide to go forward with a rudimentary html parser, and,
>> if so, what's the timeline, please?
>
> Judging by what I've read here in the past, coding will start on the
> Twelfth of Never.

<typical Duncan =:^)>That's correct, but not entirely complete.</typical>

Quoting a famous line, "It depends what the meaning of 'is' is." Or in this case, what the meaning of "rudimentary html parser" is.

It's absolutely true that there's close to zero interest on the list (I take it Thufir's an exception, the only one /I/ know of) for display of HTML in the "web browser display" sense. There's an EXTREMELY strong sense of "If you want to display an HTML formatted page, use a browser; if you want to post HTML, use the web and post a link if you want/need to." Pan's a "pimp-ass newsreader", not a web browser, and implementing HTML/XML display both properly and securely takes an IMMENSE amount of resources. If the user trusts the HTML/XML enough, they can always save it and open it in a dedicated web browser, which hopefully has enough development resources to implement HTML/XML display both properly and securely, since UNLIKE a news client such as pan, that's what a web client DOES.

However, there HAS been discussion of the possibility of implementing a simple "dumb tag stripper" mode (which will actually need to be reasonably smart if it's not to mistake "meta-observation" tags such as those in my first paragraph for HTML/XML tags and strip them), much as claws-mail for example does to /surprisingly/ good effect. The idea is to "simply" strip out any HTML/XML tags, leaving the plain text.

But as I said, it's not that simple, really. In addition to "meta-observation" tags such as those I used above, anchor tags (used for links) arguably shouldn't be stripped entirely, but simply stripped of their HTML, leaving the description of the link and the URL as plain text.

Image tags are another question. Do you treat them like anchor tags and strip the HTML but convert the alt-text and the URL to plain text, or strip them entirely? Claws strips them entirely, figuring enough of them are ads and the like that it's better without them.
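
To make the rules above concrete, here is a minimal Python sketch of such a "reasonably smart" tag stripper. It is not pan's or claws-mail's code, and neither project is claimed to work this way. The function name strip_tags, the KNOWN_TAGS whitelist and the regex-based approach are all assumptions made for illustration. Only tags whose names are on the whitelist are treated as HTML, so "meta-observation" tags pass through untouched.

```python
import html
import re

# Hypothetical whitelist of element names treated as real HTML; a fuller
# version would list the whole HTML element set.
KNOWN_TAGS = {
    "html", "head", "body", "p", "br", "div", "span", "a", "img",
    "b", "i", "u", "em", "strong", "font", "ul", "ol", "li",
}

# Matches anything shaped like a tag: <name ...>, </name>, <name/>.
TAG_RE = re.compile(r"<(/?)([A-Za-z][A-Za-z0-9]*)(\s[^<>]*?)?/?>")
HREF_RE = re.compile(r"""href\s*=\s*["']([^"']+)["']""", re.IGNORECASE)


def strip_tags(text):
    """Reduce HTML-ish text to plain text, keeping link targets."""
    open_links = []  # hrefs of anchors that are still open

    def replace(match):
        closing, name = match.group(1), match.group(2).lower()
        attrs = match.group(3) or ""
        if name not in KNOWN_TAGS:
            return match.group(0)  # e.g. <typical Duncan =:^)>: leave as typed
        if name == "a":
            if not closing:
                found = HREF_RE.search(attrs)
                open_links.append(found.group(1) if found else None)
                return ""
            href = open_links.pop() if open_links else None
            return f" <{href}>" if href else ""
        if name == "br":
            return "\n"
        return ""  # every other known tag, images included, is dropped

    # Decode entities such as &amp; only after the tags are gone.
    return html.unescape(TAG_RE.sub(replace, text))


if __name__ == "__main__":
    sample = (
        "<typical Duncan =:^)>That's correct.</typical> "
        '<p>See <a href="https://example.org/pan">the <b>Pan</b> site</a>'
        '<img src="ad.png" alt="ad"> &amp; more.</p>'
    )
    print(strip_tags(sample))
```

Run on the sample (example.org is a placeholder URL), this prints the <typical ...> pair unchanged, followed by "See the Pan site <https://example.org/pan> & more." A regex pass like this will still misfire on text such as "1<b and b>2", which is part of why the job is "not that simple". Images are dropped entirely here, mirroring the Claws behaviour described above.
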
After all, one can always open the HTML message, presented as an attachment, in a web client, if it's considered trustworthy enough and worth the hassle.

Personally, I was rather negative on this whole idea, until I saw how effectively claws-mail implemented it (after having little choice but to switch to /something/ other than the kmail I'd been using for nearly a decade, when they akonadified and broke everything, tho I've now discovered claws to be a better fit for my usage anyway, so can sort of thank the kdepim folks for the push much as I can thank the MS folks for the push to Linux they gave me with eXPrivacy).

And claws-mail, like pan, is gtk-based. I don't remember whether it's C++ based as pan now is, or C based, but regardless, it's likely their impressively effective implementation could at least provide some hints to anyone wishing to try to code up a similar solution for pan, even if the code can't in practice be simply lifted or, better yet, reimplemented in a library both could share (possibly along with sylpheed, which claws forked from, and who knows what other apps could make use of it?).

But I'm not a coder, and even if I were, while it'd be nice, unless the claws implementation could be dropped into pan nearly as-is, I strongly suspect I'd find more "itchy" itches to scratch. And I know of no one else specifically taking up that project either, tho for all I know it's possible someone's going to announce their previously private project tomorrow, saying here's a beta, test it to pieces!

So rehashing, I don't believe anyone's seriously interested in pan having a proper HTML display mode. That's a SERIOUS bit of CONTINUOUS work that even dedicated browser projects have trouble pulling off both properly and with continuous security. There's NO WAY something like pan could do it without SERIOUSLY affecting its ability to maintain and improve its primary intended functionality as a "pimp-ass newsreader".
And even if it could be done reasonably well, the result would no longer be pan; it'd be some other product. And if you want something that's not pan, go find it and use it, or create it. Don't try to make pan into something it's not, and never can be, without destroying what pan /is/.

But a rather simpler (tho still not simple) HTML-to-plain-text mode has been discussed. We know it can work and DOES work impressively well in claws-mail. But to my knowledge, there's nobody actually working on such a thing, and realistically, unless the claws implementation could be very nearly dropped whole into pan with little additional work, I don't find it particularly likely that anyone with the necessary skills AND enough interest in pan to go to the trouble will find that feature a high enough priority to ever actually get it done.

But unlike the full-fledged browser-style HTML/XML parser, this one's at least reasonable in theory, and wouldn't so drastically change pan that it would no longer be pan and people might as well just use something different to start with.

All IMO of course, but with all humility, I guess it's worth /something/ after a decade of helping on the pan lists (in a couple of months, I believe; November 10 or so should be my 10-year first-post anniversary, IIRC from looking it up on gmane a few months ago. I'll have to check again as the date gets closer, and maybe go out for dinner that day or something =:^)

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

_______________________________________________
Pan-users mailing list
Pan-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/pan-users