Thufir posted on Sat, 15 Sep 2012 16:52:19 -0700 as excerpted:

> On Tue, 11 Sep 2012 11:38:58 +1000, Steven D'Aprano wrote:
>
>> mutt does something very similar for email. It can be configured to use
>> a console browser like links, lynx or w3m to dump the html to text,
>> then display that.
>>
>> http://jasonwryan.com/blog/2012/05/12/mutt/
>>
>> I see no reason why Pan couldn't use an external dependancy for this
>> like mutt does. As you suggest, stripping tags is hard, there's no
>> reason why Pan should be forced to implement its own when it can push
>> the hard part onto existing tools, of which there are already at least
>> three.
>
> Ditto. It could even require some configuration, and be a beta feature,
> that would be fine.

Note that in theory anyway (the hassle of the "in practice" making it less than ideal, but it /could/ be done), pan's "external editor" feature should be able to be put to use toward this end.

Back with "old pan", I had a script, pan-attach-kd, posted here a few times so regulars who have been around long enough may remember it, that could be set as pan's external editor. It made use of that feature and kdialog (thus the -kd suffix) to allow one to pick a file and encoding method (yEnc, legacy uuencode, or simple pass-thru, the latter being useful for posting plain-text files in-line), then passed that info in turn to uuenview (part of uudeview, a uuencode clone that handled yEnc before uuencode got the ability) for encoding. The resulting file would then be appended to the text file that pan had passed to the "external editor", before passing control back to pan, for further editing and/or posting.

(Additionally, there was a help option, which gave instructions and listed the external dependencies: bash, since I don't make a distinction between bash and POSIX-compliant shell code; kdialog for the dialogs, altho there was a VERY crude arbitrary keyword match version, implemented as a first proof of concept, that didn't require kdialog; and of course uudeview. Finally, there was a way to pass thru to an actual external editor as well, if one actually wanted to use the "external editor" function to do just that, without eliminating the pan-attach option.)

Unfortunately for pan-attach(-kd), when Charles did the C++ rewrite that started with 0.90, he used a different text-edit widget that worked better with UTF-8 and the like, but broke the 8-bit-zone ASCII that yEnc takes advantage of to make it so much more efficient than UUE/MIME-Base64. So while the script still worked for UUE and text-pass-thru mode, it was broken for yEnc, which kind of killed its luster.

But it was still occasionally used here, until Heinrich came along and FINALLY implemented binary posting mode. Actually, I /still/ use it occasionally, for pass-thru text-file posting or UUE. Pan's binary posting ability is great now... as long as you aren't posting to gmane or another news2mail gateway, such that most of your readers will be using mail clients, many of which have no clue about yEnc (as Travis can no doubt attest given his reaction when I tried it here... based on his reply, he doesn't use pan and gmane to read this list, and what he DOES use doesn't do yEnc!). That means there's still a place for a script that allows pan to post UUE or simple text-as-text, as pan's built-in file posting does NOT, but pan-attach(-kd) still does.

My script, in turn, was based on a much older implementation of the same basic idea as posted by someone else: putting the external editor to use for another purpose by calling a script instead, in that case for gpg signing. Pan has that functionality built-in now too, but the point here is that the original idea wasn't mine. I simply reimplemented, for attachments, the same idea someone else had used for gpg.

Coming full circle back to the present now, the same idea could be used right now for html post dumps.

Naturally, by the time the "external editor" gets it, since the intended purpose IS as an "external editor" for replies, the raw HTML has >-quotes prepending every original-text line, plus the attribution prepended at the top and the sig appended at the bottom. That's a bit of a problem, but nothing insurmountable.

A suitable html-dump script designed to be set as external editor would therefore have a number of features:

* Mandatory: "Reply wrapper" stripping. The script would have to strip the attribution and sig lines, as well as the prepended > quote-marks.

* (Semi-)Optional: Depending on the intended HTML parsing target, the script may want to "dress up" the HTML a bit as well, stripping or adding selected tags as necessary to make the (presumed) browser happier with what it's ultimately passed.

* Optional but very useful: Let the user configure whether the script simply passes it to the configured browser (presumably firefox/chromium/etc) for display, or passes it to the configured browser to dump, taking the html-stripped text back to pan, where it would be displayed in the reply window (now repurposed as a simple display window, one wouldn't normally send this plain text on, as it wouldn't be marked as quote, properly attributed, etc) as plain text, now stripped of the HTML.

* Optional: Further implement an option that would save the attribution, sig, etc, before stripping, then reapply them and re-quote the now-stripped text returned from the HTML parser, for forwarding or reply as plain-text.

* Optional: Implement a "pass thru to real external editor" option. When I did that with pan-attach, I made that option dependent on the existence of a particular environmental variable, PAN_EDITOR or some such, that if set (and if the pointed at file is an executable, IDR whether I actually tested for that or not back then, but I think I would now), would pass the raw file as handed to it by pan, on to the "real" external editor.

* Optional: Get fancy and include a help/about dialog, etc.

Obviously, the last two optional features are interactive and would thus require kdialog/xdialog/zenity/whatever. (AFAIK/IIRC, "zenity" is what the former gdialog is now called.)
Tho at least with kde, one could script it using konqueror windows too, as I did here with my hotkeys scripts that replaced the multikey hotkey functionality from khotkeys in kde3, when kde4 broke it (kdialog unfortunately wasn't appropriate for that due to the way they implemented input for the pick-a-line dialog, but konqueror windows running a script with a read command, triggering on a single key of input, worked well enough). Presumably one could do the same with xterm/gterm/whatever.

But a raw reply-wrapper stripper wouldn't need any interactivity just to do that and invoke an HTML parser on the result, so it should actually be rather more straightforward than was my pan-attach script.

That said, if it's anything like pan-attach, I won't do anything with it for at least a year after posting this original idea, hoping someone else will be motivated enough to do it first.

However, again if it's anything like pan-attach, a year or two down the road (assuming Heinrich or someone else hasn't implemented such a thing in pan itself by then), I'll do a very raw but sort-of-functional proof of concept, and post that, again hoping someone will take the idea and run with it.

And again, if it's anything like pan-attach (and if it is, the feature will still not be available in pan, but that was of course quite some years before Heinrich got involved, so...), that raw proof of concept will hit effectively dead-air, not a single response, despite my hope that someone will take the concept and run with it.

And yet again, if it's anything like pan-attach, a year or so LATER, I'll decide to see if I can pretty it up a bit and make it more functional, and after I post the results of /that/, I'll FINALLY get some feedback.
(I even got a couple patches, which meant at least a couple people found it useful enough to bother, tho I never did implement them and post an update, and when new-pan broke the yEnc anyway, I lost the incentive I might have had and just continued to use the existing script.)

IOW, don't count on me to do it. I might get to it eventually... but it could be years. If you want the functionality, you'll likely either need to hack up the script yourself or alternatively talk/pay someone into doing it for you (this being my obvious attempt at the talk variant! =:^). And if you do, pretty-please post it. =:^)

And if anyone does attempt to hack this up, please at least set variables for things like the chosen browser right at the top of the script (or better yet allow them to be set in the environment or read in from a config file), so others can change them without having to dive into the guts of the script too far.

The idea, once implemented, would let a user select the raw HTML in pan, hit the reply button, then the external editor button, to activate the script. The script in turn would strip the attribution, sig, and >-quotes, in addition to any other format manipulations necessary before handing the file off to the configured browser. That browser could then be chromium/firefox/whatever, to simply display the file it was passed, or could be links/lynx/whatever, to strip the HTML and hand back the stripped plain text to pan, where it would then appear in pan's reply window.

-- 
Duncan - List replies preferred. No HTML msgs.
"Every nonfree program has a lord, a master --
and if you use the program, he is your master." Richard Stallman

_______________________________________________
Pan-users mailing list
Pan-users@nongnu.org
https://lists.nongnu.org/mailman/listinfo/pan-users
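
For anyone tempted to pick this up, the mandatory part described above (strip the reply wrapper, hand the rest to a text-mode browser, write the result back) might look roughly like the sketch below. It is Python rather than bash, and it is only a sketch under assumptions, not a tested pan add-on: the PAN_HTML_DUMP variable name is made up for illustration, the default "w3m -dump -T text/html" command is just one plausible choice, and it assumes pan hands the script the temp file's path as its first argument and quotes each original line with a single leading ">".

```python
#!/usr/bin/env python3
"""Sketch: strip pan's reply wrapper, then dump the HTML to plain text."""
import os
import shlex
import subprocess
import sys

# Hypothetical setting: PAN_HTML_DUMP is an invented variable name, and the
# default command is an assumption. It must read HTML on stdin and print text.
DUMP_CMD = os.environ.get("PAN_HTML_DUMP", "w3m -dump -T text/html")


def strip_reply_wrapper(reply_text):
    """Keep only the quoted lines, minus one level of '>' quoting.

    The attribution at the top and the sig at the bottom are not quoted,
    so dropping every unquoted line removes them along with the wrapper.
    """
    body = []
    for line in reply_text.splitlines():
        if line.startswith("> "):
            body.append(line[2:])
        elif line.startswith(">"):
            body.append(line[1:])
    return "\n".join(body) + "\n"


def dump_html(html, command=DUMP_CMD):
    """Pipe the HTML through the configured command and return its output."""
    result = subprocess.run(
        shlex.split(command),
        input=html,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


def main(path):
    """Rewrite the file pan passed in, replacing it with the dumped text."""
    with open(path, encoding="utf-8") as handle:
        html = strip_reply_wrapper(handle.read())
    text = dump_html(html)
    with open(path, "w", encoding="utf-8") as handle:
        handle.write(text)


# Only act when a file path was actually passed on the command line.
if __name__ == "__main__" and len(sys.argv) > 1:
    main(sys.argv[1])
```

Setting PAN_HTML_DUMP to some other stdin-reading dump command would swap the parser without touching the script, which is the "variables at the top, overridable from the environment" arrangement asked for above. Saving the attribution and sig for re-quoting, the display-only browser mode, and the pass-thru to a real editor are all left out.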