Re: [dev] Stripping html from email

2010-08-26 Thread Kris Maglione
On Thu, Aug 26, 2010 at 11:24:11AM +0100, Kai Hendry wrote: I noticed no one mentioned http://packages.qa.debian.org/m/mpack.html `munpack` Indeed, I've been using mpack and ripmime for years, but I think that altermime would be cleaner in this case. -- Kris Maglione Religion began when the

Re: [dev] Stripping html from email

2010-08-26 Thread Josh Rickmar
On Tue, Aug 24, 2010 at 04:58:20PM +0200, pancake wrote: > there's dmc-pack to pack and unpack MIME attachments. The > implementation is 162 LOC and works quite nicely. I think it is the > sanest way to work with it. dmc looks like it could be just what I need; unfortunately, I can't compile it on Ope

Re: [dev] Stripping html from email

2010-08-26 Thread Szabolcs Nagy
* Antoni Grzymala [2010-08-26 12:39:33 +0200]: > [1] uri://some.url... > > notation, so that I can actually fish out the links. Is that possible > in w3m as well? > in interactive mode with 'L' you can list links and images but i don't think there is a command line switch for that in general w3
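
The numbered-link notation asked about here can be approximated outside of any browser. What follows is only a rough sketch using Python's standard `html.parser`; the class name `LinkDump` and the function `dump` are made up for illustration, and the output only imitates the general shape of a `lynx -dump` reference list. It does not skip script or style content and makes no attempt at layout.

```python
from html.parser import HTMLParser


class LinkDump(HTMLParser):
    """Collect text and number each <a href=...> in order of appearance."""

    def __init__(self):
        super().__init__()
        self.chunks = []  # text fragments and inline [n] markers
        self.links = []   # href values, in document order

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)
                self.chunks.append(f"[{len(self.links)}]")

    def handle_data(self, data):
        self.chunks.append(data)


def dump(html: str) -> str:
    """Return the text of `html` followed by a numbered list of its links."""
    parser = LinkDump()
    parser.feed(html)
    parser.close()
    # Collapse all whitespace runs; a real renderer would keep paragraphs.
    text = " ".join("".join(parser.chunks).split())
    refs = "\n".join(f"[{i}] {url}" for i, url in enumerate(parser.links, 1))
    return text + ("\n\nReferences\n\n" + refs if refs else "")
```

Calling `dump()` on a fragment that contains one link yields the body text with an inline `[1]` marker, then a `References` section listing the URL.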

Re: [dev] Stripping html from email

2010-08-26 Thread Antoni Grzymala
Suraj Kurapati dixit (2010-08-23, 21:05): > On Mon, Aug 23, 2010 at 8:46 PM, Anthony J. Bentley > wrote: > >> Is there currently a tool or script that I can use to strip html > >> from emails? > > > > mhshow-show-text/html: lynx -dump %F | less > > > > Lynx sucks but it sorta works well enough he

Re: [dev] Stripping html from email

2010-08-26 Thread Kai Hendry
I noticed no one mentioned http://packages.qa.debian.org/m/mpack.html `munpack` I noticed this as I began working on a maildir -> Web archive thing last Sunday http://m.dabase.com/ Very early days still. I will definitely consider dmc-unpack instead of course.

Re: [dev] Stripping html from email

2010-08-26 Thread Nick
Quoth pancake: > there's dmc-pack to pack and unpack MIME attachments. The > implementation is 162 LOC and works quite nicely. I think it is the sanest > way to work with it. Just took a look at dmc. It looks really nice. I enjoyed reading the code. Just a quick question; how are you planning to

Re: [dev] Stripping html from email

2010-08-25 Thread Robert Ransom
On Wed, 25 Aug 2010 22:31:58 -0400 Josh Rickmar wrote: > Where can I get > the dmc source again? See . Robert Ransom

Re: [dev] Stripping html from email

2010-08-25 Thread Josh Rickmar
On Tue, Aug 24, 2010 at 04:58:20PM +0200, pancake wrote: > On 08/24/10 16:45, Kurt H Maier wrote: > >MIME sucks; there's no nice way to deal with it. I use perl and the > there's dmc-pack to pack and unpack MIME attachments. The > implementation is 162 LOC and works quite nicely. I think it is the >

Re: [dev] Stripping html from email

2010-08-24 Thread anonymous
On Tue, Aug 24, 2010 at 04:26:46PM -0700, Robert Ransom wrote: > On Tue, 24 Aug 2010 20:01:10 +0400 > The ‘tdb’ library is actually LGPLed. OK, tdb.h says it is under the LGPL. But both the SourceForge page and the Arch Linux package say it is under GPLv3. Probably it was just copied from Sourc

Re: [dev] Stripping html from email

2010-08-24 Thread Robert Ransom
On Tue, 24 Aug 2010 20:01:10 +0400 anonymous wrote: > Looks like it is BSD licensed but uses tdb that is GPLv3 licensed. Is > it ok? The ‘tdb’ library is actually LGPLed. Robert Ransom

Re: [dev] Stripping html from email

2010-08-24 Thread Uriel
On Tue, Aug 24, 2010 at 4:45 PM, Kurt H Maier wrote: > MIME sucks; there's no nice way to deal with it. Indeed. http://harmful.cat-v.org/software/mime uriel

Re: [dev] Stripping html from email

2010-08-24 Thread anonymous
On Mon, Aug 23, 2010 at 11:55:35PM -0400, Josh Rickmar wrote: > Yeah, not quite what I'm looking for. Basically I want something > that I can pipe the message to with my MDA (fdm) before it is > delivered to my maildir. Thanks, I didn't know about fdm and used getmail+procmail. Now I have switc

Re: [dev] Stripping html from email

2010-08-24 Thread pancake
On 08/24/10 16:45, Kurt H Maier wrote: MIME sucks; there's no nice way to deal with it. I use perl and the there's dmc-pack to pack and unpack MIME attachments. The implementation is 162 LOC and works quite nicely. I think it is the sanest way to work with it.

Re: [dev] Stripping html from email

2010-08-24 Thread Kurt H Maier
On Tue, Aug 24, 2010 at 9:27 AM, Josh Rickmar wrote: > anonymous is right, I just want to remove the text/html attachments, > not strip the html tags. MIME sucks; there's no nice way to deal with it. I use perl and the Mail::Message package from cpan. -- #!/usr/bin/perl use Mail::Message;

Re: [dev] Stripping html from email

2010-08-24 Thread Josh Rickmar
On Tue, Aug 24, 2010 at 09:07:25AM -0400, Kurt H Maier wrote: > On Tue, Aug 24, 2010 at 9:01 AM, anonymous wrote: > > But it is not what OP asks for. Tool should process MIME emails and > > remove text/html attachments. > > that is a different task than stripping html from email data. OP > shou

Re: [dev] Stripping html from email

2010-08-24 Thread Kurt H Maier
On Tue, Aug 24, 2010 at 9:01 AM, anonymous wrote: > But it is not what OP asks for.  Tool should process MIME emails and > remove text/html attachments. that is a different task than stripping html from email data. OP should be looking for two tools. -- # Kurt H Maier

Re: [dev] Stripping html from email

2010-08-24 Thread anonymous
On Tue, Aug 24, 2010 at 08:57:12AM -0400, Kurt H Maier wrote: > On Tue, Aug 24, 2010 at 8:38 AM, Nick wrote: > > On Tue, Aug 24, 2010 at 07:31:18AM -0400, Kurt H Maier wrote: > >> http://search.cpan.org/~kilinrax/HTML-Strip-1.06/Strip.pm > > > > Umm. Is no-one reading the body of the original requ

Re: [dev] Stripping html from email

2010-08-24 Thread Kurt H Maier
On Tue, Aug 24, 2010 at 8:38 AM, Nick wrote: > On Tue, Aug 24, 2010 at 07:31:18AM -0400, Kurt H Maier wrote: >> http://search.cpan.org/~kilinrax/HTML-Strip-1.06/Strip.pm > > Umm. Is no-one reading the body of the original request? We can all > strip XML easily, that isn't the question. On Mon, A

Re: [dev] Stripping html from email

2010-08-24 Thread pancake
On 08/24/10 14:38, Nick wrote: On Tue, Aug 24, 2010 at 07:31:18AM -0400, Kurt H Maier wrote: http://search.cpan.org/~kilinrax/HTML-Strip-1.06/Strip.pm Umm. Is no-one reading the body of the original request? We can all strip XML easily, that isn't the question. pacman -S html2text

Re: [dev] Stripping html from email

2010-08-24 Thread pancake
On 08/24/10 05:46, Anthony J. Bentley wrote: Is there currently a tool or script that I can use to strip html from emails? Basically, it should work like this: - Read the message from stdin - If there is no html, leave as is - If it finds both html and plain text, strip the html attachment - I

Re: [dev] Stripping html from email

2010-08-24 Thread Nick
On Tue, Aug 24, 2010 at 07:31:18AM -0400, Kurt H Maier wrote: > http://search.cpan.org/~kilinrax/HTML-Strip-1.06/Strip.pm Umm. Is no-one reading the body of the original request? We can all strip XML easily, that isn't the question.

Re: [dev] Stripping html from email

2010-08-24 Thread Kurt H Maier
http://search.cpan.org/~kilinrax/HTML-Strip-1.06/Strip.pm -- # Kurt H Maier

Re: [dev] Stripping html from email

2010-08-24 Thread Etienne Millon
On Tue, Aug 24, 2010 at 07:45:17AM +0100, Kai Hendry wrote: > It would be great if there was a tool to convert HTML to markdown. ;) Actually, pandoc can do that. :-) -- Etienne Millon

Re: [dev] Stripping html from email

2010-08-24 Thread Anselm R Garbe
On Mon, Aug 23, 2010 at 10:55:14PM -0500, Stanley Lieber wrote: > On Mon, Aug 23, 2010 at 10:46 PM, Anthony J. Bentley > wrote: > > > > It’s not quite what you’re asking for, but I have nmh set up like this: > > mhshow-show-text/html: lynx -dump %F | less > > > > Lynx sucks but it sorta works well

Re: [dev] Stripping html from email

2010-08-23 Thread Kai Hendry
It would be great if there was a tool to convert HTML to markdown. ;)

Re: [dev] Stripping html from email

2010-08-23 Thread Benjamin R. Haskell
On Mon, 23 Aug 2010, Suraj Kurapati wrote: > On Mon, Aug 23, 2010 at 8:46 PM, Anthony J. Bentley wrote: > >> Is there currently a tool or script that I can use to strip html > >> from emails? > > > > mhshow-show-text/html: lynx -dump %F | less > > > > Lynx sucks but it sorta works well enough her

Re: [dev] Stripping html from email

2010-08-23 Thread Suraj Kurapati
On Mon, Aug 23, 2010 at 8:46 PM, Anthony J. Bentley wrote: >> Is there currently a tool or script that I can use to strip html >> from emails? > > mhshow-show-text/html: lynx -dump %F | less > > Lynx sucks but it sorta works well enough here, I guess. I find that w3m does a much better job of HTM

Re: [dev] Stripping html from email

2010-08-23 Thread Stanley Lieber
On Mon, Aug 23, 2010 at 10:46 PM, Anthony J. Bentley wrote: > > It’s not quite what you’re asking for, but I have nmh set up like this: > mhshow-show-text/html: lynx -dump %F | less > > Lynx sucks but it sorta works well enough here, I guess. also see htmlfmt: http://swtch.com/plan9port/man/man1

Re: [dev] Stripping html from email

2010-08-23 Thread Josh Rickmar
On Mon, Aug 23, 2010 at 09:46:58PM -0600, Anthony J. Bentley wrote: > > Is there currently a tool or script that I can use to strip html > > from emails? Basically, it should work like this: > > > > - Read the message from stdin > > - If there is no html, leave as is > > - If it finds both html a

Re: [dev] Stripping html from email

2010-08-23 Thread Anthony J. Bentley
> Is there currently a tool or script that I can use to strip html > from emails? Basically, it should work like this: > > - Read the message from stdin > - If there is no html, leave as is > - If it finds both html and plain text, strip the html attachment > - If it finds html but no plain text,

[dev] Stripping html from email

2010-08-23 Thread Josh Rickmar
Is there currently a tool or script that I can use to strip html from emails? Basically, it should work like this:
- Read the message from stdin
- If there is no html, leave as is
- If it finds both html and plain text, strip the html attachment
- If it finds html but no plain text, leave as is
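
The rules listed above can be sketched with Python's standard `email` package. This is a minimal illustration under stated assumptions, not a tool anyone in the thread proposed: the function names `strip_html_alternative` and `_drop_html` are hypothetical, any `text/html` part is treated as removable (including one sent as a real attachment), and a `multipart/alternative` container is left in place even when only one part remains.

```python
from email import message_from_bytes


def _drop_html(part):
    """Recursively remove text/html children from multipart containers."""
    if not part.is_multipart():
        return
    kept = [p for p in part.get_payload()
            if p.get_content_type() != "text/html"]
    part.set_payload(kept)
    for child in kept:
        _drop_html(child)


def strip_html_alternative(raw: bytes) -> bytes:
    """Drop text/html parts only when a text/plain part also exists."""
    msg = message_from_bytes(raw)
    types = [p.get_content_type() for p in msg.walk()]
    if "text/html" not in types or "text/plain" not in types:
        # No html at all, or html without a plain-text part: leave as is.
        return raw
    _drop_html(msg)
    return msg.as_bytes()
```

To use it as a stdin-to-stdout filter, one would write the result of `strip_html_alternative(sys.stdin.buffer.read())` to `sys.stdout.buffer`. Messages that are returned untouched are passed through byte for byte; modified messages are re-serialized by the `email` package, so their formatting may differ slightly from the input.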