On Sun, Mar 03, 2002 at 04:00:13AM -0600, Rob VanFleet wrote:
> It seems like this has come up before, but I couldn't turn anything up
> from searching.  Basically, I am looking for a procmail rule that will
> detect html mail, and pipe it to a script to strip the tags from it,
> preferably before the other rules are applied, so it still ends up in
> the proper mailbox, but that is not a necessity.
I use the following script in conjunction with a procmail rule.
It seems to work.

If anyone knows of a program that does the same job as Debian's mimedecode,
please drop me an email.

---<snip>---
#!/bin/sh
##
##                  strip_html 0.1.2
##
## takes an email message from stdin and strips all html off.
## the resulting message is printed to stdout.
##
## 2002 by Timo Benk <[EMAIL PROTECTED]>
##
##

FILES="mktemp w3m mimedecode sed"

# Let's check that everything we need is in the path
for i in $FILES; do
    if ! command -v "$i" > /dev/null 2>&1; then
        # report on stderr so the error text never ends up in the mail
        echo "strip_html: sorry, can't find $i!" >&2
        exit 1
    fi
done

TMPFILE=$(mktemp /tmp/strip_html.XXXXXX) || exit 1
TOKEN="LISTING"

# open a <LISTING> element so w3m prints the non-html parts as they are
echo "<$TOKEN>" > "$TMPFILE"

# decode the message (mimedecode reads stdin), relabel it as text/plain,
# and close <LISTING> around the html document so w3m renders that part
mimedecode |
        sed "s/Content-Type: text\/html/Content-Type: text\/plain/;       \
             s/<![Dd][Oo][Cc][Tt][Yy][Pp][Ee][^>]*>/<\/$TOKEN>&<$TOKEN>/; \
             s/<[Hh][Tt][Mm][Ll]>/[HTML stripped]<\/$TOKEN><BR>&/;        \
             s/<\/[Hh][Tt][Mm][Ll]>/&<$TOKEN>/"                           \
        >> "$TMPFILE"

echo "</$TOKEN>" >> "$TMPFILE"

# render the whole thing as plain text, 80 columns wide
w3m -F -cols 80 -dump -T text/html "$TMPFILE"

rm -f "$TMPFILE"
---<snap>---
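
To see what the sed stage of the script above hands to w3m, it can be run on
its own. This is only a rough sketch: the sample message is made up, the
DOCTYPE substitution is left out for brevity, and mimedecode and w3m are not
involved at all.

```shell
#!/bin/sh
# Sketch: only the sed stage of strip_html, run on a tiny made-up message.
TOKEN="LISTING"

# Hypothetical sample input, not taken from a real mail.
sample='Content-Type: text/html
<html>
<p>Hello</p>
</html>'

# Same substitutions as in the script (minus the DOCTYPE one),
# written as separate -e expressions for readability.
result=$(printf '%s\n' "$sample" |
    sed -e "s/Content-Type: text\/html/Content-Type: text\/plain/" \
        -e "s/<[Hh][Tt][Mm][Ll]>/[HTML stripped]<\/$TOKEN><BR>&/" \
        -e "s/<\/[Hh][Tt][Mm][Ll]>/&<$TOKEN>/")

printf '%s\n' "$result"
```

The <LISTING> element is closed right before <html> and reopened right after
</html>, so w3m should render only the html document as html and print
everything else as it is.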

---<snip>---
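##
## HTML
## strip all html off the email
##
:0 HB:
* ^Content-Type: text/html
{
    # keep an untouched copy; the trailing colon locks the mailbox file
    :0 c:
    unstripped.backup

    # filter header and body; if the script fails the mail stays unfiltered
    :0 hbfW
    | /home/timo/bin/strip_html
}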
---<snap>---
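
A quick way to check whether a saved message would trigger the condition in
the recipe above, without running procmail, is to approximate it with grep.
This is a sketch under assumptions: matches_html is a made-up helper name, the
sample messages are invented, and grep only approximates procmail's matching
(the HB flags search header and body, and procmail ignores case by default).

```shell
#!/bin/sh
# Sketch: approximate the recipe's condition with grep.
# matches_html is a hypothetical helper, not part of procmail.
matches_html() {
    # reads a message on stdin; -i because procmail ignores case by default
    grep -qi '^Content-Type: text/html'
}

# Invented sample messages.
if printf 'From: a@example.com\nContent-Type: text/html; charset=us-ascii\n\n<html>hi</html>\n' | matches_html; then
    verdict_html="would be filtered"
else
    verdict_html="left alone"
fi

if printf 'From: a@example.com\nContent-Type: text/plain\n\nhi\n' | matches_html; then
    verdict_plain="would be filtered"
else
    verdict_plain="left alone"
fi

echo "html mail:  $verdict_html"
echo "plain mail: $verdict_plain"
```

With a real message saved to a file, the helper would be fed with
"matches_html < file".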

-timo

--
gpg key fingerprint = 6832 C8EC D823 4059 0CD1  6FBF 9383 7DBD 109E 98DC

