You could try: http://www.aminus.org/rbre/python/cleanhtml.py
YMMV, as the kids say. But I did choose this over BeautifulSoup or
Strip-o-Gram to do this particular thing. I don't remember -why- I
chose it, but there you go. Easy enough to test all three :)
Oh, and if you just want a whole page
Marc Buehler wrote:
> hi.
>
> i have a ton of html files from which i want to
> extract the plain english words, and then write
> those words into a single text file.
If you just want the text from a single tag in the document then BeautifulSoup
will work well, as Danny and Bob suggest. If you h
At 03:50 PM 10/14/2005, Marc Buehler wrote:
>hi.
>
>i have a ton of html files from which i want to
>extract the plain english words, and then write
>those words into a single text file.
http://www.crummy.com/software/BeautifulSoup/ will read the html, let you
step from tag to tag and extract the
On Fri, 14 Oct 2005, Marc Buehler wrote:
> i have a ton of html files from which i want to extract the plain
> english words, and then write those words into a single text file.
Hi Marc,
The BeautifulSoup parser should be able to do what you want:
http://www.crummy.com/software/Beautiful
hi.
i have a ton of html files from which i want to
extract the plain english words, and then write
those words into a single text file.
example:
<... all kinds html tags ...>
this is text
from the above, i want to extract the string
'this is text' and write it out to a text file.
note that
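
For what it's worth, here is a minimal sketch of the task as described in the question (strip the tags, keep the words, write everything to one text file). It uses only the standard library's html.parser rather than any of the tools suggested in the replies, so treat it as one possible approach and not as what anyone in the thread actually recommended. The names TextExtractor, html_to_text, and write_all_text are made up for illustration, and skipping the contents of script and style tags is my assumption about what "plain english words" should exclude.

```python
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collect text nodes, ignoring anything inside <script> or <style>."""

    SKIP_TAGS = {"script", "style"}  # assumption: not "plain english words"

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.chunks = []
        self._skip_depth = 0

    def handle_starttag(self, tag, attrs):
        if tag in self.SKIP_TAGS:
            self._skip_depth += 1

    def handle_endtag(self, tag):
        if tag in self.SKIP_TAGS and self._skip_depth:
            self._skip_depth -= 1

    def handle_data(self, data):
        # Called for every run of text between tags.
        if not self._skip_depth:
            self.chunks.append(data)


def html_to_text(html):
    """Return the visible text of an HTML string with whitespace collapsed."""
    parser = TextExtractor()
    parser.feed(html)
    parser.close()
    return " ".join(" ".join(parser.chunks).split())


def write_all_text(html_paths, out_path, encoding="utf-8"):
    """Append the text of each HTML file to one output file, one line per file."""
    with open(out_path, "w", encoding=encoding) as out:
        for path in html_paths:
            with open(path, encoding=encoding, errors="replace") as f:
                out.write(html_to_text(f.read()) + "\n")
```

Something like write_all_text(glob.glob("*.html"), "words.txt") would then cover a whole directory. Real-world pages with badly broken markup may still be handled better by a more forgiving parser such as BeautifulSoup, so it is worth comparing the output on a few of your files.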