Re: HTML page into a string

Jason Earl Tue, 07 Feb 2006 20:35:45 -0800

"Tempo" <[EMAIL PROTECTED]> writes:

> In my last post I received some advice to use urllib.read() to get a
> whole HTML page as a string, which would then let me use
> BeautifulSoup to do what I want with the string. But when I was
> researching the 'urllib' module I couldn't find anything about a
> '.read()' function. Is that the right module for getting an HTML page
> into a string, or am I completely missing something here? I'll take
> the latter as the more likely of the two cases. Thanks for any and all help.



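There is no urllib.read() as such; read() is a method of the file-like
object that urlopen() returns. Here's a short example of how this all
works: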

#!/usr/bin/env python

import urllib2
from BeautifulSoup import BeautifulSoup

# urlopen() returns a file-like object, which BeautifulSoup accepts directly.
response = urllib2.urlopen('http://www.cnn.com')
soup = BeautifulSoup(response)
print soup.prettify()

It's not a particularly useful example, unless, of course, you wish to
prettify CNN's HTML, but it should get you to the point where
BeautifulSoup's documentation starts to make sense.
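
If all you want is the page as a plain string, the idea is the same:
call read() on whatever urlopen() hands back. Here is a minimal sketch
of that, written for Python 3, where urlopen() lives in urllib.request
and read() returns bytes that need decoding. The helper name fetch_html
and the UTF-8 fallback are my own assumptions for illustration, not
anything urllib provides.

```python
# Python 3 sketch; fetch_html is a hypothetical helper, not part of urllib.
from urllib.request import urlopen


def fetch_html(url, fallback_encoding="utf-8"):
    """Return the body found at `url` as a str."""
    # urlopen() returns a file-like response object; read() is a method
    # of that object, not of the urllib module itself.
    with urlopen(url) as response:
        raw = response.read()  # bytes in Python 3
        # Use the charset from the Content-Type header when one is given;
        # falling back to UTF-8 is an assumption made for this sketch.
        charset = response.headers.get_content_charset() or fallback_encoding
    # Undecodable bytes are replaced rather than raising an error.
    return raw.decode(charset, errors="replace")
```

Something like fetch_html('http://www.cnn.com') should then give you
the whole page as one string, which you can pass to BeautifulSoup or
search through yourself.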

Jason
-- 
http://mail.python.org/mailman/listinfo/python-list
