"Tempo" <[EMAIL PROTECTED]> writes:
> In my last post I received some advice to use urllib.read() to get a
> whole html page as a string, which will then allow me to use
> BeautifulSoup to do what I want with the string. But when I was
> researching the 'urllib' module I couldn't find anything about its
> sub-section '.read()' ? Is that the right module to get a html page
> into a string? Or am I completely missing something here? I'll take
> this as the more likely of the two cases. Thanks for any and all help.
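read() isn't a function of the urllib module itself; it's a method on
the file-like object that urllib.urlopen() (or urllib2.urlopen())
returns. Here's a short example of how this all works: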
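#!/usr/bin/env python
import urllib2
from BeautifulSoup import BeautifulSoup

# urlopen() returns a file-like object (it has a .read() method).
response = urllib2.urlopen('http://www.cnn.com')
# BeautifulSoup accepts a string or a file-like object.
soup = BeautifulSoup(response)
print soup.prettify()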
It's not a particularly useful example, unless, of course, you wish to
prettify CNN's HTML, but it should get you to the point where
BeautifulSoup's documentation starts to make sense.
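If you'd rather have the page as a plain string before handing it to
anything else, a minimal sketch follows. It assumes Python 3, where
urllib2.urlopen lives at urllib.request.urlopen and read() returns
bytes rather than a str. The data: URL is a made-up stand-in for a real
site so the sketch runs without a network connection, fetch_html is a
hypothetical helper name, and UTF-8 is an assumed encoding.

```python
from urllib.parse import quote
from urllib.request import urlopen


def fetch_html(url):
    """Return the body at `url` as a str (hypothetical helper)."""
    # urlopen() returns a file-like response object, not a string.
    with urlopen(url) as response:
        raw = response.read()  # the whole body, as bytes on Python 3
    # Assumption: the page is UTF-8; a real page may declare otherwise.
    return raw.decode('utf-8')


# Stand-in page served through a data: URL, so no network is needed.
# For a live fetch, pass something like 'http://www.cnn.com' instead.
page = '<html><head><title>demo</title></head><body><p>hello</p></body></html>'
url = 'data:text/html,' + quote(page)

html = fetch_html(url)
print(html)
```

The string that comes back is what you would pass to BeautifulSoup in
place of the response object.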
Jason