On 21-Dec-10 12:22 PM, Jon Clements wrote:
import lxml from urlparse import urlsplitdoc = lxml.html.parse('http://www.google.com') print map(urlsplit, doc.xpath('//a/@href')) [SplitResult(scheme='http', netloc='www.google.co.uk', path='/imghp', query='hl=en&tab=wi', fragment=''), SplitResult(scheme='http', netloc='video.google.co.uk', path='/', query='hl=en&tab=wv', fragment=''), SplitResult(scheme='http', netloc='maps.google.co.uk', path='/maps', query='hl=en&tab=wl', fragment=''), SplitResult(scheme='http', netloc='news.google.co.uk', path='/nwshp', query='hl=en&tab=wn', fragment=''), ...]
Jon, What version of Python was used to run this? Colin W. -- http://mail.python.org/mailman/listinfo/python-list
