Re: [Tutor] Error with incorrect encoding

2008-04-17 Thread Alan Gauld
I don't know the cause of the error here but I will say that parsing HTML with regular expressions is fraught with difficulty unless you know that the HTML will be suitably formatted in advance. You may be better off using one of the HTML parsing modules such as HTMLParser or even the more powerfu
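
A minimal sketch of the parser-based approach suggested above, assuming Python 3, where the HTMLParser module mentioned in the message lives at html.parser. The LinkCollector class, the extract_links helper and the sample markup are hypothetical names made up for illustration; they are not from the original thread.

```python
from html.parser import HTMLParser


class LinkCollector(HTMLParser):
    """Collect the href value of every <a> tag fed to the parser."""

    def __init__(self):
        super().__init__()
        self.links = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names before calling this.
        if tag == "a":
            for name, value in attrs:
                if name == "href" and value is not None:
                    self.links.append(value)


def extract_links(html_text):
    """Return all anchor hrefs found in html_text, in document order."""
    parser = LinkCollector()
    parser.feed(html_text)
    parser.close()
    return parser.links


# Hypothetical sample input; note the mixed-case tag and the non-ASCII text.
sample = '<p><a href="/one">One</a> and <A HREF="/two">Two \xae</A></p>'
print(extract_links(sample))
```

Unlike a hand-written regular expression, this does not depend on attribute order, quoting style or tag case in the page being parsed.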

Re: [Tutor] Error with incorrect encoding

2008-04-17 Thread linuxian iandsd
Kent was right,

>>> print u'\xae'.encode('utf-8')
(R)

But I think you are using the wrong source file. I mean, don't copy and paste it from your browser's 'View Source' button; use Python's native urllib to get the file.
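
A rough sketch of the two points in this reply, assuming Python 3 (the thread itself used Python 2, where the u'' prefix and the print statement apply). It shows the character u'\xae' encoding cleanly to UTF-8 but not to ASCII, and one way a page might be fetched with urllib instead of being pasted from a browser. The fetch_html helper and its fallback_encoding parameter are hypothetical, and the function is only defined here, not called.

```python
import urllib.request

# The registered-trademark sign discussed in the thread.
registered = "\xae"

# Encoding to UTF-8 succeeds and yields two bytes.
utf8_bytes = registered.encode("utf-8")
print(utf8_bytes)

# Encoding to ASCII fails, which is the kind of error the thread is about.
ascii_error = None
try:
    registered.encode("ascii")
except UnicodeEncodeError as exc:
    ascii_error = exc
    print("ASCII cannot represent it:", exc.reason)


def fetch_html(url, fallback_encoding="utf-8"):
    """Download a page and decode it to text.

    Hypothetical helper: uses the charset declared in the response
    headers if there is one, otherwise falls back to fallback_encoding.
    """
    with urllib.request.urlopen(url) as response:
        charset = response.headers.get_content_charset() or fallback_encoding
        return response.read().decode(charset, errors="replace")
```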

Re: [Tutor] Error with incorrect encoding

2008-04-15 Thread Kent Johnson
Oleg Oltar wrote:
> I am trying to parse an html page. Have following error while doing that
>
> src = sel.get_html_source()
> links = re.findall(r'', src)
> for link in links:
>     print link

Presumably get_html_source() is returning unicode? So link is a unicode st

[Tutor] Error with incorrect encoding

2008-04-15 Thread Oleg Oltar
I am trying to parse an html page. Have following error while doing that

src = sel.get_html_source()
links = re.findall(r'', src)
for link in links:
    print link

==
ERROR: test_new (__main__.NewTest)