wrap the queue to multiprocess download

2011-09-08 Thread alias
Here is part of my program, showing the main structure. I want to wrap it in a
class:
class webdata(object):
    def __init__(self, arg):
        jobs = Queue.Queue()
        for x in name:
            jobs.put(self.url)

    def download(self):
        while not self.jobs.empty():
            url = self.jobs.get()
            hx = httplib2.Http()
            resp, content = hx.request(url, headers=headers).read()
            self.jobs.task_done()

    def myrun(self):
        for i in range(30):
            threading.Thread(target=self.download).start()
        self.jobs.join()

if __name__ == "__main__":
    s = webdata('quote')
    s.myrun()

When it runs, the output is:
Traceback (most recent call last):
  File "/usr/lib/python2.7/threading.py", line 552, in __bootstrap_inner
    self.run()
  File "/usr/lib/python2.7/threading.py", line 505, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/pengtao/workspace/try.py", line 75, in download
    resp, content = hx.request(url, headers=headers).read()
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 1288, in request
    (scheme, authority, request_uri, defrag_uri) = urlnorm(uri)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 201, in urlnorm
    (scheme, authority, path, query, fragment) = parse_uri(uri)
  File "/usr/local/lib/python2.7/dist-packages/httplib2/__init__.py", line 197, in parse_uri
    groups = URI.match(uri).groups()
TypeError: expected string or buffer


When I use the same structure without wrapping it in a class, it runs.
-- 
http://mail.python.org/mailman/listinfo/python-list
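
A possible reading of the traceback, judging only from the posted fragment: `URI.match(uri)` raises "expected string or buffer" when the object pulled from the queue is not a string. The fragment queues `self.url` instead of the loop variable, and it binds the queue to a local `jobs` rather than `self.jobs`, so either could be involved. Separately, `httplib2.Http().request()` already returns a `(response, content)` tuple, so the trailing `.read()` would fail once the URL problem is fixed. The sketch below is one way to structure the class, not the poster's actual program: the names `WebData`, `fetch`, `workers`, and `fake_fetch` are hypothetical, and the network call is replaced by a stand-in so the example runs offline.

```python
import threading

try:
    import queue            # Python 3
except ImportError:
    import Queue as queue   # Python 2, as used in the post


class WebData(object):
    def __init__(self, urls, fetch, workers=4):
        # Keep the queue on the instance; a local variable would be
        # gone by the time download() looks for self.jobs.
        self.jobs = queue.Queue()
        self.fetch = fetch          # hypothetical callable: url -> content
        self.workers = workers
        self.results = {}
        self.lock = threading.Lock()
        for url in urls:
            self.jobs.put(url)      # queue the loop variable, a plain string

    def download(self):
        while True:
            # get_nowait() avoids the gap between empty() and get(),
            # where another thread may take the last item.
            try:
                url = self.jobs.get_nowait()
            except queue.Empty:
                return
            try:
                content = self.fetch(url)
                with self.lock:
                    self.results[url] = content
            finally:
                self.jobs.task_done()

    def myrun(self):
        for _ in range(self.workers):
            t = threading.Thread(target=self.download)
            t.daemon = True
            t.start()
        self.jobs.join()            # returns once every item is marked done
        return self.results


def fake_fetch(url):
    # Stand-in for the real call, which would look like:
    #   resp, content = httplib2.Http().request(url, headers=headers)
    # (no .read(): request() returns the tuple directly)
    return "content of " + url


if __name__ == "__main__":
    urls = ["http://example.com/a", "http://example.com/b"]
    print(WebData(urls, fake_fetch).myrun())
```

Passing the fetch function in as an argument is only a convenience for trying the structure without a network connection; in the real program it could wrap the httplib2 call directly.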


parse html:what is the meaning of "//"?

2011-09-16 Thread alias
code1:
import lxml.html
import urllib
down='http://finance.yahoo.com/q/op?s=C+Options'
content=urllib.urlopen(down).read()
root=lxml.html.document_fromstring(content)
table = root.xpath("//table[@class='yfnc_mod_table_title1']")[0]
tds=table.xpath("tr[@valign='top']//td")
for td in tds:
    print td.text_content()

What I get is:
Call Options
Expire at close Friday, September 16, 2011
This is what I want.

code2:
import lxml.html
import urllib
down='http://finance.yahoo.com/q/op?s=C+Options'
content=urllib.urlopen(down).read()
root=lxml.html.document_fromstring(content)
table = root.xpath("//table[@class='yfnc_mod_table_title1']")[0]
tds=table.xpath("//tr[@valign='top']//td")
for td in tds:
    print td.text_content()

What I get is:
N/A
N/A
2
114
48.00
C110917P00048000
16.75
 0.00
N/A
N/A
0
23
50.00
C110917P0005
23.16
 0.00
N/A
N/A
115
2,411
   
   
   
Highlighted options are in-the-money.
(rest of output omitted)
The only difference between code1 and code2 is the XPath expression:
in code1: tds=table.xpath("tr[@valign='top']//td")
in code2: tds=table.xpath("//tr[@valign='top']//td")

I want to know why the leading "//" makes the output different.
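
For what it's worth, in XPath an expression that begins with "//" is absolute: it searches from the document root, no matter which element `xpath()` is called on. An expression that begins with a bare step such as `tr[...]` is relative to that element, and `.//tr[...]` searches only the element's descendants. That would explain why code2 returns cells from every matching table on the page. The sketch below illustrates this on a small made-up HTML string (the class names `title` and `data` and the cell values are invented for the example, not taken from the Yahoo page).

```python
import lxml.html

# Made-up document with two tables, standing in for the real page.
HTML = """
<html><body>
  <table class="title">
    <tr valign="top"><td>Call Options</td><td>Expire at close</td></tr>
  </table>
  <table class="data">
    <tr valign="top"><td>48.00</td><td>16.75</td></tr>
  </table>
</body></html>
"""

root = lxml.html.document_fromstring(HTML)
table = root.xpath("//table[@class='title']")[0]


def texts(nodes):
    return [n.text_content() for n in nodes]


# Relative: only <tr> children of `table`.
relative = texts(table.xpath("tr[@valign='top']//td"))

# Absolute: starts at the document root, so `table` is ignored.
absolute = texts(table.xpath("//tr[@valign='top']//td"))

# Descendants of `table` only: note the leading dot.
scoped = texts(table.xpath(".//tr[@valign='top']//td"))

print(relative)  # ['Call Options', 'Expire at close']
print(absolute)  # ['Call Options', 'Expire at close', '48.00', '16.75']
print(scoped)    # ['Call Options', 'Expire at close']
```

So if the intent is "any matching row somewhere under this table", `.//tr[...]` is probably the form to reach for, while `//tr[...]` always means "anywhere in the whole document".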