Jetus wrote:
I am able to download this page (code enclosed), but I then want to download a PDF file that I can view in a regular browser by clicking on the "view" link. I don't know how to automate this next part of my script. It seems to use JavaScript. The line in the page source says

    href="javascript:openimagewin('JCCOGetImage.jsp?refnum=DN2007036179');" tabindex=-1>

So, in summary, when I download this page, I would like to follow the "view" link for each record. Can anyone point me in the right direction? When the "view" link is clicked in IE or Firefox, it returns a PDF file, so I should be able to download it with

    urllib.urlretrieve('pdffile', r'c:\temp\pdffile')

Here is the code I have been using:
----------------------------------------------------------------
import urllib, urllib2

# Form fields posted to the search page
params = [
    ('booktype', 'L'),
    ('book', '930'),
    ('page', ''),
    ('hidPageName', 'S3Search'),
    ('DoItButton', 'Search'),
]
data = urllib.urlencode(params)

# Passing data makes this a POST request
f = urllib2.urlopen("http://www.landrecords.jcc.ky.gov/records/S3DataLKUP.jsp", data)
s = f.read()
f.close()

# Save the result page for inspection
open('jcolib.html', 'w').write(s)
----------------------------------------------------------------
Use something like the Firebug extension to see what request the openimagewin function ultimately issues. Then issue that request yourself, filling in its parameters with the information you parse out of the href above.
There is no way to interpret the JavaScript in Python, let alone mimic possible browser DOM behavior.
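As a rough sketch of that approach (parse the href, then request the resulting URL directly), something like the following might work. It rests on an assumption that has to be checked with Firebug first: that openimagewin simply opens its argument as a URL relative to the search page. The names extract_image_urls and download_images are made up for illustration, and the code is written for Python 3 (urllib.parse / urllib.request) rather than the urllib2 used in the question.

```python
import os
import re
from urllib.parse import urljoin, urlparse, parse_qs
from urllib.request import urlretrieve

# URL of the search page, taken from the question.
BASE_URL = "http://www.landrecords.jcc.ky.gov/records/S3DataLKUP.jsp"

# Captures the single-quoted argument of javascript:openimagewin('...').
OPENIMAGEWIN_RE = re.compile(
    r"javascript:openimagewin\(\s*'([^']*)'\s*\)", re.IGNORECASE
)


def extract_image_urls(html, base_url=BASE_URL):
    """Return an absolute URL for every openimagewin(...) link in html.

    ASSUMPTION: openimagewin() opens its argument as a URL relative to the
    page it appears on. Confirm the real request with Firebug.
    """
    urls = []
    for arg in OPENIMAGEWIN_RE.findall(html):
        # Drop any whitespace inside the argument (e.g. from wrapped lines).
        relative = "".join(arg.split())
        urls.append(urljoin(base_url, relative))
    return urls


def download_images(html, dest_dir="."):
    """Fetch each linked document and save it as <refnum>.pdf in dest_dir."""
    for url in extract_image_urls(html):
        query = parse_qs(urlparse(url).query)
        refnum = query.get("refnum", ["image"])[0]
        urlretrieve(url, os.path.join(dest_dir, refnum + ".pdf"))


# Example using the link quoted in the question (no network access needed).
sample = ("<a href=\"javascript:openimagewin("
          "'JCCOGetImage.jsp?refnum=DN2007036179');\" tabindex=-1>view</a>")
print(extract_image_urls(sample))
```

If the server expects a session cookie or a Referer header from the search request, a plain urlretrieve call will not be enough; in that case the request Firebug shows would have to be reproduced more closely.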
Diez
