[Tutor] Retrieving Webpage Source, a Problem with 'onclick'

2005-05-21 Thread Craig Booth
Hi,

   I am trying to loop over all of the links in a given webpage and
retrieve the source of each of the child pages in turn.

   My problem is that the links are in the following form:

[begin html]
link1
link2
link3
link4
[end html]

  So clicking the links appears to call the Javascript function gS to
dynamically create pages.
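
One way to see what each link would hand to gS is to pull the onclick attributes out of the page source. This is a minimal sketch in current Python 3 (the thread itself predates it). The sample markup is hypothetical, since the real links are not shown above, and `collect_onclick` is an illustrative name only.

```python
from html.parser import HTMLParser


class OnclickCollector(HTMLParser):
    """Collect (href, onclick) pairs from every <a> tag that has an onclick."""

    def __init__(self):
        super().__init__()
        self.handlers = []

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names for us
        if tag != "a":
            return
        attr_map = dict(attrs)
        if attr_map.get("onclick"):
            self.handlers.append((attr_map.get("href"), attr_map["onclick"]))


def collect_onclick(html):
    """Return a list of (href, onclick) tuples found in the given HTML text."""
    parser = OnclickCollector()
    parser.feed(html)
    parser.close()
    return parser.handlers


# Hypothetical markup: an assumption about what such links might look like
sample = (
    '<a href="#" onclick="gS(1)">link1</a>'
    '<a href="#" onclick="gS(2)">link2</a>'
)
print(collect_onclick(sample))  # [('#', 'gS(1)'), ('#', 'gS(2)')]
```

The href is only "#" for every link, which is why fetching it returns the same page. Whatever distinguishes the child pages sits in the onclick text, so reading the gS function in the page's script would be the next step if the requests are to be rebuilt by hand.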

  I can't figure out how to get urllib/urllib2 to work here as the URL of
each of these links is http://www.thehomepage.com/#.

  I have tried to get mechanize to click each link, but once again it
doesn't send the onclick request and just goes to
http://www.thehomepage.com/#.

This blog (http://blog.tomtebo.org/programming/lagen.nu_tech_2.html)
strongly suggests that the easiest way to do this is to use IE and COM
automation (which is fine as I am working on a Windows PC) so I have tried
importing win32com.client and actually getting IE to click the link:

[begin code]

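from time import sleep
from win32com.client import Dispatch

ie = Dispatch("InternetExplorer.Application")
ie.Visible = 1
ie.Navigate('http://www.thehomepage.com')

# It takes a little while for the page to load, so poll until IE is idle
while ie.Busy:
    sleep(2)

# Print page title
print ie.LocationName

# Try to follow the 30th link on the page
test = ie.Document.links
ie.Navigate(ie.Document.links(30))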

[end code]

  This should just click the 30th link on the page.  As with the other
methods, this takes me to http://www.thehomepage.com/# and doesn't call
the Javascript.

   If somebody who has more experience in these matters could suggest a
course of action I would be grateful.  I'm more than happy to use any
method (urllib, mechanize, IE & COM as tried so far) just so long as it
works :)
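
For the IE and COM route, one thing that may be worth trying is calling the link element's own click() method instead of passing it to Navigate, since Navigate only loads the link's URL and never fires the onclick handler. The sketch below is written for current Python 3 and is untested against the real site. The helper names `wait_until_loaded` and `click_link` are made up for illustration, and the value 4 is IE's READYSTATE_COMPLETE.

```python
import time


def wait_until_loaded(ie, timeout=30.0, poll=0.5):
    """Poll until IE reports it is idle; return False if the timeout expires."""
    deadline = time.time() + timeout
    # 4 is READYSTATE_COMPLETE in the IE automation model
    while ie.Busy or ie.ReadyState != 4:
        if time.time() > deadline:
            return False
        time.sleep(poll)
    return True


def click_link(ie, index):
    """Fire the click event on the index-th link, then wait for IE to settle."""
    link = ie.Document.links(index)
    link.click()  # click() on the element runs its onclick handler
    return wait_until_loaded(ie)


# Usage on Windows with pywin32 installed (left commented out so the
# helpers above can be imported and exercised without IE):
#
#   from win32com.client import Dispatch
#   ie = Dispatch("InternetExplorer.Application")
#   ie.Visible = 1
#   ie.Navigate("http://www.thehomepage.com")
#   wait_until_loaded(ie)
#   click_link(ie, 30)
#   print(ie.Document.body.innerHTML)
```

If gS rewrites the page in place rather than loading a new one, Busy may never change, so a short fixed sleep after the click could be needed before reading the document.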

   Thanks in advance,
  Craig.

___
Tutor maillist  -  Tutor@python.org
http://mail.python.org/mailman/listinfo/tutor


Re: [Tutor] Web browser

2005-06-11 Thread Craig Booth
Hi,

   As was recommended to me in a previous thread, if you're on a
Windows machine with IE installed then PAMIE
(http://www.pamie.sourceforge.net) can simplify using the IE COM interface
and does nearly everything you need.

   PAMIE doesn't support frames correctly yet, but it is very easy to hack
in support as and where it is needed.


--Craig

On Sat, 11 Jun 2005, Ismael Garrido wrote:

> Hello.
>
> I've been looking around for a web browser either written in python, or
> with python bindings.
> What I need to do is load a web-page, enter a password-protected site
> and follow certain links, it needs to have frames and follow the refresh
> meta. I'm running winxp, python 2.4
>
> At first I thought about using a python web browser, but none I could
> find had frames support. The most promising ones were quite old too. And
> one that was completely written in Python used an old version and just
> kept crashing in python 2.4.
> Then, I thought about using some browser with bindings. I've looked all
> over and found next to nothing. Mozilla and its xpcom just seems quite
> hard and I'm not sure if that does the job I want. I tried finding COM
> bindings in other browsers, but I couldn't understand them and make them
> work...
>
> Any suggestion would be greatly appreciated.
> Ismael