On 11/04/13 23:33, Scurvy Scott wrote:
the other for something like this. I have no intention of doing
anything professional/shady/annoying with this code and want to write
it purely for my own amusement as well as to learn and obviously to
perhaps win something cool.
Which is fine but you should still check the terms and conditions of the
web sites because many such sites explicitly prohibit the use of web
scrapers. Using one could disqualify you from winning, and disguising
the fact you are using one is non-trivial.
Every day the program visits the site and scrapes the links for all the contests.
The program visits each contest page and verifies there is an entry
form, indicating that the contest is active.
If the contest is active at that moment, it adds the title of the page
to a text file; if the contest is inactive, it adds the title of the
page to a text file.
If the contest is active, it fills out the form with my details and sends it off.
If the contest is inactive the title of the page is added to the
permanently blacklisted text file and never messed with again.
This might be a bit convoluted as well and any pointers are appreciated.
Seems reasonable to me.
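
As a rough sketch of the bookkeeping in those steps, something like the
following might do. It assumes two plain-text files with one page title per
line; the file names and the function names (load_titles, record_title,
process_contest) are made up for illustration and are not from the original
post. Whether a page has an entry form is passed in as a flag, so the
scraping itself is left out here.

```python
from pathlib import Path

# Hypothetical file names; the original post does not name them.
ACTIVE_FILE = Path("active.txt")
BLACKLIST_FILE = Path("blacklist.txt")


def load_titles(path):
    """Return the set of titles stored one per line (empty if no file yet)."""
    if not path.exists():
        return set()
    lines = path.read_text(encoding="utf-8").splitlines()
    return {line.strip() for line in lines if line.strip()}


def record_title(path, title):
    """Append title to the file unless it is already recorded there."""
    if title not in load_titles(path):
        with path.open("a", encoding="utf-8") as f:
            f.write(title + "\n")


def process_contest(title, has_entry_form,
                    active_file=ACTIVE_FILE, blacklist_file=BLACKLIST_FILE):
    """Classify one contest as 'skipped', 'active' or 'blacklisted'."""
    if title in load_titles(blacklist_file):
        return "skipped"  # blacklisted earlier, never touched again
    if has_entry_form:
        record_title(active_file, title)
        return "active"  # the caller would fill in and send the form here
    record_title(blacklist_file, title)
    return "blacklisted"
```

Re-reading the files on every call is wasteful but keeps the sketch short;
for a daily run over a handful of contests it is unlikely to matter.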
Try looking at the http, urllib and cookie stuff in the stdlib.
And then look at tools like Beautiful Soup and ElementTree for the
content scraping bits.
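
To give a flavour of the scraping side, here is a minimal sketch that uses
only the stdlib html.parser module (Python 3 assumed). The class name
ContestPageParser, the helper inspect_page and the sample page are
hypothetical; real contest pages will be messier, and treating any <form>
tag as "the contest is active" is a simplifying assumption.

```python
from html.parser import HTMLParser


class ContestPageParser(HTMLParser):
    """Collect the page title, link targets, and whether a <form> exists."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self.has_form = False
        self._in_title = False

    def handle_starttag(self, tag, attrs):
        # html.parser passes tag names in lower case.
        if tag == "title":
            self._in_title = True
        elif tag == "form":
            self.has_form = True
        elif tag == "a":
            href = dict(attrs).get("href")
            if href:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        if self._in_title:
            self.title += data


def inspect_page(html_text):
    """Return (title, links, has_form) for a page given as a string."""
    parser = ContestPageParser()
    parser.feed(html_text)
    parser.close()
    return parser.title.strip(), parser.links, parser.has_form


# Made-up sample page, standing in for a downloaded contest page.
sample = (
    "<html><head><title>Win a bike</title></head><body>"
    "<a href='/contest/1'>Contest one</a>"
    "<form action='/enter' method='post'></form>"
    "</body></html>"
)
title, links, has_form = inspect_page(sample)
```

The page text itself could be fetched with urllib.request.urlopen(url) and
decoded before being handed to inspect_page. Beautiful Soup would make the
same job shorter and more tolerant of broken HTML.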
--
Alan G
Author of the Learn to Program web site
http://www.alan-g.me.uk/
_______________________________________________
Tutor maillist - Tutor@python.org