> When you hit the page and you get an HTTP redirect code back (say,
> 302), you will need to make another call to the URL specified in the
> "Location" parameter in the response headers. Then you retrieve that
> new page and you can check you got an acceptable HTTP response code
> (such as 200) and read the page's body (or whatever you want to do
> with it). Otherwise, keep looping until you get an expected HTTP
> response code.
>
> Note: you may get stuck in an infinite loop if two URLs redirect to each
> other.
>
> You might want to take a look at the higher level httplib module:
> http://docs.python.org/library/httplib.html
>
> Although I don't think it can automatically follow redirects for you.
> You'll have to implement the loop yourself.
>
> If you can rely on 3rd party packages (not part of the standard Python
> library), take a look at httplib2:
> https://httplib2.googlecode.com/hg/doc/html/libhttplib2.html
>
> This one can follow redirects.
>
> HTH,
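For reference, the manual loop described in the quoted text could be
sketched roughly as below. This is only an illustration under some
assumptions: it uses http.client (the Python 3 name for httplib), and the
names fetch_once, get_following_redirects, REDIRECT_CODES and max_hops
are made up for this example, not part of any library.

```python
import http.client
from urllib.parse import urljoin, urlsplit

# Status codes treated as "go look at the Location header" (an assumption
# for this sketch; adjust to taste).
REDIRECT_CODES = {301, 302, 303, 307, 308}


def fetch_once(url):
    """Issue a single GET without following redirects.

    Returns (status, headers, body); header names are lower-cased.
    """
    parts = urlsplit(url)
    if parts.scheme == "https":
        conn = http.client.HTTPSConnection(parts.netloc)
    else:
        conn = http.client.HTTPConnection(parts.netloc)
    try:
        path = parts.path or "/"
        if parts.query:
            path += "?" + parts.query
        conn.request("GET", path)
        resp = conn.getresponse()
        headers = {k.lower(): v for k, v in resp.getheaders()}
        return resp.status, headers, resp.read()
    finally:
        conn.close()


def get_following_redirects(url, fetch=fetch_once, max_hops=10):
    """Follow redirects by hand; return (final_url, status, body)."""
    seen = set()
    for _ in range(max_hops + 1):
        # Two URLs redirecting to each other would loop forever,
        # so remember every URL already visited.
        if url in seen:
            raise RuntimeError("redirect loop detected at %s" % url)
        seen.add(url)
        status, headers, body = fetch(url)
        if status not in REDIRECT_CODES:
            return url, status, body
        location = headers.get("location")
        if location is None:
            raise RuntimeError("redirect without a Location header")
        # Location may be relative, so resolve it against the current URL.
        url = urljoin(url, location)
    raise RuntimeError("more than %d redirects" % max_hops)
```

The fetch argument exists only so the loop can be tried without touching
the network; calling get_following_redirects(some_url) with the default
performs real requests.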
Sorry for bringing up an old topic like this, but writing longer
messages on a phone is just not something that I want to do.

Python already has the urllib2 module (urllib.request in Python 3),
which follows redirects automatically, so I don't see why you'd need a
3rd-party module to deal with it.

When it encounters a 301 status code from the server, urllib2 will
search through its handlers and call the http_error_301 method, which
will look for the Location: header and follow that address. As the
output below shows, 302, 303 and 307 go through the same code. The
behaviour is defined in HTTPRedirectHandler, which can be overridden if
necessary:

>>> help(urllib.request.HTTPRedirectHandler)
Help on class HTTPRedirectHandler in module urllib.request:

class HTTPRedirectHandler(BaseHandler)
 |  Method resolution order:
 |      HTTPRedirectHandler
 |      BaseHandler
 |      builtins.object
 |
 |  Methods defined here:
 |
 |  http_error_301 = http_error_302(self, req, fp, code, msg, headers)
 |
 |  http_error_302(self, req, fp, code, msg, headers)
 |      # Implementation note: To avoid the server sending us into an
 |      # infinite loop, the request object needs to track what URLs we
 |      # have already seen.  Do this by adding a handler-specific
 |      # attribute to the Request object.
 |
 |  http_error_303 = http_error_302(self, req, fp, code, msg, headers)
 |
 |  http_error_307 = http_error_302(self, req, fp, code, msg, headers)
 |
 |  redirect_request(self, req, fp, code, msg, headers, newurl)
 |      Return a Request or None in response to a redirect.
 |
 |      This is called by the http_error_30x methods when a
 |      redirection response is received.  If a redirection should
 |      take place, return a new Request to allow http_error_30x to
 |      perform the redirect.  Otherwise, raise HTTPError if no-one
 |      else should try to handle this url.  Return None if you can't
 |      but another Handler might.
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes defined here:
 |
 |  inf_msg = 'The HTTP server returned a redirect error that w...n infini...
 |
 |  max_redirections = 10
 |
 |  max_repeats = 4
 |
 |  ----------------------------------------------------------------------
 |  Methods inherited from BaseHandler:
 |
 |  __lt__(self, other)
 |
 |  add_parent(self, parent)
 |
 |  close(self)
 |
 |  ----------------------------------------------------------------------
 |  Data descriptors inherited from BaseHandler:
 |
 |  __dict__
 |      dictionary for instance variables (if defined)
 |
 |  __weakref__
 |      list of weak references to the object (if defined)
 |
 |  ----------------------------------------------------------------------
 |  Data and other attributes inherited from BaseHandler:
 |
 |  handler_order = 500

best regards,
Robert S.
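
P.S. To illustrate the "can be overridden" part, here is a minimal
sketch of a handler that only follows redirects staying on the same
host. The class name SameHostRedirectHandler and the helper make_opener
are invented for this example, and the same-host policy is just one
possible rule, not something urllib does by default.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit


class SameHostRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Follow a redirect only if it stays on the original host."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        old_host = urlsplit(req.full_url).netloc
        new_host = urlsplit(newurl).netloc
        if new_host != old_host:
            # Raising HTTPError tells urllib that nobody should follow
            # this redirect (see the redirect_request docstring).
            raise urllib.error.HTTPError(
                req.full_url, code,
                "redirect to other host refused: %s" % newurl,
                headers, fp)
        # Otherwise fall back to the stock behaviour.
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def make_opener():
    """Build an opener that uses the custom handler.

    build_opener() leaves out the default HTTPRedirectHandler when it is
    given an instance of a subclass, so only the custom one is installed.
    """
    return urllib.request.build_opener(SameHostRedirectHandler())
```

Something like make_opener().open(url) would then behave like urlopen(),
except that a redirect to a different host ends in an HTTPError instead
of being followed.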