[issue2868] Problem with urllib and urllib2 in urlopen?

2008-05-15 Thread Ambarish Malpani

New submission from Ambarish Malpani <[EMAIL PROTECTED]>:

I have the following code:

import urllib
u = 'http://www.mercurynews.com/ci_9216417'
h = urllib.urlopen(u).read()
print h
# Prints an empty string
# (urllib2 gives the same behavior)

If I visit the same page with my browser, I get the contents of the page
(after some redirects...)
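
To narrow down where the empty body comes from, it may help to look at more than read(). The following is only a sketch, written for Python 3 (urllib.response and email.message) rather than the 2.5 modules used above. The helper name summarize_response and the stand-in response values are made up for illustration, and the stand-in lets the snippet run without touching the network.

```python
import io
from email.message import Message
from urllib.response import addinfourl


def summarize_response(resp):
    """Collect facts that help explain an unexpectedly empty body."""
    body = resp.read()
    return {
        "final_url": resp.geturl(),   # URL of the response that was finally read
        "status": resp.getcode(),     # HTTP status code of that response
        "content_type": resp.headers.get("Content-Type"),
        "body_length": len(body),
    }


# Offline stand-in for a reply with an empty body (hypothetical values,
# not the real server's answer).
headers = Message()
headers["Content-Type"] = "text/html"
fake = addinfourl(io.BytesIO(b""), headers,
                  "http://www.mercurynews.com/ci_9216417", 200)

summary = summarize_response(fake)
print(summary)
```

Against the live site, the same helper could be given the object returned by urllib.request.urlopen(u). Comparing final_url and status with what the browser shows would indicate whether the redirects were followed at all.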

--
components: Extension Modules
messages: 66872
nosy: ambarish
severity: normal
status: open
title: Problem with urllib and urllib2 in urlopen?
type: behavior
versions: Python 2.5

__
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2868>
__
___
Python-bugs-list mailing list
Unsubscribe: 
http://mail.python.org/mailman/options/python-bugs-list/archive%40mail-archive.com



[issue2776] urllib2.urlopen() gets confused with path with // in it

2008-05-06 Thread Ambarish Malpani

New submission from Ambarish Malpani <[EMAIL PROTECTED]>:

Try the following code:
import urllib
import urllib2

url = 'http://features.us.reuters.com//autos/news/95ED98EE-A837-11DC-BCB3-4F218271.html'

data = urllib.urlopen(url).read()
data2 = urllib2.urlopen(url).read()

Fetching it with urllib works fine. With urllib2, the request is
malformed and I get back an HTTP 404.

The request in the second case is:
GET //autos/news/95ED98EE-A837-11DC-BCB3-4F218271.html HTTP/1.1\r\n
Accept-Encoding: identity\r\n
Host: autos\r\n
Connection: close\r\n


The Host header seems to be taken from what follows the last // rather
than the first.
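
For comparison, here is a minimal sketch of how the URL would be expected to split, written for Python 3's urllib.parse rather than the 2.5 modules above. The last step is a guess at how the symptom could arise (the path being parsed a second time as if it were a URL). It is not a confirmed diagnosis of urllib2.

```python
from urllib.parse import urlsplit

url = "http://features.us.reuters.com//autos/news/95ED98EE-A837-11DC-BCB3-4F218271.html"

# Splitting the full URL: the host comes from the first "//".
full = urlsplit(url)
expected_host = full.netloc   # what the Host header should carry
selector = full.path          # what should follow GET on the request line

# Assumption for illustration: if the selector alone is parsed again as a URL,
# its leading "//" is read as the start of a host part.
reparsed = urlsplit(selector)

print(expected_host)
print(selector)
print(reparsed.netloc)
```

Here expected_host is features.us.reuters.com, while the second parse yields autos, which matches the Host line in the request shown above.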

--
components: Extension Modules
messages: 66334
nosy: ambarish
severity: normal
status: open
title: urllib2.urlopen() gets confused with path with // in it
type: behavior
versions: Python 2.5

__
Tracker <[EMAIL PROTECTED]>
<http://bugs.python.org/issue2776>
__



[issue2776] urllib2.urlopen() gets confused with path with // in it

2008-05-06 Thread Ambarish Malpani

Ambarish Malpani <[EMAIL PROTECTED]> added the comment:

Sorry, I should have added another line:
This is important to fix because I am getting that URL, with the // in
it, from a Moved (HTTP 302) response, so I can't just get rid of the //.
