On 04/03/2011 07:03 AM, Tom Christie wrote:
It's not very obvious from the docs or source if HttpRequest.read() can
always be safely treated as a limited input stream, or if the developer
needs to respect HttpRequest.META['CONTENT_LENGTH'].

As far as I can tell the intention is that it can always be treated as a
limited stream, that seems to at least be the implication in
WSGIRequest.__init__, in which case it looks to me like there are two bugs.

Hello Tom,

I'll do my best to remember my intentions when this code was first written. But anyway it might need some fresh thinking. Thanks!

I've decided not to wrap streams in LimitedStream indefinitely because it would affect performance. I confess guilty for never actually measuring the hit but reading from connection seemed to me a hot enough place to operate as close to the socket as possible, especially because all the thing was made for performance-sensitive cases with potentially big request payloads.

In most practical situations request.read() was working just fine without the wrapper: mod_wsgi, mod_python, flup. The only exception at that time was the Django's runserver that would just hang when trying to read past content_length bytes. Actually there's a big comment on this matter right before LimitedStream wrapping in WSGIRequest.__init__. Also I thought that one could use LimitedStream explicitly in the user code if needed.

1. This code should not raise an Assertion Error...

 >>> from django.test.client import RequestFactory
 >>> req=RequestFactory().post('/', {'foo':'bar'})
 >>> req.read(9999)
Traceback (most recent call last):
File "<console>", line 1, in <module>
File
"/Users/tomchristie/workspace/django-rest-framework/env/lib/python2.6/site-packages/django/http/__init__.py",
line 296, in read
return self._stream.read(*args, **kwargs)
File
"/Users/tomchristie/workspace/django-rest-framework/env/lib/python2.6/site-packages/django/test/client.py",
line 51, in read
assert self.__len >= num_bytes, "Cannot read more than the available
bytes from the HTTP incoming data."
AssertionError: Cannot read more than the available bytes from the HTTP
incoming data.

Judging by svn logs this mock object (FakePayload) was written back at the days of uploads refactoring. Judging by its comment it is a safety measure to fail early and loud to protect client code from even attempting to read input stream past content_length.

So from all these here are my thoughts:

- I still believe that not using the limitator for everything is a good thing (doesn't break real code, doesn't degrade performance)

- It may be a good idea for a FakePayload to imitate real-world data providers and just return empty buffers when there's no data left.

Thoughts?

2. Isn't the use of LimitBytes in MultipartParser.parse() now redundant?

I didn't even know another such thing exists though I remember searching for it before writing LimitedStream. Now they just seem to be duplicates and it's a good idea to ditch LimitBytes (since LimitedStream implements a readline() too) and move it into some common place. http.utils seems a good fit for it.

--
You received this message because you are subscribed to the Google Groups "Django 
developers" group.
To post to this group, send email to django-developers@googlegroups.com.
To unsubscribe from this group, send email to 
django-developers+unsubscr...@googlegroups.com.
For more options, visit this group at 
http://groups.google.com/group/django-developers?hl=en.

Reply via email to