[ https://issues.apache.org/jira/browse/LIBCLOUD-903?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15978150#comment-15978150 ]
Anthony Shaw commented on LIBCLOUD-903:
---------------------------------------

Which version of Libcloud was this using?

> AWS S3 upload_object_via_stream fails on non-file iterable due to missing
> Content-Length header
> -----------------------------------------------------------------------------------------------
>
>                 Key: LIBCLOUD-903
>                 URL: https://issues.apache.org/jira/browse/LIBCLOUD-903
>             Project: Libcloud
>          Issue Type: Bug
>            Reporter: Richard Xia
>
> The issue I am seeing appears to be due to the incorrect integration of 4
> separate libraries, but I believe the real problem is here in libcloud, in
> the {{upload_object_via_stream()}} method on the S3 storage driver.
>
> > I am using Python 3.5.1 and the four libraries I am using are:
> >
> > * Django 1.10.6
> > * django-storages 1.5.2
> > * libcloud v2.0.0rc1-tentative
> > * requests 2.13.0
> >
> > Specifically, when I try to use a Django
> [ContentFile|https://docs.djangoproject.com/en/1.10/ref/files/file/#django.core.files.base.ContentFile],
> Django's own file-like wrapper for strings, to save a new file to S3 via the
> Libcloud backend of django-storages, I get the following error:
> >
> > {code:xml}
> > <?xml version="1.0" encoding="UTF-8"?>\n<Error><Code>NotImplemented</Code><Message>A header you provided implies functionality that is not implemented</Message><Header>Transfer-Encoding</Header><RequestId>A2FC4D5109083076</RequestId><HostId>K9WGhd18iqQHyIyv+GxWcxHexvapVSidTtHzSqujtT9nT5LhmIEygMKOfR/7F0v7ujnlE/CoYiM=</HostId></Error>
> > {code}
> >
> > This happens because Libcloud is generating an HTTP request to AWS S3 that
> is missing the {{Content-Length}} header. AWS S3 requires the
> {{Content-Length}} header for file uploads *unless* it is a multi-part
> upload. This is why this used to work on the 1.5.0 release of {{libcloud}}:
> even single-part uploads were done as a one-part multi-part upload.
>
> > I've traced my bug down through all four libraries and have determined
> exactly why the {{Content-Length}} header is missing in my particular use
> case. {{upload_object_via_stream()}} has an {{iterator}} argument that
> should yield the content body data, and it eventually passes that argument
> directly to the {{requests}} library. The {{requests}} library will actually
> [try very hard to add the {{Content-Length}}
> header|https://github.com/kennethreitz/requests/blob/c43fefa7ed535c41ba7d58021f0f16ed5ba1d584/requests/models.py#L471],
> even for certain types of iterator streams. In particular, it can determine
> the length of file-like objects which support stat operations, and it can
> handle StringIO/BytesIO objects. However, the Django {{ContentFile}} is
> neither, and {{requests}} cannot extract the length of the stream without
> consuming the iterator, so it does not try.
> >
> > Here's some (Python 3) code to demonstrate the bug:
> >
> > {code:python}
> > from io import BytesIO
> >
> > class MyWrapper(object):
> >     """A contrived wrapper that acts similarly to BytesIO."""
> >     def __init__(self, content):
> >         self.content = BytesIO(content)
> >
> >     def __iter__(self):
> >         self.content.seek(0)
> >         yield self.content.read()
> >
> >
> > # Assume driver is already set to some S3 provider w/ credentials
> > container = driver.get_container(container_name='my-container')
> > driver.upload_object_via_stream(iterator=iter(MyWrapper(b'hello world')),
> >                                 container=container,
> >                                 object_name='my_file.txt')
> > {code}
> >
> > I think the proper solution to this will require all calls to the S3
> {{upload_object_via_stream()}} to use the multi-part uploader in order to
> avoid the need for the {{Content-Length}} header. If desired, you could make
> the same optimizations that the {{requests}} library makes by checking for
> certain common cases where you do know the file size and only using the
> multi-part uploader when necessary.
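
For illustration only, the "check for common cases where the size is known, otherwise fall back to multi-part" idea from the end of the description could be sketched roughly as below. The names known_length() and choose_upload_strategy() are hypothetical and are not part of libcloud or requests; the sketch assumes byte streams only and does not perform any upload.

```python
import os


def known_length(stream):
    """Return the number of bytes remaining in `stream`, or None if unknown.

    Hypothetical helper for illustration; not libcloud or requests API.
    """
    # Plain byte strings report their size directly.
    if isinstance(stream, (bytes, bytearray)):
        return len(stream)

    # Seekable binary streams (BytesIO, regular files opened in 'rb')
    # can be measured by seeking to the end and back.
    if hasattr(stream, "seek") and hasattr(stream, "tell"):
        try:
            current = stream.tell()
            stream.seek(0, os.SEEK_END)
            end = stream.tell()
            stream.seek(current)  # restore the caller's position
            return end - current
        except (OSError, ValueError):
            # Not actually seekable, or already closed.
            return None

    # Generators and other plain iterators: the size cannot be known
    # without consuming them.
    return None


def choose_upload_strategy(stream):
    """Return ("single", length) when the size is known, else ("multipart", None)."""
    length = known_length(stream)
    if length is None:
        return ("multipart", None)
    return ("single", length)
```

With this sketch, a BytesIO object would take the single-part path with an explicit Content-Length, while a generator such as iter(MyWrapper(b'hello world')) from the description would take the multi-part path.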
--
This message was sent by Atlassian JIRA
(v6.3.15#6346)