Thanks Tim for the info.
This is the discussion mentioned in the ticket (from
2012) https://groups.google.com/d/topic/django-developers/vtMVq8jwnf8/discussion
The solutions that ptone suggests in the ticket don't really work on
Heroku. Syncing static files from a local machine is also not a good
solution when, for example, deployments run from CI. And there is still
the case of uploading older files, for example during a rollback.
In the end, the main problem is that collectstatic deals with two
different storage backends. One is provided by each of the static file
finders (settings.STATICFILES_FINDERS) and the other is the one defined in
settings.STATICFILES_STORAGE. As there is no standard hash method, the
Storage superclass can't require all of its subclasses to implement one.
Maybe a solution would be to shift the responsibility for detecting a file
change from collectstatic to the STATICFILES_STORAGE? That way each
Storage subclass is free to decide how it checks whether a file has
changed, using any technique it likes, while staying consistent for that
backend.
A rough and simplified example:

    # django/core/files/storage.py
    class Storage(object):
        def has_changed(self, source_storage, source_path, path):
            raise NotImplementedError()

    class FileSystemStorage(Storage):
        def has_changed(self, source_storage, source_path, path):
            # Timestamp comparison, as collectstatic does today.
            return (source_storage.modified_time(source_path) >
                    self.modified_time(path))

    # django/contrib/staticfiles/management/commands/collectstatic.py
    class Command(BaseCommand):
        def delete_file(self, path, prefixed_path, source_storage):
            if self.storage.has_changed(source_storage, path, prefixed_path):
                self.storage.delete(prefixed_path)

And then, anyone could do this in their own project (or even in
django-storages):
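
    # my_app/storages/custom_s3_storage.py
    import hashlib

    from storages.backends.s3boto import S3BotoStorage

    class MyStorage(S3BotoStorage):
        def has_changed(self, source_storage, source_path, path):
            try:
                # Use the source storage's own checksum if it offers one.
                local_md5 = source_storage.get_md5(source_path)
            except (NotImplementedError, AttributeError):
                # Otherwise hash the file contents ourselves.
                with source_storage.open(source_path) as source_file:
                    local_md5 = hashlib.md5(source_file.read()).hexdigest()
            return self.get_md5(path) != local_md5

        def get_md5(self, path):
            # Rough: assumes the remote key exposes its MD5 checksum here.
            return self.bucket.get_key(path).md5
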
This keeps backward compatibility and lets any Storage subclass use
whatever comparison method it likes.
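
To make the difference concrete, here is a minimal self-contained sketch
that needs neither Django nor S3. Everything in it is made up for
illustration: ToyStorage, HashStorage, collect() and the two simulate_*
helpers are hypothetical names, not Django or django-storages APIs, and the
integer timestamps are arbitrary. It only shows how delegating the decision
to has_changed() behaves in the fresh-checkout and rollback cases.

```python
import hashlib


class ToyStorage:
    """In-memory stand-in for a storage backend (hypothetical, not Django's)."""

    def __init__(self):
        self.files = {}  # path -> (content bytes, modified time)

    def save(self, path, content, mtime):
        self.files[path] = (content, mtime)

    def exists(self, path):
        return path in self.files

    def read(self, path):
        return self.files[path][0]

    def modified_time(self, path):
        return self.files[path][1]

    def has_changed(self, source_storage, source_path, path):
        # Default strategy: the source counts as changed if it is newer.
        return source_storage.modified_time(source_path) > self.modified_time(path)


class HashStorage(ToyStorage):
    """Same toy storage, but it compares content hashes instead of timestamps."""

    def has_changed(self, source_storage, source_path, path):
        local_md5 = hashlib.md5(source_storage.read(source_path)).hexdigest()
        remote_md5 = hashlib.md5(self.read(path)).hexdigest()
        return remote_md5 != local_md5


def collect(source, target, paths, now):
    """Upload only what the target reports as changed; return the uploaded paths."""
    uploaded = []
    for path in paths:
        if not target.exists(path) or target.has_changed(source, path, path):
            # The target records the upload time, not the source's mtime.
            target.save(path, source.read(path), mtime=now)
            uploaded.append(path)
    return uploaded


def simulate_fresh_checkout(target_cls):
    """Same content, but the checkout gives the file a newer modified time."""
    local, remote = ToyStorage(), target_cls()
    local.save("app.css", b"body{}", mtime=100)
    first = collect(local, remote, ["app.css"], now=150)
    local.save("app.css", b"body{}", mtime=200)  # fresh copy, unchanged content
    second = collect(local, remote, ["app.css"], now=250)
    return first, second


def simulate_rollback(target_cls):
    """Older content with an older modified time replaces the local file."""
    local, remote = ToyStorage(), target_cls()
    local.save("app.css", b"v2", mtime=100)
    collect(local, remote, ["app.css"], now=150)
    local.save("app.css", b"v1", mtime=50)  # rolled back to the old version
    return collect(local, remote, ["app.css"], now=250)


if __name__ == "__main__":
    print(simulate_fresh_checkout(ToyStorage))
    print(simulate_fresh_checkout(HashStorage))
    print(simulate_rollback(ToyStorage))
    print(simulate_rollback(HashStorage))
```

With the timestamp strategy the unchanged file is uploaded again after the
fresh checkout and the rolled-back file is skipped; with the hash strategy
it is the other way round, which is the behaviour I'm after.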
On Friday, April 15, 2016 at 1:34:19 AM UTC+1, Tim Graham wrote:
>
> A proposal to use checksums was closed as wontfix in
> https://code.djangoproject.com/ticket/19021.
>
> On Thursday, April 14, 2016 at 1:16:39 PM UTC-4, bliy...@rentlytics.com
> wrote:
>>
>> This makes a lot of sense to me.
>>
>> On Tuesday, April 12, 2016 at 9:07:51 AM UTC-7, Daniel Blasco wrote:
>>>
>>> Hi,
>>>
>>> I posted this in django-users but I think that it goes better here.
>>>
>>>
>>> I'm using django-storages to upload my static files to Amazon S3 and I'm
>>> serving my application from Heroku.
>>>
>>> In my local development, when I run collectstatic for a second time just
>>> after the first one, no files are being uploaded to S3 because
>>> collectstatic checks for the modified_time to determine if the local files
>>> are newer than the ones in S3. That's fine so far.
>>>
>>> The problem is when I deploy to Heroku. Collectstatic is being executed
>>> from the Heroku server and absolutely all the files are always being
>>> uploaded to S3, even the ones that have not changed. This is because during
>>> the deployment Heroku creates a full copy of the source code, and therefore
>>> all the files have a new modified_time. In my case, it takes almost 10
>>> minutes to upload ~1000 files for each deployment.
>>>
>>> Also, imagine the situation where the modified_times are not being
>>> changed and I want to upload older versions of the static files. I won't
>>> be able to, because the storage wouldn't allow uploading files with an
>>> older modified_time.
>>>
>>> I think that a more accurate way to check if a file needs to be replaced
>>> would be to compare checksums/hashes, and to offer this feature to all
>>> the Storage subclasses. To preserve backwards compatibility, the
>>> collectstatic command would first determine whether the storage subclass
>>> implements checksum generation and otherwise fall back to the
>>> modified_time comparison.
>>>
>>>
>>> What do you think, is this something that makes sense?
>>>
>>