Facebook has lost a lot of share value since its IPO - they need commercialization
for sure ;)
"for sure" is a good word
we need some of that "for sure" too
On Tue 09 Oct 2012 07:28:01 EEST, django-developers@googlegroups.com wrote:
Today's Topic Summary
Group: http://groups.google.com/group/django-developers/topics
* Feature request: collectstatic shouldn't recopy files that already exist in destination [9 Updates]
Feature request: collectstatic shouldn't recopy files that already exist in destination
<http://groups.google.com/group/django-developers/t/bed315abc8f09dff>
Dan Loewenherz <d...@dlo.me> Oct 07 08:58PM -0700
This issue just got me again tonight, so I'll try to push on it once more. It seems right now most people don't care that this is broken, which is a bummer, in which case I'll just continue using my working solution.
Dan
ptone <pres...@ptone.com> Oct 07 10:38PM -0700
So after scanning this thread and the ticket again, it is still unclear whether there could be a completely universal solution.
While it would be nice if the storage API had a checksum(name) or md5(name) method, not all custom storage backends are going to support a single checksum standard. S3 doesn't explicitly support MD5 (apparently it unofficially does, through ETags). Without a universal checksum, you can't use it to compare files across arbitrary backends.
I do agree that hacking the modified_time return value is a little ugly - the API is clearly documented as "returns a datetime..." - so returning an MD5 checksum there is, well, hacky.
If you are passionate about moving this forward, here is what I'd suggest. Implement, document, and test .md5(name) as a standard method on storage backends - like modified_time, this would raise NotImplementedError if not available - this could easily be its own ticket. MD5 is probably the closest you'll get to a checksum standard.
Once you have an md5 method defined for backends, you could support a --md5 option to collectstatic that would use it as the target/source comparison.
Another workaround is to just use collectstatic locally, and rsync --checksum to your remote if it supports rsync.
-Preston
On Sunday, October 7, 2012 8:59:16 PM UTC-7, Dan Loewenherz wrote:
Jannis Leidel <lei...@gmail.com> Oct 08 12:33PM +0200
> It's accurate *only* in certain situations. And on a distributed development team, I've run into a lot of issues with developers re-uploading files that have already been uploaded, because they just recently updated their repo.
> A checksum is the only truly accurate method to determine whether a file has changed.
> Additionally, you didn't address my point that I quoted from. Storage backends don't just reflect filesystems - they could reflect files stored in a database, S3, etc. And some of these filesystems don't support last-modified times.
Then, frankly, this is a problem of the storage backends, not Django's. The S3BotoStorage backend *does* have a modified_time method:
https://bitbucket.org/david/django-storages/src/1574890d87be/storages/backends/s3boto.py#cl-298
What storage backend do you use that doesn't have a modified_time method?
> This is a bit confusing... why call it last_modified when that doesn't necessarily reflect what it's doing? It would be more flexible to create two methods:
It's called modified_time, not last_modified.
> def modification_identifier(self):
> def has_changed(self):
> Then, any backend could implement these however they might like, and collectstatic would have no excuse for uploading the same file more than once. Overloading last_modified to also do things like calculate MD5s seems a bit hacky to me, and confusing for any developer maintaining a custom storage backend that doesn't support last modified.
I disagree; modified_time is perfectly capable of handling your use case.
Jannis
Jannis Leidel <lei...@gmail.com> Oct 08 12:50PM +0200
> So after scanning this thread and the ticket again - it is still unclear that there could be a completely universal solution.
> While it would be nice if the storage API had a checksum(name) or md5(name) method - not all custom storage backends are going to support a single checksum standard. S3 doesn't explicitly support MD5 (apparently it unofficially does through ETags). Without a universal checksum - you can't use it to compare files across arbitrary backends.
You're able to ask S3 for the date of last modification; I don't see why a comparison by hashing the file content is needed additionally. It'd have to download the full file to do that on Django's side, and I'm not aware of an API for getting a hash from Cloud Files, S3, etc.
> I do agree that hacking the modified_time return value is a little ugly - the API is clearly documented as "returns a datetime..." - so returning an MD5 checksum there is, well, hacky.
I beg to differ: returning a datetime object makes absolute sense for comparing it to another datetime object. What I meant before is that the modified_time method can be written however the user wants, as long as it returns a datetime object - even a date that is known to be older than the file on disk.
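As an illustration of that flexibility, a backend could return a constant, deliberately ancient datetime so that collectstatic always treats the stored copy as stale; this is a hypothetical sketch, not an existing backend:

```python
import datetime

class AlwaysStaleStorage:
    """Hypothetical backend: modified_time returns a date known to be
    older than any local file, so comparisons always favor recopying."""

    def modified_time(self, name):
        # Any datetime is a valid return value per the documented API.
        return datetime.datetime(1970, 1, 1)
```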
> If you are passionate about moving this forward, here is what I'd suggest.
> Implement, document, and test .md5(name) as a standard method on storage backends - like modified_time this would raise NotImplementedError if not available - this could easily be its own ticket. MD5 is probably the closest you'll get to a checksum standard.
-1
Jannis
Dan Loewenherz <d...@dlo.me> Oct 08 08:48AM -0700
> The S3BotoStorage backend *does* have a modified_time method:
> https://bitbucket.org/david/django-storages/src/1574890d87be/storages/backends/s3boto.py#cl-298
> What storage backend do you use that doesn't have a modified_time method?
I don't think you're seeing the problem I'm having. I'm working with a distributed team using git. This means when we check out files, the local modified time is the time at which I checked the files out, not the time at which the files were actually last modified.
As a result, it's a questionable metric for figuring out whether a file is the same or not, since every team member's local machine thinks they were all just created! We end up re-uploading the files every time.
> necessarily reflect what it's doing? It would be more flexible to create two methods:
> It's called modified_time, not last_modified.
Sorry, typo.
> seems a bit hacky to me, and confusing for any developer maintaining a custom storage backend that doesn't support last modified.
> I disagree, modified_time is perfectly capable of handling your use case.
No, it does not address my needs, as I described above.
Dan
Dan Loewenherz <d...@dlo.me> Oct 08 08:56AM -0700
> comparison by hashing the file content is needed additionally. It'd have to download the full file to do that on Django's side and I'm not aware of an API for getting a hash from cloudfiles, S3 etc.
S3 stores the MD5 info in an ETag header.
Regarding Cloud Files, this is what Rackspace has to say:
> You can ensure end-to-end data integrity by including an MD5 checksum of your object's data in the ETag header. You are not required to include the ETag header, but it is recommended to ensure that the storage system successfully stored your object's content.
Dan
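Since non-multipart S3 uploads expose the object's MD5 as its ETag, the comparison Dan describes can be made without downloading the object. A sketch in plain Python - the ETag string would come from a HEAD request via boto in practice, and note the caveat that multipart-upload ETags are not plain MD5s:

```python
import hashlib

def etag_matches(local_bytes, etag):
    """Compare a local file's MD5 against an S3-style ETag.

    Returns True/False on a plain-MD5 ETag, or None when the ETag
    came from a multipart upload and is not a usable MD5.
    """
    etag = etag.strip('"')  # S3 returns the ETag wrapped in quotes
    if "-" in etag:
        return None  # multipart upload: ETag is md5-of-md5s, not a file MD5
    return hashlib.md5(local_bytes).hexdigest() == etag
```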
ptone <pres...@ptone.com> Oct 08 10:06AM -0700
On Monday, October 8, 2012 8:49:58 AM UTC-7, Dan Loewenherz wrote:
> As a result, it's a questionable metric for figuring out if a file is the same or not, since every team member's local machine thinks they were all just created! We end up re-uploading the file every time.
While git may be common, and your problem not unique, this is still a condition of your dev environment rendering modification dates invalid. There might be other situations where this is the case (I've run into scripts that muck with modification dates based on camera/JPEG metadata).
So after some further discussion on IRC, it was determined that MD5, while somewhat common, is far from a standard, and is likely not to be available as a remote call for network-based storage backends. And so the final resolution is to wontfix the ticket.
In the end, this lack of a universal fingerprint is just a limitation of our storage tools.
-Preston
Alex Ogier <alex.og...@gmail.com> Oct 08 01:23PM -0400
> In the end, this lack of a universal fingerprint is just a limitation of our storage tools.
> -Preston
Is there a reason this fingerprint must be universal? If you're dealing with a backend like S3, where network latency and expensive writes are a problem, but MD5 is a built-in remote call (available on any GET), why not just do an MD5 sum in the _save() method? Basically, just buffer the File object you receive, take an MD5 in Python, and then decide whether to upload or not. In the common case of reading from local disk and writing to S3, this is a big win, and doesn't require cooperation from any other backends, or standardizing on MD5 as a fingerprint method.
Best,
Alex Ogier
Jeremy Dunck <jdu...@gmail.com> Oct 08 08:14PM -0700
Would it be reasonable to have a backend-specific hook to determine a fingerprint, where that could be mtime or md5 or what-have-you, as long as equality (or maybe ordering) works?
You received this message because you are subscribed to the Google Group django-developers.
You can post via email <mailto:django-developers@googlegroups.com>.
To unsubscribe from this group, send an empty message to <mailto:django-developers+unsubscr...@googlegroups.com>.
For more options, visit this group at <http://groups.google.com/group/django-developers/topics>.