Yes there's a risk of causing collisions if a user can control the responses. But that doesn't mean there's really a security concern - since the user creating the collision *can already upload arbitrary content to the django site.* The collision wouldn't really do anything but cause a stale asset to appear - which is a known risk with conditional GET's to begin with.
Additionally, e-tag calculation needs to be *fast* since it is run on every response. So moving to a slower hashing algorithm like SHA would be a performance degradation. There are some other deliberately fast hashing algorithms out there, though I don't know if they carry any advantages in this situation over MD5. For those concerned, we could make etag collisions slightly less easy to forge by including a salt. Perhaps this could be derived from SECRET_KEY or some other setting. I'm not sure SECRET_KEY is appropriate though... if we used that SECRET_KEY because then we'd be sending a payload of content, with etag=MD5(content + SECRET_KEY), which could be cracked to retrieve SECRET_KEY. ConditionalGetMiddleware only sets the ETag header if it's not already set. So if you're interested, you can experiment with different ETag algorithms by adding a custom middleware below it in your MIDDLEWARE setting. Before we change anything it would also be useful to see a summary of how other major servers and frameworks have settled on creating ETags for dynamic content - such as Ruby, Nginx, Apache, Nginx, CloudFlare. If there's any problem with collisions they'll likely have already solved it. one thing to consider might be that md5 is usually disabled for FIPS > enabled system > This is true, but if no one has complained, why "fix" it? As covered above one can always implement custom ETag header generation. On Mon, 14 Sep 2020 at 08:41, Florian Apolloner <f.apollo...@gmail.com> wrote: > Hi, > > one thing to consider might be that md5 is usually disabled for FIPS > enabled system (ie https://code.djangoproject.com/ticket/28401 ). So if > we are changing something here we might also consider this. > > Cheers, > Florian > > On Friday, September 11, 2020 at 9:41:51 AM UTC+2 gertb...@gmail.com > wrote: > >> I'm not so sure this is a problem (wrt to using md5 hash of response >> content for ETags and likely also for cache keys). The probability of a >> naturally occurring collision with MD5 is 1.47*10-29 [1] so the risk of >> this scenario occurring by accident is extremely remote. >> >> If we assume that User 2 is a malicious actor and can reliably generate >> collisions (which is not practically possible within typical cache >> durations), then User 1 is likely still no worse off. Maybe some security >> researchers can comment on the threat model in this case. >> >> Lastly, the practical use of content hash based ETags are limited due to >> the processing overhead required for them (as implemented in >> ConditionalGetMiddleware). It might be useful for a low traffic/value >> deployments and then the risks are likely not a problem. Switching to a >> more cryptographically secure hashing algorithm will increase the overhead >> and provide little additional benefit to such installations. As mentioned >> earlier, production deployments will often use Apache or Nginx, and neither >> generate ETags using file content hashes. >> >> [1] https://www.avira.com/en/blog/md5-the-broken-algorithm >> >> On Fri, 11 Sep 2020 at 08:25, Francisco Couzo <franci...@gmail.com> >> wrote: >> >>> If changing ConditionalGetMiddleware to use SHA-256 >>> It also might be a good to change it on FileBasedCache, >>> _generate_cache_key, and generate_cache_header_key too >>> Also, _generate_cache_key is just blindly concatenating hashes, so >>> ['foo', 'bar'] is equal to ['fo', 'obar'], I don't know it might be a >>> problem, but it just doesn't looks right >>> >>> >>> On Thu, Sep 10, 2020 at 10:14 PM Taymon A. Beal <taymo...@gmail.com> >>> wrote: >>> >>>> That attack doesn't work with the recommended production setup because >>>> Django doesn't serve uploaded files in that setup. >>>> >>>> That being said, some users might be doing that anyway since setting >>>> up production-worthy upload hosting is such a pain, and even if they >>>> don't, they might have other views that somehow allow users to control >>>> the response body. So I think this should be treated as a >>>> not-super-severe-but-still-worth-fixing security issue. >>>> >>>> What backwards-compatibility considerations exist? Do we consider it >>>> normal for upgrading to a different Django version to bust users' >>>> caches? I can't immediately think of any bad consequences of changing >>>> the hash function, apart from that one. Busting users' caches doesn't >>>> sound that terrible, given that, even if the hash function were >>>> changed on every release (which of course it wouldn't be; SHA-2 has >>>> been the most generally-recommended hash function for 15 years and >>>> there are no signs that this will change), it would still only happen >>>> once every eight months, and it's fairly rare for anything to be >>>> cached that long in the first place, I think. >>>> >>>> Taymon >>>> >>>> >>>> On Thu, Sep 10, 2020 at 1:16 PM Francisco Couzo <franci...@gmail.com> >>>> wrote: >>>> > >>>> > User 1 uploads a file >>>> > User 2 downloads it, and caches it >>>> > User 1 uploads a new file to the same URL, with the same MD5 hash >>>> > User 2 will keep using the old file indefinitely >>>> > >>>> > Sure, user 1 has to upload two files with the same hash on purpose >>>> > >>>> > On Thu, Sep 10, 2020 at 11:07 AM Adam Johnson <m...@adamj.eu> wrote: >>>> >> >>>> >> What would this protect against? >>>> >> >>>> >> On Thu, 10 Sep 2020 at 03:56, Francisco Couzo <franci...@gmail.com> >>>> wrote: >>>> >>> >>>> >>> I think it would be a good idea to make ConditionalGetMiddleware >>>> use a hash function that's not as easy to find a collision as MD5, most >>>> probably SHA-256 or BLAKE2. >>>> >>> I don't see a problem with just changing it, it will just >>>> invalidate the old cache. >>>> >>> If there's an agreement on changing the hash function, I can make >>>> the PR. >>>> >>> >>>> >>> >>>> >>> >>>> >>> >>>> >>> >>>> >>> >>>> >>> >>>> >>> >>>> >>> >>>> >>> -- >>>> >>> >>>> >>> >>>> >>> You received this message because you are subscribed to the Google >>>> Groups "Django developers (Contributions to Django itself)" group. >>>> >>> >>>> >>> >>>> >>> To unsubscribe from this group and stop receiving emails from it, >>>> send an email to django-develop...@googlegroups.com. >>>> >>> >>>> >>> >>>> >>> To view this discussion on the web visit >>>> https://groups.google.com/d/msgid/django-developers/ff591d46-97fc-43d6-9d1c-d0ba24d7b1a8n%40googlegroups.com >>>> . >>>> >>> >>>> >>> >>>> >> -- >>>> >> Adam >>>> >> >>>> >> -- >>>> >> You received this message because you are subscribed to the Google >>>> Groups "Django developers (Contributions to Django itself)" group. >>>> >> To unsubscribe from this group and stop receiving emails from it, >>>> send an email to django-develop...@googlegroups.com. >>>> >> To view this discussion on the web visit >>>> https://groups.google.com/d/msgid/django-developers/CAMyDDM1Db5tRDNXQEMLuX6UdOmjhUq0_-Dr_9kK%3DB5AUqsXG%2Bg%40mail.gmail.com >>>> . >>>> > >>>> > -- >>>> > You received this message because you are subscribed to the Google >>>> Groups "Django developers (Contributions to Django itself)" group. >>>> > To unsubscribe from this group and stop receiving emails from it, >>>> send an email to django-develop...@googlegroups.com. >>>> > To view this discussion on the web visit >>>> https://groups.google.com/d/msgid/django-developers/CAHx8S1vpoP9txnZrV4fTfwRJBCF2q-QtjnYpw%2BwAgGKbg4V5yQ%40mail.gmail.com >>>> . >>>> >>>> -- >>>> You received this message because you are subscribed to the Google >>>> Groups "Django developers (Contributions to Django itself)" group. >>>> To unsubscribe from this group and stop receiving emails from it, send >>>> an email to django-develop...@googlegroups.com. >>>> To view this discussion on the web visit >>>> https://groups.google.com/d/msgid/django-developers/CAHRQ%3D85KUrrsv%3DeY7YW%3DXb%3DeD%2BdY9-eeeQQW0YuB5ckDmUgfEg%40mail.gmail.com >>>> . >>>> >>> -- >>> You received this message because you are subscribed to the Google >>> Groups "Django developers (Contributions to Django itself)" group. >>> To unsubscribe from this group and stop receiving emails from it, send >>> an email to django-develop...@googlegroups.com. >>> >> To view this discussion on the web visit >>> https://groups.google.com/d/msgid/django-developers/CAHx8S1tk%2BjjKgpkOdaKyMti8DeLiLLcO6GxjHJTguc47Q-P80g%40mail.gmail.com >>> <https://groups.google.com/d/msgid/django-developers/CAHx8S1tk%2BjjKgpkOdaKyMti8DeLiLLcO6GxjHJTguc47Q-P80g%40mail.gmail.com?utm_medium=email&utm_source=footer> >>> . >>> >> -- > You received this message because you are subscribed to the Google Groups > "Django developers (Contributions to Django itself)" group. > To unsubscribe from this group and stop receiving emails from it, send an > email to django-developers+unsubscr...@googlegroups.com. > To view this discussion on the web visit > https://groups.google.com/d/msgid/django-developers/bdc64ced-1f75-414a-9e44-1c7fa606f5f9n%40googlegroups.com > <https://groups.google.com/d/msgid/django-developers/bdc64ced-1f75-414a-9e44-1c7fa606f5f9n%40googlegroups.com?utm_medium=email&utm_source=footer> > . > -- Adam -- You received this message because you are subscribed to the Google Groups "Django developers (Contributions to Django itself)" group. To unsubscribe from this group and stop receiving emails from it, send an email to django-developers+unsubscr...@googlegroups.com. To view this discussion on the web visit https://groups.google.com/d/msgid/django-developers/CAMyDDM3KxpC2XgOYaeW%2Bz%2BJXwOMWwEZkUtSxGuwUBCBuxBvKaA%40mail.gmail.com.