[issue13703] Hash collision security issue

2012-01-03 Thread Paul McMillan
Paul McMillan added the comment: I agree that we should enable randomness by default, and provide an easy way for users to disable it if necessary (unit test suites that explicitly depend on order being an obvious candidate). I'll link my proposed algorithm change here, for the record:

[issue13703] Hash collision security issue

2012-01-03 Thread Paul McMillan
Paul McMillan added the comment: A couple of things here: First, my proposed change is not cryptographically secure. There simply aren't any cryptographic hashing algorithms available that are in the performance class we need. My proposal does make the collision attack quite difficu

[issue13703] Hash collision security issue

2012-01-04 Thread Paul McMillan
Paul McMillan added the comment: This is not something that can be fixed by limiting the size of POST/GET. Parsing documents (even offline) can generate these problems. I can create books that calibre (a Python-based ebook format shifting tool) can't convert, but are otherwise perf

[issue13703] Hash collision security issue

2012-01-04 Thread Paul McMillan
Paul McMillan added the comment:
> My proposition only adds two XOR to hash(str) (outside the loop on Unicode
> characters), so I expect a ridiculous overhead. I don't know yet how hard it
> is to guess the secret from hash(str) output.
It doesn't work much better than a
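As an illustration of the quoted proposition, here is a minimal sketch of a string hash that mixes in a secret with two XORs outside the per-character loop. It assumes a simplified multiplicative hash loosely modeled on the classic CPython string hash; the names `seeded_hash`, `prefix`, and `suffix` are hypothetical and are not taken from the patch under discussion.

```python
import secrets

MASK = (1 << 64) - 1  # keep values within 64 bits


def seeded_hash(text, prefix=0, suffix=0):
    """Toy string hash with a secret XORed in before and after the loop.

    Assumption: a simplified model for illustration, not CPython's real code.
    """
    if not text:
        return 0
    x = prefix ^ (ord(text[0]) << 7)   # first XOR: secret prefix
    for ch in text:                    # the per-character loop is unchanged
        x = ((1000003 * x) ^ ord(ch)) & MASK
    x ^= len(text)
    return (x ^ suffix) & MASK         # second XOR: secret suffix


# Per-process secrets, chosen once at startup.
PREFIX = secrets.randbits(64)
SUFFIX = secrets.randbits(64)

print(hex(seeded_hash("spam")))                  # unseeded: same in every run
print(hex(seeded_hash("spam", PREFIX, SUFFIX)))  # seeded: varies between runs
```

The extra cost is two XOR operations per call regardless of string length, which matches the low overhead the quoted text expects. Whether the secret can be recovered from observed outputs is a separate question that this sketch does not address.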

[issue13703] Hash collision security issue

2012-01-04 Thread Paul McMillan
Paul McMillan added the comment:
> - for small strings we could use a different seed than for larger strings
Or just leave them unseeded with our existing algorithm. Shifting them into a different part of the hash space doesn't really gain us much.
> - for larger strings we could

[issue13703] Hash collision security issue

2012-01-05 Thread Paul McMillan
Paul McMillan added the comment: Marc-Andre: Victor already pasted the relevant part of my code:
http://bugs.python.org/issue13703#msg150568
The link to the fuller version, with revision history and a copy of the code before I modified it, is here:
https://gist.github.com/0a91e52efa74f61858b5

[issue13703] Hash collision security issue

2012-01-05 Thread Paul McMillan
Paul McMillan added the comment: As Alex said, Java has refused to fix the issue. I believe that Ruby 1.9 (at least the master branch code that I looked at) is using murmurhash2 with a random seed. In either case, yes, these functions are vulnerable to a number of attacks. We're solvin

[issue13703] Hash collision security issue

2012-01-06 Thread Paul McMillan
Paul McMillan added the comment:
> Those who use or advocate a simple randomized starting hash (Perl, Ruby,
> perhaps MS, and the CCC presenters) are presuming that the randomized hash
> values are kept private. Indeed, they should be (and the docs could note
> this) unless an

[issue13703] Hash collision security issue

2012-01-06 Thread Paul McMillan
Paul McMillan added the comment:
> An attack can be based on trying to find many objects with the same
> hash value, or trying to find many objects that, as they get inserted
> into a dictionary, very often cause collisions due to the collision
> resolution algorithm not finding
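The first kind of attack in the quoted text can be illustrated with a small sketch. The class `Colliding` is hypothetical: it forces every instance to the same hash value, so each dictionary insertion has to compare the new key against the keys already stored, and the total work grows roughly quadratically with the number of keys.

```python
class Colliding:
    """Hypothetical key type whose instances all share one hash value."""

    comparisons = 0  # class-wide counter of __eq__ calls

    def __init__(self, name):
        self.name = name

    def __hash__(self):
        return 42  # every key follows the same probe sequence

    def __eq__(self, other):
        Colliding.comparisons += 1
        return self is other


def comparisons_for(n):
    """Insert n colliding keys into a dict and return the number of __eq__ calls."""
    Colliding.comparisons = 0
    table = {}
    for i in range(n):
        table[Colliding(i)] = None
    return Colliding.comparisons


# Doubling the number of keys roughly quadruples the comparison count.
for n in (100, 200, 400):
    print(n, comparisons_for(n))
```

A real attack does not control `__hash__`; it supplies strings chosen so that the built-in string hash produces the same effect.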

[issue13703] Hash collision security issue

2012-01-07 Thread Paul McMillan
Paul McMillan added the comment:
> Alex, I agree the issue has to do with the origin of the data, but the
> modules listed are the ones that deal with the data supplied by this
> particular attack.
They deal directly with the data. Do any of them pass the data further, or does the

[issue13703] Hash collision security issue

2012-01-08 Thread Paul McMillan
Paul McMillan added the comment:
> Christian Heimes added the comment:
> Ouch, the startup impact is large! Have we reached a point where "one size
> fits all" doesn't work any longer? It's getting harder to have just one
> executable for 500ms scripts and serv

[issue13703] Hash collision security issue

2012-01-21 Thread Paul McMillan
Paul McMillan added the comment:
On Sat, Jan 21, 2012 at 3:47 PM, Alex Gaynor wrote:
> I'm able to put N pieces of data into the database on successive requests,
> but then *rendering* that data puts it in a dictionary, which renders that
> page unviewable by anyone.
This an

[issue13703] Hash collision security issue

2012-01-23 Thread Paul McMillan
Paul McMillan added the comment:
> I think you're asking a bit much here :-) A broken app is a broken
> app, no matter how nice Python tries to work around it. If an
> app puts too much trust into user data, it will be vulnerable
> one way or another and regardless of how the

[issue14297] Custom string formatter doesn't work like builtin str.format

2012-03-13 Thread Paul McMillan
New submission from Paul McMillan: In Python 2.7, I can use an empty format string to indicate automatically numbered positional arguments when formatting the builtin string object. http://docs.python.org/release/2.7/library/string.html#format-string-syntax If I subclass string.Formatter
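A minimal sketch of the comparison this report sets up. It uses only the standard `string.Formatter` class; the subclass name `PlainFormatter` is hypothetical. The outcome for the subclass depends on the interpreter version, so the sketch reports either result instead of assuming one.

```python
import string


class PlainFormatter(string.Formatter):
    """Hypothetical subclass that changes nothing, to isolate the base behavior."""


template = "{} and {}"

# Built-in str.format numbers empty fields automatically.
print(template.format("spam", "eggs"))

# The same template through a string.Formatter subclass.
try:
    print(PlainFormatter().format(template, "spam", "eggs"))
except (KeyError, IndexError, ValueError) as exc:
    # Interpreters whose string.Formatter lacks auto-numbering end up here.
    print("Formatter failed: %s %r" % (type(exc).__name__, exc))
```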

[issue23505] Urlparse insufficient validation leads to open redirect

2015-03-03 Thread Paul McMillan
Paul McMillan added the comment: While some websites may use urlunparse(urlparse(url)) to validate a URL, this is far from standard practice. Django, for instance, does not use this method. While I agree we should clean this behavior up, this is not a vulnerability in core Python, and we need
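To make the round trip concrete, here is a small sketch using Python 3's `urllib.parse` (the thread concerns the Python 2 `urlparse` module, which exposes the same two functions). The URL is a made-up example. Extra leading slashes parse to an empty network location, so code that only inspects `netloc` sees what looks like a local path. The exact string the round trip returns has differed between Python versions, so it is printed rather than assumed.

```python
from urllib.parse import urlparse, urlunparse

url = "////evil.example/path"  # made-up input with extra leading slashes

parts = urlparse(url)
print(repr(parts.netloc))  # empty: no host was recognized
print(parts.path)          # the host-looking text ended up in the path

# Round trip as described in the comment; output depends on the Python version.
print(urlunparse(parts))
```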

[issue23505] Urlparse insufficient validation leads to open redirect

2015-04-07 Thread Paul McMillan
Paul McMillan added the comment: As Martin said, backporting a change for this wouldn't be appropriate for Python 2.7. The 2.7 docs could certainly be updated to make this clearer, but we can't introduce a breaking change like that to the stab