Re: [Python-Dev] Status of the fix for the hash collision vulnerability

2012-01-13 Thread Frank Sievertsen
Am 13.01.2012 02:24, schrieb Victor Stinner: My patch doesn't fix the DoS, it just make the attack more complex. The attacker cannot pregenerate data for an attack: (s)he has first to compute the hash secret, and then compute hash collisions using the secret. The hash secret is a least 64 bits lo

Re: [Python-Dev] Status of the fix for the hash collision vulnerability

2012-01-13 Thread Frank Sievertsen
Unfortunately it requires only a few seconds to compute enough 32bit collisions on one core with no precomputed data. Are you running the hash function "backward" to generate strings with the same value, or you are more trying something like brute forcing? If you try it brute force to hit a s

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Frank Sievertsen
The main issue with that approach is that it allows a new kind of attack. Indeed, I posted another example: http://bugs.python.org/msg151677 This kind of fix can be used in a specific application or maybe in a special-purpose framework, but not on the level of a general-purpose language. Frank

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Frank Sievertsen
No, that's not true. Whenever a collision happens, other bits are mixed in very fast. Frank Am 20.01.2012 13:08, schrieb Victor Stinner: I'm surprised we haven't seen bug reports about it from users of 64-bit Pythons long ago A Python dictionary only uses the lower bits of a hash value. If you

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Frank Sievertsen
Hello, I still see at least two ways to create a DOS attack even with the collison-counting-patch. I assumed that it's possible to send ~500KB of payload to the application. 1. It's fully deterministic which slots the dict will lookup. Since we don't count slot-collisions, but only hash-value-c

Re: [Python-Dev] Counting collisions for the win

2012-01-20 Thread Frank Sievertsen
Am 20.01.2012 16:33, schrieb Guido van Rossum: (I'm thinking that the original attack is trivial once the set of 65000 colliding keys is public knowledge, which must be only a matter of time. I think it's very likely that this will happen soon. For ASP and PHP there is attack-payload publicl

Re: [Python-Dev] Counting collisions for the win

2012-01-23 Thread Frank Sievertsen
Hello, I'd still prefer to see a randomized hash()-function (at least for 3.3). But to protect against the attacks it would be sufficient to use randomization for collision resolution in dicts (and sets). What if we use a second (randomized) hash-function in case there are many collisions in ON

Re: [Python-Dev] Counting collisions for the win

2012-01-23 Thread Frank Sievertsen
On 23.01.2012 19:25, Glenn Linderman wrote: So this sounds like SafeDict, but putting it under the covers and automatically converting from dict to SafeDict after a collision threshold has been reached. Let's call it fallback-dict. and costs: * converting the dict from one hash to the othe

Re: [Python-Dev] Counting collisions for the win

2012-01-24 Thread Frank Sievertsen
Interesting idea, and I see it would avoid conversions. What happens if the data area also removed from the hash? So you enter 20 colliding keys, then 20 more that get randomized, then delete the 18 of the first 20. Can you still find the second 20 keys? Takes two sets of probes, somehow?

Re: [Python-Dev] Hashing proposal: 64-bit hash

2012-01-27 Thread Frank Sievertsen
As already mentioned, the vulnerability of 64-bit Python rather theoretical and not practical. The size of the hash makes the attack is extremely unlikely. Unfortunately this assumption is not correct. It works very good with 64bit-hashing. It's much harder to create (efficiently) 64-bit has

Re: [Python-Dev] Hashing proposal: 64-bit hash

2012-01-28 Thread Frank Sievertsen
The point is not the length of the string, but the size of string space for inspection. To search for a string with a specified 64-bit hash to iterate over 2 ** 64 strings. Spending on a single string scan 1 nanosecond (a very optimistic estimate), it would take 2 ** 64 / 1e9 / (3600 * 24 *