Re: [Python-Dev] Hash collision security issue (now public)

2012-01-09 Thread Victor Stinner
> That said, I don't think smallest-format is actually enforced with > anything stronger than comments (such as in unicodeobject.h struct > PyASCIIObject) and asserts (mostly calling > _PyUnicode_CheckConsistency).  I don't have any insight on how > prevalent non-conforming strings will be in pract

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-09 Thread Stefan Behnel
Jim Jewett, 08.01.2012 23:33: > Stefan Behnel wrote: >> Admittedly, this may require some adaptation for the PEP393 unicode memory >> layout in order to produce identical hashes for all three representations >> if they represent the same content. > > They SHOULD NOT represent the same content; com

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-08 Thread Brian Curtin
On Sun, Jan 8, 2012 at 16:33, Jim Jewett wrote: > In http://mail.python.org/pipermail/python-dev/2012-January/115368.html > Stefan Behnel wrote: Can you please configure your mail client to not create new threads like this? As if this topic wasn't already hard enough to follow, it now exists acro

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-07 Thread Terry Reedy
On 1/7/2012 12:57 PM, Christian Heimes wrote: Am 07.01.2012 12:02, schrieb Stefan Behnel: Admittedly, this may require some adaptation for the PEP393 unicode memory layout in order to produce identical hashes for all three representations if they represent the same content. So it's not a drop-

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-07 Thread Christian Heimes
Am 07.01.2012 12:02, schrieb Stefan Behnel: > Wouldn't Bob Jenkins' "lookup3" hash function fit in here? After all, it's > portable, known to provide a very good distribution for different string > values and is generally fast on both 32 and 64 bit architectures. > > http://burtleburtle.net/bob/c/

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-07 Thread Stefan Behnel
Christian Heimes, 31.12.2011 04:59: > Am 31.12.2011 03:22, schrieb Victor Stinner: > The unique structure of CPython's dict implementation makes it harder to > get the number of values with equal hash. The academic hash map (the one > I learnt about at university) uses a bucket to store all element

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-06 Thread Glenn Linderman
On 1/5/2012 4:10 PM, Nick Coghlan wrote: On Fri, Jan 6, 2012 at 8:15 AM, Serhiy Storchaka wrote: 05.01.12 21:14, Glenn Linderman написав(ла): So, fixing the vulnerable packages could be a sufficient response, rather than changing the hash function. How to fix? Each of those above allocates a

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-06 Thread Paul Moore
On 6 January 2012 20:25, Mark Shannon wrote: > Hi, > > It seems to me that half the folk discussing this issue want a super-strong, > resist-all-hypothetical-attacks hash with little regard to performance. The > other half want no change or a change that will have no  observable effect. > (I may b

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-06 Thread Mark Shannon
Hi, It seems to me that half the folk discussing this issue want a super-strong, resist-all-hypothetical-attacks hash with little regard to performance. The other half want no change or a change that will have no observable effect. (I may be exaggerating a little.) Can I propose the followi

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-06 Thread Victor Stinner
Using my patch (random-2.patch), the overhead is 0%. I cannot see a difference with and without my patch. Numbers: --- unpatched: == 3 characters == 1 loops, best of 3: 459 usec per loop == 10 characters == 1 loops, best of 3: 575 usec per loop == 500 characters == 1 loops, best of 3: 1.36 msec pe

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-06 Thread Steven D'Aprano
Glenn Linderman wrote: On 1/5/2012 5:52 PM, Steven D'Aprano wrote: At some point, presuming that there is no speed penalty, the behaviour will surely become not just enabled by default but mandatory. Python has never promised that hashes must be predictable or consistent, so apart from backw

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-06 Thread Mark Shannon
Serhiy Storchaka wrote: 06.01.12 02:10, Nick Coghlan написав(ла): Not a good idea - a lot of the 3rd party tests that depend on dict ordering are going to be using those modules anyway, so scattering our solution across half the standard library is needlessly creating additional work without rea

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Serhiy Storchaka
06.01.12 02:10, Nick Coghlan написав(ла): Not a good idea - a lot of the 3rd party tests that depend on dict ordering are going to be using those modules anyway, so scattering our solution across half the standard library is needlessly creating additional work without really reducing the incompat

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Glenn Linderman
On 1/5/2012 5:52 PM, Steven D'Aprano wrote: At some point, presuming that there is no speed penalty, the behaviour will surely become not just enabled by default but mandatory. Python has never promised that hashes must be predictable or consistent, so apart from backwards compatibility conce

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Steven D'Aprano
Benjamin Peterson wrote: 2012/1/5 Steven D'Aprano : [...] There's nothing obscure about directly testing the hash. That's about as far from obscure as it is possible to get: you are directly testing the presence of a feature by testing the feature. It's obscure because hash('') != 0 doesn't n

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Christian Heimes
Am 06.01.2012 03:04, schrieb Benjamin Peterson: > It's obscure because hash('') != 0 doesn't necessarily mean the hashes > are randomized. A different hashing algorithm could be in effect. Also in 1 of 2**32 or 2**64 tries hash('') is 0 although randomizing is active. Christian __

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Benjamin Peterson
2012/1/5 Steven D'Aprano : > Benjamin Peterson wrote: >> >> 2012/1/5 Nick Coghlan : >>> >>> On Fri, Jan 6, 2012 at 10:07 AM, Steven D'Aprano >>> wrote: Surely the way to verify the behaviour is to run this from the shell: python -c print(hash("abcde")) twice, and see

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Steven D'Aprano
Benjamin Peterson wrote: 2012/1/5 Nick Coghlan : On Fri, Jan 6, 2012 at 10:07 AM, Steven D'Aprano wrote: Surely the way to verify the behaviour is to run this from the shell: python -c print(hash("abcde")) twice, and see that the calls return different values. (Or have I misunderstood the wa

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Nick Coghlan
On Fri, Jan 6, 2012 at 10:59 AM, Benjamin Peterson wrote: > What exactly is the disadvantage of a sys attribute? That would seem > preferable to an obscure incarnation like that. Adding sys attributes in maintenance (or security) releases makes me nervous. However, Victor and Christian are right

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Antoine Pitrou
On Fri, 06 Jan 2012 01:50:00 +0100 Christian Heimes wrote: > Am 06.01.2012 01:34, schrieb Nick Coghlan: > > On Fri, Jan 6, 2012 at 10:07 AM, Steven D'Aprano > > wrote: > >> Surely the way to verify the behaviour is to run this from the shell: > >> > >> python -c print(hash("abcde")) > >> > >> tw

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Benjamin Peterson
2012/1/5 Nick Coghlan : > On Fri, Jan 6, 2012 at 10:07 AM, Steven D'Aprano wrote: >> Surely the way to verify the behaviour is to run this from the shell: >> >> python -c print(hash("abcde")) >> >> twice, and see that the calls return different values. (Or have I >> misunderstood the way the fix i

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Christian Heimes
Am 06.01.2012 01:34, schrieb Nick Coghlan: > On Fri, Jan 6, 2012 at 10:07 AM, Steven D'Aprano wrote: >> Surely the way to verify the behaviour is to run this from the shell: >> >> python -c print(hash("abcde")) >> >> twice, and see that the calls return different values. (Or have I >> misunderstoo

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Victor Stinner
2012/1/6 Barry Warsaw : >>Settings for PYRANDOMHASH: >> >> PYRANDOMHASH=1 >>   enable randomized hashing function >> >> PYRANDOMHASH=/path/to/seed >>   enable randomized hashing function and read seed from 'seed' >> >> PYRANDOMHASH=0 >>   disable randomed hashing function > > Why not PYTHONHASHSEED

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Nick Coghlan
On Fri, Jan 6, 2012 at 10:07 AM, Steven D'Aprano wrote: > Surely the way to verify the behaviour is to run this from the shell: > > python -c print(hash("abcde")) > > twice, and see that the calls return different values. (Or have I > misunderstood the way the fix is going to work?) > > In any cas

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Barry Warsaw
On Jan 05, 2012, at 10:40 PM, Christian Heimes wrote: >Hey Barry, stop stealing my ideas! :) I've argued for these default >settings for days. :) >I've suggested the env var PYRANDOMHASH. It's easy to set env vars in >Apache. For example Debian/Ubuntu has /etc/apache2/envvars. For consistency,

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Steven D'Aprano
David Malcolm wrote: When backporting the fix to ancient python versions, I'm inclined to turn the change *off* by default, requiring the change to be enabled via an environment variable: I want to avoid breaking existing code, even if such code is technically relying on non-guaranteed behavior.

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Terry Reedy
On 1/5/2012 3:10 PM, Ethan Furman wrote: Tres Seaver wrote: 1) the security problem is not in CPython, but rather in web servers that use dict inappropriately. Most webapp vulnerabilities are due to their use of Python's cgi module, which it uses a dict to hold the form / query string data b

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Nick Coghlan
On Fri, Jan 6, 2012 at 8:15 AM, Serhiy Storchaka wrote: > 05.01.12 21:14, Glenn Linderman написав(ла): >> >> So, fixing the vulnerable packages could be a sufficient response, >> rather than changing the hash function.  How to fix?  Each of those >> above allocates and returns a dict.  Simply have

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Serhiy Storchaka
05.01.12 21:14, Glenn Linderman написав(ла): So, fixing the vulnerable packages could be a sufficient response, rather than changing the hash function. How to fix? Each of those above allocates and returns a dict. Simply have each of those allocate and return and wrapped dict, which has the fo

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Christian Heimes
Am 05.01.2012 22:59, schrieb Antoine Pitrou: > I don't think we (python-dev) are really concerned with 2.3, 2.4, > 2.5 and 3.0. They're all unsupported, and people do what they want > with their local source trees. Let me reply with a quote from Barry: > Correct, although there's no reason why a

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Antoine Pitrou
On Thu, 05 Jan 2012 22:40:58 +0100 Christian Heimes wrote: > Am 05.01.2012 21:45, schrieb Barry Warsaw: > > This sounds like a reasonable compromise for all stable Python releases. It > > can be turned on by default for Python 3.3. If you also make the default > > setting easy to change (i.e. pa

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Christian Heimes
Am 05.01.2012 21:10, schrieb Ethan Furman: > Tres Seaver wrote: >> -BEGIN PGP SIGNED MESSAGE- >> Hash: SHA1 >> >> On 01/05/2012 02:14 PM, Glenn Linderman wrote: >>> 1) the security problem is not in CPython, but rather in web servers >>> that use dict inappropriately. >> >> Most webapp vuln

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Christian Heimes
Am 05.01.2012 21:45, schrieb Barry Warsaw: > This sounds like a reasonable compromise for all stable Python releases. It > can be turned on by default for Python 3.3. If you also make the default > setting easy to change (i.e. parameterized in one place), then distros can > make their own decisio

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Georg Brandl
On 01/05/2012 09:45 PM, Barry Warsaw wrote: > On Jan 05, 2012, at 02:33 PM, David Malcolm wrote: > >>We have similar issues in RHEL, with the Python versions going much >>further back (e.g. 2.3) >> >>When backporting the fix to ancient python versions, I'm inclined to >>turn the change *off* by de

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread David Malcolm
On Thu, 2012-01-05 at 20:35 +, Paul Moore wrote: > On 5 January 2012 19:33, David Malcolm wrote: > > We have similar issues in RHEL, with the Python versions going much > > further back (e.g. 2.3) > > > > When backporting the fix to ancient python versions, I'm inclined to > > turn the change

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Toshio Kuratomi
On Thu, Jan 05, 2012 at 08:35:57PM +, Paul Moore wrote: > On 5 January 2012 19:33, David Malcolm wrote: > > We have similar issues in RHEL, with the Python versions going much > > further back (e.g. 2.3) > > > > When backporting the fix to ancient python versions, I'm inclined to > > turn the

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Barry Warsaw
On Jan 05, 2012, at 08:35 PM, Paul Moore wrote: >Uh, surely no-one is suggesting backporting to "ancient" versions? I >couldn't find the statement quickly on the python.org website (so this >is via google), but isn't it true that 2.6 is in security-only mode >and 2.5 and earlier will never get the

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Barry Warsaw
On Jan 05, 2012, at 02:33 PM, David Malcolm wrote: >We have similar issues in RHEL, with the Python versions going much >further back (e.g. 2.3) > >When backporting the fix to ancient python versions, I'm inclined to >turn the change *off* by default, requiring the change to be enabled via >an env

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Paul Moore
On 5 January 2012 19:33, David Malcolm wrote: > We have similar issues in RHEL, with the Python versions going much > further back (e.g. 2.3) > > When backporting the fix to ancient python versions, I'm inclined to > turn the change *off* by default, requiring the change to be enabled via > an env

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Ethan Furman
Tres Seaver wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 01/05/2012 02:14 PM, Glenn Linderman wrote: 1) the security problem is not in CPython, but rather in web servers that use dict inappropriately. Most webapp vulnerabilities are due to their use of Python's cgi module, which it

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Glenn Linderman
On 1/5/2012 11:49 AM, Tres Seaver wrote: -BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 01/05/2012 02:14 PM, Glenn Linderman wrote: 1) the security problem is not in CPython, but rather in web servers that use dict inappropriately. Most webapp vulnerabilities are due to their use of Python's

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Tres Seaver
-BEGIN PGP SIGNED MESSAGE- Hash: SHA1 On 01/05/2012 02:14 PM, Glenn Linderman wrote: > 1) the security problem is not in CPython, but rather in web servers > that use dict inappropriately. Most webapp vulnerabilities are due to their use of Python's cgi module, which it uses a dict to ho

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread David Malcolm
On Thu, 2012-01-05 at 19:34 +0200, Maciej Fijalkowski wrote: > On Thu, Jan 5, 2012 at 3:39 PM, Antoine Pitrou wrote: > > On Thu, 5 Jan 2012 15:26:27 +1100 > > Andrew Bennetts wrote: > >> > >> I don't think that's news either. > >> http://mail.python.org/pipermail/python-dev/2003-May/035907.html a

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Antoine Pitrou
On Thu, 5 Jan 2012 19:34:13 +0200 Maciej Fijalkowski wrote: > > Just to make things clear - stdlib itself has 1/64 of tests relying on > dict order. Changing dict order in *older* pythons will break > everyone's tests and some peoples code. Breaking tests is not a problem: they are typically not

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Glenn Linderman
On 1/5/2012 9:34 AM, Maciej Fijalkowski wrote: Also consider that new 2.6.x would go as a security fix to old ubuntu, but all other packages won't, because they'll not contain security fixes. Just so you know Why should CPython by constrained by broken policies of Ubuntu? If the other package

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Maciej Fijalkowski
On Thu, Jan 5, 2012 at 3:39 PM, Antoine Pitrou wrote: > On Thu, 5 Jan 2012 15:26:27 +1100 > Andrew Bennetts wrote: >> >> I don't think that's news either. >> http://mail.python.org/pipermail/python-dev/2003-May/035907.html and >> http://twistedmatrix.com/pipermail/twisted-python/2003-June/004339.

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-05 Thread Antoine Pitrou
On Thu, 5 Jan 2012 15:26:27 +1100 Andrew Bennetts wrote: > > I don't think that's news either. > http://mail.python.org/pipermail/python-dev/2003-May/035907.html and > http://twistedmatrix.com/pipermail/twisted-python/2003-June/004339.html for > instance show that in 2003 it was clearly known to

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-04 Thread Andrew Bennetts
On Wed, Jan 04, 2012 at 11:55:13AM +0100, Antoine Pitrou wrote: > On Wed, 4 Jan 2012 09:59:15 +0200 > Maciej Fijalkowski wrote: > > > > Is it *really* a security issue? We knew all along that dicts are > > O(n^2) in worst case scenario, how is this suddenly a security > > problem? > > Because it

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-04 Thread Eric Snow
On Wed, Jan 4, 2012 at 12:59 AM, Maciej Fijalkowski wrote: > On Wed, Jan 4, 2012 at 12:02 AM, Bill Janssen wrote: >> Christian Heimes wrote: >> >>> Am 29.12.2011 12:13, schrieb Mark Shannon: >>> > The attack relies on being able to predict the hash value for a given >>> > string. Randomising the

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-04 Thread Christian Heimes
Am 04.01.2012 08:59, schrieb Maciej Fijalkowski: > Is it *really* a security issue? We knew all along that dicts are > O(n^2) in worst case scenario, how is this suddenly a security > problem? For example Microsoft has released an extraordinary and unscheduled security patch for the issue between

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-04 Thread Antoine Pitrou
On Wed, 4 Jan 2012 09:59:15 +0200 Maciej Fijalkowski wrote: > > Is it *really* a security issue? We knew all along that dicts are > O(n^2) in worst case scenario, how is this suddenly a security > problem? Because it has been shown to be exploitable for malicious purposes? Regards Antoine. _

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-04 Thread Maciej Fijalkowski
On Wed, Jan 4, 2012 at 12:02 AM, Bill Janssen wrote: > Christian Heimes wrote: > >> Am 29.12.2011 12:13, schrieb Mark Shannon: >> > The attack relies on being able to predict the hash value for a given >> > string. Randomising the string hash function is quite straightforward. >> > There is no ne

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-03 Thread Terry Reedy
On 1/3/2012 5:02 PM, Bill Janssen wrote: Software that depends on an undefined hash function for synchronization and persistence deserves to break, IMO. There are plenty of well-defined hash functions available for this purpose. The doc for id() now says "This is an integer which is guarantee

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-03 Thread Bill Janssen
Christian Heimes wrote: > Am 29.12.2011 12:13, schrieb Mark Shannon: > > The attack relies on being able to predict the hash value for a given > > string. Randomising the string hash function is quite straightforward. > > There is no need to change the dictionary code. > > > > A possible (*untes

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-03 Thread Barry Warsaw
On Dec 31, 2011, at 04:56 PM, Guido van Rossum wrote: >Is there a tracker issue yet? The discussion should probably move there. I think the answer to this was "no"... until now. http://bugs.python.org/issue13703 Proposed patches should be linked to this issue now. Please nosy yourself if you w

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Barry Warsaw
On Jan 02, 2012, at 06:38 PM, Georg Brandl wrote: >I wouldn't expect too much -- they seem rather keen on cheap laughs: > >http://twitter.com/#!/bk3n/status/152068096448921600/photo/1/large Heh, so yeah, it won't be me contacting them. -Barry ___ Pytho

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Alexey Borzenkov
On Mon, Jan 2, 2012 at 10:53 PM, Alexey Borzenkov wrote: > On Mon, Jan 2, 2012 at 7:18 PM, Christian Heimes wrote: >> Am 02.01.2012 06:55, schrieb Paul McMillan: >>> I think Ruby uses FNV-1 with a salt, making it less vulnerable to >>> this. FNV is otherwise similar to our existing hash function.

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Alexey Borzenkov
On Mon, Jan 2, 2012 at 7:18 PM, Christian Heimes wrote: > Am 02.01.2012 06:55, schrieb Paul McMillan: >> I think Ruby uses FNV-1 with a salt, making it less vulnerable to >> this. FNV is otherwise similar to our existing hash function. >> >> For the record, cryptographically strong hash functions

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Guido van Rossum
On Mon, Jan 2, 2012 at 7:47 AM, Christian Heimes wrote: > Am 01.01.2012 19:45, schrieb Terry Reedy: > > On 1/1/2012 10:13 AM, Guido van Rossum wrote: > >> PS. Is the collision-generator used in the attack code open source? > > > > As I posted before, Alexander Klink and Julian Wälde gave their pr

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Georg Brandl
On 01/02/2012 04:47 PM, Christian Heimes wrote: > Am 01.01.2012 19:45, schrieb Terry Reedy: >> On 1/1/2012 10:13 AM, Guido van Rossum wrote: >>> PS. Is the collision-generator used in the attack code open source? >> >> As I posted before, Alexander Klink and Julian Wälde gave their project >> ema

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Christian Heimes
Am 01.01.2012 19:45, schrieb Terry Reedy: > On 1/1/2012 10:13 AM, Guido van Rossum wrote: >> PS. Is the collision-generator used in the attack code open source? > > As I posted before, Alexander Klink and Julian Wälde gave their project > email as hash...@alech.de. Since they indicated disappoint

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Anders J. Munch
> On 1/1/2012 12:28 PM, Christian Heimes wrote: > I understood Alexander Klink and Julian Wälde, hash...@alech.de, as > saying that they consider that using a random non-zero start value is > sufficient to make the hash non-vulnerable. Sufficient against their current attack. But will it last? F

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Christian Heimes
Am 02.01.2012 06:55, schrieb Paul McMillan: > I think Ruby uses FNV-1 with a salt, making it less vulnerable to > this. FNV is otherwise similar to our existing hash function. > > For the record, cryptographically strong hash functions are in the > neighborhood of 400% slower than our existing has

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Stephen J. Turnbull
Christian Heimes writes: > Am 31.12.2011 13:03, schrieb Stephen J. Turnbull: > > I don't know the implementation issues well enough to claim it is a > > solution, but this hasn't been mentioned before AFAICS: > > > > While the dictionary probe has to start with a hash for backward > > compat

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Antoine Pitrou
On Sun, 1 Jan 2012 21:55:52 -0800 Paul McMillan wrote: > > This is similar to the change proposed by Christian Heimes. > > Most importantly, I moved the xor with r[x % len_r] down a line. > Before, it wasn't being applied to the last character. Shouldn't it be r[i % len(r)] instead? (refer to y

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-02 Thread Terry Reedy
On 1/2/2012 12:55 AM, Paul McMillan wrote: Terry Reedy said: I understood Alexander Klink and Julian Wälde, hash...@alech.de, as saying that they consider that using a random non-zero start value is sufficient to make the hash non-vulnerable. I've been talking to them. They're happy to look at

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Paul McMillan
I fixed a couple things in my proposed algorithm: https://gist.github.com/0a91e52efa74f61858b5 I had a typo, and used 21 instead of 12 for the size multiplier. We definitely don't need 2MB random data. The initialization of r was broken. Now it is an array of ints, so there's no conversion when i

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Paul McMillan
> Different concern. What if someone were to have code implementing an > external, persistent hash table, using Python's hash() function? They might > have a way to rehash everything when a new version of Python comes along, > but they would not be happy if hash() is different in each process. I >

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Paul McMillan
> But how about precomputing the intermediate value (x)? The hash is (mostly) > doing x = f(x, c) for each c in the input. That's a fair point. If we go down that avenue, I think simply choosing a random fixed starting value for x is the correct choice, rather than computing an intermediate value.

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Terry Reedy
On 1/1/2012 12:28 PM, Christian Heimes wrote: Am 01.01.2012 17:54, schrieb Antoine Pitrou: I don't understand. FNV-1 multiplies the current running result with a prime and then xors it with the following byte. This is also what we do. (I'm assuming 103 is prime) There must be a major diffe

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Terry Reedy
On 1/1/2012 10:13 AM, Guido van Rossum wrote: PS. Is the collision-generator used in the attack code open source? As I posted before, Alexander Klink and Julian Wälde gave their project email as hash...@alech.de. Since they indicated disappointment in not hearing from Python, I presume they w

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Victor Stinner
Le 01/01/2012 04:29, Paul McMillan a écrit : This is incorrect. Once an attacker has guessed the random seed, any operation which reveals the ordering of hashed objects can be used to verify the answer. JSON responses would be ideal. In fact, an attacker can do a brute-force attack of the random

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Christian Heimes
Am 01.01.2012 01:11, schrieb Guido van Rossum: > FWIW I managed to build Python 2.6, and a trivial mutation of the > string/unicode hash function (add 1 initially) made only three tests > fail; test_symtable and test_json both have a dependency on dictionary > order, test_ctypes I can't quite figur

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Christian Heimes
Am 01.01.2012 17:54, schrieb Antoine Pitrou: > I don't understand. FNV-1 multiplies the current running result with a > prime and then xors it with the following byte. This is also what we do. > (I'm assuming 103 is prime) There must be a major difference somewhere inside the algorithm. The ta

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Antoine Pitrou
Le dimanche 01 janvier 2012 à 17:34 +0100, Christian Heimes a écrit : > Am 01.01.2012 17:09, schrieb Antoine Pitrou: > > On Sun, 01 Jan 2012 16:48:32 +0100 > > Christian Heimes wrote: > >> The talkers claim and have shown that it's too easy to pre-calculate > >> collisions with hashing algorithms

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Christian Heimes
Am 01.01.2012 17:09, schrieb Antoine Pitrou: > On Sun, 01 Jan 2012 16:48:32 +0100 > Christian Heimes wrote: >> The talkers claim and have shown that it's too easy to pre-calculate >> collisions with hashing algorithms similar to DJBX33X / DJBX33A. It >> might be a good idea to change the hashing a

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Christian Heimes
Am 01.01.2012 06:57, schrieb Paul McMillan: > I agree that doing anything is better than doing nothing. If we use > the earlier suggestion and prepend everything with a fixed-length > seed, we need quite a bit of entropy (and so a fairly long string) to > make a lasting difference. Your code at ht

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Antoine Pitrou
On Sun, 01 Jan 2012 16:56:19 +0100 Christian Heimes wrote: > Am 01.01.2012 05:11, schrieb Antoine Pitrou: > > On Sat, 31 Dec 2011 16:56:00 -0700 > > Guido van Rossum wrote: > >> ISTM the only reasonable thing is to have a random seed picked very early > >> in the process, to be used to change the

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Antoine Pitrou
On Sun, 01 Jan 2012 16:48:32 +0100 Christian Heimes wrote: > The talkers claim and have shown that it's too easy to pre-calculate > collisions with hashing algorithms similar to DJBX33X / DJBX33A. It > might be a good idea to change the hashing algorithm, too. Paul as > listed some new algorithms.

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Christian Heimes
Am 01.01.2012 05:11, schrieb Antoine Pitrou: > On Sat, 31 Dec 2011 16:56:00 -0700 > Guido van Rossum wrote: >> ISTM the only reasonable thing is to have a random seed picked very early >> in the process, to be used to change the hash() function of >> str/bytes/unicode (in a way that they are still

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Christian Heimes
Am 01.01.2012 00:56, schrieb Guido van Rossum: > ISTM the only reasonable thing is to have a random seed picked very > early in the process, to be used to change the hash() function of > str/bytes/unicode (in a way that they are still compatible with each other). > > The seed should be unique per

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Christian Heimes
Am 31.12.2011 23:38, schrieb Terry Reedy: > On 12/31/2011 4:43 PM, PJ Eby wrote: > >> Here's an idea. Suppose we add a sys.hash_seed or some such, that's >> settable to an int, and defaults to whatever we're using now. Then >> programs that want a fix can just set it to a random number, > > I d

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Christian Heimes
Am 01.01.2012 16:13, schrieb Guido van Rossum: > Different concern. What if someone were to have code implementing an > external, persistent hash table, using Python's hash() function? They > might have a way to rehash everything when a new version of Python comes > along, but they would not be hap

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Guido van Rossum
PS. Is the collision-generator used in the attack code open source? -- --Guido van Rossum (python.org/~guido) ___ Python-Dev mailing list Python-Dev@python.org http://mail.python.org/mailman/listinfo/python-dev Unsubscribe: http://mail.python.org/mailm

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Guido van Rossum
Different concern. What if someone were to have code implementing an external, persistent hash table, using Python's hash() function? They might have a way to rehash everything when a new version of Python comes along, but they would not be happy if hash() is different in each process. I somehow va

Re: [Python-Dev] Hash collision security issue (now public)

2012-01-01 Thread Guido van Rossum
On Sat, Dec 31, 2011 at 10:57 PM, Paul McMillan wrote: > > Still, it would represent an effort for the attacker of a much greater > > magnitude than the current attack. It's all a trade-off -- at some point > > it'll just be easier for the attacker to use some other vulnerability. > Also > > the

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread Paul McMillan
> Still, it would represent an effort for the attacker of a much greater > magnitude than the current attack. It's all a trade-off -- at some point > it'll just be easier for the attacker to use some other vulnerability. Also > the class of vulnerable servers would be greatly reduced. I agree that

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread Guido van Rossum
On Sat, Dec 31, 2011 at 8:29 PM, Paul McMillan wrote: > > I'm not too concerned about a 3rd party being able to guess the random > seed > > -- this would require much more effort on their part, since they would > have > > to generate a new set of colliding keys each time they think they have > >

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread Guido van Rossum
On Sat, Dec 31, 2011 at 9:11 PM, Antoine Pitrou wrote: > On Sat, 31 Dec 2011 16:56:00 -0700 > Guido van Rossum wrote: > > ISTM the only reasonable thing is to have a random seed picked very early > > in the process, to be used to change the hash() function of > > str/bytes/unicode (in a way that

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread Antoine Pitrou
On Sat, 31 Dec 2011 16:56:00 -0700 Guido van Rossum wrote: > ISTM the only reasonable thing is to have a random seed picked very early > in the process, to be used to change the hash() function of > str/bytes/unicode (in a way that they are still compatible with each other). Do str and bytes stil

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread martin
(Well, technically, you could use trees or some other O log n data structure as a fallback once you have too many collisions, for some value of "too many". Seems a bit wasteful for the purpose, though.) I don't think that would be wasteful. You wouldn't just use the tree for the case of too man

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread Paul McMillan
> I'm not too concerned about a 3rd party being able to guess the random seed > -- this would require much more effort on their part, since they would have > to generate a new set of colliding keys each time they think they have > guessed the hash This is incorrect. Once an attacker has guessed th

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread Guido van Rossum
On Sat, Dec 31, 2011 at 4:56 PM, Guido van Rossum wrote: > PS. I would propose a specific fix but I can't seem to build a working > CPython from the trunk on my laptop (OS X 10.6, Xcode 4.1). I get this > error late in the build: > > ./python.exe -SE -m sysconfig --generate-posix-vars > Fatal Pyt

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread Guido van Rossum
ISTM the only reasonable thing is to have a random seed picked very early in the process, to be used to change the hash() function of str/bytes/unicode (in a way that they are still compatible with each other). The seed should be unique per process except it should survive fork() (but not exec()).

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread Terry Reedy
On 12/31/2011 4:43 PM, PJ Eby wrote: Here's an idea. Suppose we add a sys.hash_seed or some such, that's settable to an int, and defaults to whatever we're using now. Then programs that want a fix can just set it to a random number, I do not think we can allow that to change once there are h

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread PJ Eby
On Sat, Dec 31, 2011 at 4:04 PM, Jeffrey Yasskin wrote: > Hash functions are already unstable across Python versions. Making > them unstable across interpreter processes (multiprocessing doesn't > share dicts, right?) doesn't sound like a big additional problem. > Users who want a distributed has

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread Jeffrey Yasskin
On Wed, Dec 28, 2011 at 5:37 PM, Jesse Noller wrote: > > > On Wednesday, December 28, 2011 at 8:28 PM, Michael Foord wrote: > >> Hello all, >> >> A paper (well, presentation) has been published highlighting security >> problems with the hashing algorithm (exploiting collisions) in many >> progra

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread PJ Eby
On Sat, Dec 31, 2011 at 7:03 AM, Stephen J. Turnbull wrote: > While the dictionary probe has to start with a hash for backward > compatibility reasons, is there a reason the overflow strategy for > insertion has to be buckets containing lists? How about > double-hashing, etc? > This won't help,

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread martin
Zitat von Victor Stinner : The current implementation of dict uses a linked-list. [...] Tell me if I am wrong. At least with this statement, you are wrong: the current implementation does *not* use a linked-list. Instead, it uses open addressing. Regards, Martin

Re: [Python-Dev] Hash collision security issue (now public)

2011-12-31 Thread Christian Heimes
Am 31.12.2011 13:03, schrieb Stephen J. Turnbull: > I don't know the implementation issues well enough to claim it is a > solution, but this hasn't been mentioned before AFAICS: > > While the dictionary probe has to start with a hash for backward > compatibility reasons, is there a reason the over

  1   2   >