Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-20 Thread Amaury Forgeot d'Arc
2012/1/19 Gregory P. Smith > str[-1] is not likely to work if you want to maintain ABI compatibility. > Appending it to the data after the terminating \0 is more likely to be > possible, but if there is any possibility that existing compiled extension > modules have somehow inlined code to do al

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-20 Thread Hrvoje Niksic
On 01/18/2012 06:55 PM, "Martin v. Löwis" wrote: I was thinking about adding the field at the end, Will this make all strings larger, or only those that create dict collisions? Making all strings larger to fix this issue sounds like a really bad idea. Also, would it be acceptable to simply

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-19 Thread Gregory P. Smith
On Wed, Jan 18, 2012 at 9:55 AM, "Martin v. Löwis" wrote: > On 18.01.2012 17:01, PJ Eby wrote: > > On Tue, Jan 17, 2012 at 7:58 PM, "Martin v. Löwis" > > wrote: > > > > On 17.01.2012 22:26, Antoine Pitrou wrote: > > > Only 2 bits are used in ob_sstate, meani

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-19 Thread PJ Eby
On Jan 18, 2012 12:55 PM, Martin v. Löwis wrote: > > On 18.01.2012 17:01, PJ Eby wrote: > > On Tue, Jan 17, 2012 at 7:58 PM, "Martin v. Löwis" > > wrote: > > > > On 17.01.2012 22:26, Antoine Pitrou wrote: > > > Only 2 bits are used in ob_sstate, meaning 30 a

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-18 Thread Glenn Linderman
On 1/18/2012 9:52 AM, "Martin v. Löwis" wrote: I've been seriously considering implementing a balanced tree inside the dict (again for string-only dicts, as ordering can't be guaranteed otherwise). However, this would be a lot of code for a security fix. It *would* solve the issue for good, thoug

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-18 Thread Martin v. Löwis
On 18.01.2012 13:30, Barry Warsaw wrote: > On Jan 18, 2012, at 08:19 AM, Martin v. Löwis wrote: > >> My concern is not about breaking doctests: this proposal will also break >> them. My concern is about applications that assume that hash(s) is >> stable across runs, and we do have reports that i

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-18 Thread Martin v. Löwis
On 18.01.2012 17:01, PJ Eby wrote: > On Tue, Jan 17, 2012 at 7:58 PM, "Martin v. Löwis" > wrote: > > On 17.01.2012 22:26, Antoine Pitrou wrote: > > Only 2 bits are used in ob_sstate, meaning 30 are left. These 30 bits > > could cache a "hash perturbation

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-18 Thread PJ Eby
On Tue, Jan 17, 2012 at 7:58 PM, "Martin v. Löwis" wrote: > On 17.01.2012 22:26, Antoine Pitrou wrote: > > Only 2 bits are used in ob_sstate, meaning 30 are left. These 30 bits > > could cache a "hash perturbation" computed from the string and the > > random bits: > > > > - hash() would use ob_s
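
The bit budget mentioned in the quoted text (2 bits in use, 30 spare) can be illustrated with a small packing sketch. This is only an illustration in Python, not CPython's C code: the 32-bit field width, the choice of the low bits for the existing state, and the names `pack_sstate` and `unpack_sstate` are all assumptions made for the example.

```python
# Hypothetical layout (assumption): a 32-bit field whose low 2 bits keep
# the existing state and whose upper 30 bits cache a "hash perturbation".
STATE_BITS = 2
STATE_MASK = (1 << STATE_BITS) - 1        # 0b11
PERTURB_BITS = 30
PERTURB_MASK = (1 << PERTURB_BITS) - 1    # 30 one-bits


def pack_sstate(state: int, perturbation: int) -> int:
    """Combine the 2-bit state and a 30-bit perturbation into one field."""
    return ((perturbation & PERTURB_MASK) << STATE_BITS) | (state & STATE_MASK)


def unpack_sstate(field: int) -> tuple[int, int]:
    """Split the field back into (state, perturbation)."""
    return field & STATE_MASK, (field >> STATE_BITS) & PERTURB_MASK
```

Under these assumptions the existing 2-bit value is readable with the same mask as before, and a perturbation wider than 30 bits is silently truncated by `PERTURB_MASK`.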

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-18 Thread Barry Warsaw
On Jan 18, 2012, at 08:19 AM, Martin v. Löwis wrote: >My concern is not about breaking doctests: this proposal will also break >them. My concern is about applications that assume that hash(s) is >stable across runs, and we do have reports that it will break >applications. I am a proponent of doct

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-17 Thread Martin v. Löwis
> +1 Absolutely. We can and should make 3.3 change hashes across runs > (behavior that can be disabled via a flag or environment variable). > > I think the issue of doctests and such breaking even in 2.7 due to hash > order changes is being overblown. Code like that already needs to > fix

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-17 Thread Gregory P. Smith
On Tue, Jan 17, 2012 at 12:59 PM, "Martin v. Löwis" wrote: > I'd like to propose a different approach to seeding the string hashes: > only do so for dictionaries involving only strings, and leave the > tp_hash slot of strings unchanged. > > Each string would get two hashes: the "public" hash, whic

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-17 Thread Martin v. Löwis
On 17.01.2012 22:26, Antoine Pitrou wrote: > On Tue, 17 Jan 2012 21:59:28 +0100 > "Martin v. Löwis" wrote: >> I'd like to propose a different approach to seeding the string hashes: >> only do so for dictionaries involving only strings, and leave the >> tp_hash slot of strings unchanged. > > I t

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-17 Thread martin
Quoting Victor Stinner: Each string would get two hashes: the "public" hash, which is constant across runs and bugfix releases, and the dict-hash, which is only used by the dictionary implementation, and only if all keys to the dict are strings. The distinction between secret (private, sec

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-17 Thread Victor Stinner
> Each string would get two hashes: the "public" hash, which is constant > across runs and bugfix releases, and the dict-hash, which is only used > by the dictionary implementation, and only if all keys to the dict are > strings. The distinction between secret (private, secure) and "public" hash (

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-17 Thread Victor Stinner
> There is a simpler solution: > > bucket_index = (hash(str) ^ secret) & DICT_MASK. Oops, hash^secret doesn't add any security. Victor
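
The retraction can be checked with a short sketch: XOR with a fixed secret is a bijection on the masked bits, so two hashes that share a bucket without the secret still share one with it. The table size and the helper names below are assumptions chosen for illustration.

```python
from itertools import combinations

DICT_MASK = 0b111  # hypothetical 8-slot table


def bucket_index(h: int, secret: int) -> int:
    """The bucket formula from the message, applied to a precomputed hash."""
    return (h ^ secret) & DICT_MASK


def colliding_pairs(hashes, secret: int) -> int:
    """Count pairs of hashes that land in the same bucket."""
    return sum(
        1
        for a, b in combinations(hashes, 2)
        if bucket_index(a, secret) == bucket_index(b, secret)
    )


# Hashes an attacker chose to collide: all share the same low 3 bits.
attack = [0, 8, 16, 24]
```

For every value of `secret`, `colliding_pairs(attack, secret)` stays at 6, the same as with no secret at all: the secret only renames the bucket, it does not spread the keys.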

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-17 Thread Victor Stinner
2012/1/17 "Martin v. Löwis" : > I'd like to propose a different approach to seeding the string hashes: > only do so for dictionaries involving only strings, and leave the > tp_hash slot of strings unchanged. The real problem is in dict (or any structure using a hash table), so if it is possible,

Re: [Python-Dev] Hashing proposal: change only string-only dicts

2012-01-17 Thread Antoine Pitrou
On Tue, 17 Jan 2012 21:59:28 +0100 "Martin v. Löwis" wrote: > I'd like to propose a different approach to seeding the string hashes: > only do so for dictionaries involving only strings, and leave the > tp_hash slot of strings unchanged. I think Python 3 would be better with a clean fix (all hash

[Python-Dev] Hashing proposal: change only string-only dicts

2012-01-17 Thread Martin v. Löwis
I'd like to propose a different approach to seeding the string hashes: only do so for dictionaries involving only strings, and leave the tp_hash slot of strings unchanged. Each string would get two hashes: the "public" hash, which is constant across runs and bugfix releases, and the dict-hash, whi
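
A rough Python sketch of the two-hash idea described above, under stated assumptions: the real proposal concerns CPython's C-level dict and string objects, and the functions here (`public_hash`, `dict_hash`, `bucket_index`) as well as the choice of CRC32 and keyed BLAKE2 are stand-ins invented for illustration, not what the proposal specifies.

```python
import hashlib
import os
import zlib

# Hypothetical per-process random seed (assumption for this sketch).
_SECRET = os.urandom(16)


def public_hash(s: str) -> int:
    """Stand-in for the unseeded string hash: same value on every run."""
    return zlib.crc32(s.encode("utf-8"))


def dict_hash(s: str) -> int:
    """Stand-in for the seeded hash used only inside string-only dicts."""
    digest = hashlib.blake2b(
        s.encode("utf-8"), key=_SECRET, digest_size=8
    ).digest()
    return int.from_bytes(digest, "little")


def bucket_index(key, all_keys_are_str: bool, mask: int) -> int:
    """Use the seeded hash only while the table holds nothing but strings."""
    if all_keys_are_str and isinstance(key, str):
        return dict_hash(key) & mask
    # Mixed-key tables fall back to the ordinary, stable hash.
    h = public_hash(key) if isinstance(key, str) else hash(key)
    return h & mask
```

In this sketch, code that relies on the public hash being stable across runs is unaffected, while the bucket layout of a string-only table depends on `_SECRET` and so changes from one process to the next.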