Re: [Python-Dev] Counting collisions for the win

2012-01-24 Thread Frank Sievertsen


Interesting idea, and I see it would avoid conversions.  What happens 
if the data are also removed from the hash?  So you enter 20 
colliding keys, then 20 more that get randomized, then delete 18 
of the first 20.  Can you still find the second 20 keys? Does it take 
two sets of probes, somehow?



That's no problem, because the dict doesn't really free a slot; it
replaces the value with a dummy value.

These slots are later reused for new entries, or the whole dict is
recreated and resized.
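
Here's a rough, self-contained sketch of the idea (plain Python, not
CPython's actual dict code): deleted slots are marked with a dummy so that
probing can continue past them, and dummy slots can be reused on later
insertions.

    # Illustrative open-addressing table (NOT CPython's dict implementation).
    # Deleting a key leaves a dummy in the slot; lookups only stop at slots
    # that were never used, so entries inserted after a collision stay reachable.

    _EMPTY = None          # slot never used
    _DUMMY = object()      # slot used once, key since deleted

    class TinyDict:
        def __init__(self, size=32):
            # The sketch never resizes, so it assumes fewer than `size` live entries.
            self.slots = [_EMPTY] * size

        def _probe(self, key):
            i = hash(key) % len(self.slots)
            while True:                      # simple linear probing, for clarity
                yield i
                i = (i + 1) % len(self.slots)

        def set(self, key, value):
            reusable = None
            for i in self._probe(key):
                slot = self.slots[i]
                if slot is _EMPTY:
                    # Key not present: reuse the first dummy we passed, if any.
                    self.slots[reusable if reusable is not None else i] = (key, value)
                    return
                if slot is _DUMMY:
                    if reusable is None:
                        reusable = i
                elif slot[0] == key:
                    self.slots[i] = (key, value)
                    return

        def get(self, key):
            for i in self._probe(key):
                slot = self.slots[i]
                if slot is _EMPTY:
                    raise KeyError(key)      # only a never-used slot ends the search
                if slot is not _DUMMY and slot[0] == key:
                    return slot[1]

        def delete(self, key):
            for i in self._probe(key):
                slot = self.slots[i]
                if slot is _EMPTY:
                    raise KeyError(key)
                if slot is not _DUMMY and slot[0] == key:
                    self.slots[i] = _DUMMY   # keep the probe chain intact
                    return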

Frank


[Python-Dev] Sprinting at PyCon US

2012-01-24 Thread Brett Cannon
I went ahead and signed us up as usual:
https://us.pycon.org/2012/community/sprints/projects/ . I listed myself as
the leader, but I will only be at the sprints one full day and whatever
part of Tuesday I can fit in before flying out to Toronto (which is
probably not much thanks to the timezone difference). So if someone wants
to be the official leader who will be there longer, feel free to take me off
and put yourself in (and you don't need to ask me beforehand).


Re: [Python-Dev] devguide: Use -j0 to maximimze parallel execution.

2012-01-24 Thread Georg Brandl
On 24.01.2012 18:58, brett.cannon wrote:
> http://hg.python.org/devguide/rev/a34e4a6b89dc
> changeset:   489:a34e4a6b89dc
> user:Brett Cannon 
> date:Tue Jan 24 12:58:01 2012 -0500
> summary:
>   Use -j0 to maximimze parallel execution.
> 
> files:
>   runtests.rst |  2 +-
>   1 files changed, 1 insertions(+), 1 deletions(-)
> 
> 
> diff --git a/runtests.rst b/runtests.rst
> --- a/runtests.rst
> +++ b/runtests.rst
> @@ -41,7 +41,7 @@
>  If you have a multi-core or multi-CPU machine, you can enable parallel 
> testing
>  using several Python processes so as to speed up things::
>  
> -   ./python -m test -j2
> +   ./python -m test -j0

That only works on 3.3 though...

Georg




Re: [Python-Dev] devguide: Use -j0 to maximimze parallel execution.

2012-01-24 Thread Brett Cannon
On Tue, Jan 24, 2012 at 13:52, Georg Brandl  wrote:

> On 24.01.2012 18:58, brett.cannon wrote:
> > http://hg.python.org/devguide/rev/a34e4a6b89dc
> > changeset:   489:a34e4a6b89dc
> > user:Brett Cannon 
> > date:Tue Jan 24 12:58:01 2012 -0500
> > summary:
> >   Use -j0 to maximimze parallel execution.
> >
> > files:
> >   runtests.rst |  2 +-
> >   1 files changed, 1 insertions(+), 1 deletions(-)
> >
> >
> > diff --git a/runtests.rst b/runtests.rst
> > --- a/runtests.rst
> > +++ b/runtests.rst
> > @@ -41,7 +41,7 @@
> >  If you have a multi-core or multi-CPU machine, you can enable parallel
> testing
> >  using several Python processes so as to speed up things::
> >
> > -   ./python -m test -j2
> > +   ./python -m test -j0
>
> That only works on 3.3 though...
>

Bugger. I will add a note.


[Python-Dev] Packaging and setuptools compatibility

2012-01-24 Thread Alexis Métaireau

Hi folks,

I have had this in mind for a long time, but I hadn't talked about it 
on this list; I was only writing on distutils@ or the other list we had for 
distutils2 (the fellowship of packaging).


AFAIK, we're almost good on packaging in Python 3.3, but there is 
still something that keeps bugging me. What we've done (I worked 
especially on this bit) is to provide a compatibility layer for 
distributions packaged using setuptools/distribute. What it does, 
basically, is install things using setuptools or distribute (whichever 
is present on the system) and then convert the metadata to the new 
format described in PEP 345.


A few things are not handled yet regarding setuptools: entry points and 
namespaces. I would especially like to talk about entry points here.


Entry points are basically a plugin system: they store information 
in the metadata and retrieve it when it's needed. The problem 
with this, as with anything that tries to get information from metadata, 
is that we need to parse the metadata of all the installed 
distributions (so it's O(N)).
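
For anyone who hasn't used them, it roughly looks like this (the project
name, group and module are invented for the example); the consuming loop has
to scan the metadata of every installed distribution, which is where the
O(N) comes from:

    # setup.py of a hypothetical plugin-providing project:
    from setuptools import setup

    setup(
        name="example-plugin",
        version="1.0",
        py_modules=["example_plugin"],
        entry_points={
            # group name -> "name = module:attribute" strings, stored as metadata
            "example.app.plugins": [
                "greeter = example_plugin:Greeter",
            ],
        },
    )

    # Elsewhere, in the consuming application: pkg_resources walks the
    # metadata of every installed distribution to find members of the group.
    import pkg_resources

    for entry_point in pkg_resources.iter_entry_points("example.app.plugins"):
        plugin = entry_point.load()   # imports example_plugin and returns Greeter
        print(entry_point.name, plugin)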


I'm wondering whether we should support that (a way to have plugins) in the 
new packaging thing or not. If not, this means we should come up with 
another solution to support it outside of packaging (maybe in 
distribute). If yes, then we should design it, and probably make it a 
sub-part of packaging.


What are your opinions on that? Should we do it or not? And if yes, 
what's the way to go?


-- Alexis


Re: [Python-Dev] Packaging and setuptools compatibility

2012-01-24 Thread Glyph Lefkowitz
On Jan 24, 2012, at 12:54 PM, Alexis Métaireau wrote:

> I'm wondering whether we should support that (a way to have plugins) in the new 
> packaging thing or not. If not, this means we should come up with another 
> solution to support it outside of packaging (maybe in distribute). If yes, 
> then we should design it, and probably make it a sub-part of packaging.

First, my interest: Twisted has its own plugin system.  I would like this to 
continue to work in the future.

I do not believe that packaging should support plugins directly.  Run-time 
metadata is not the packaging system's job.  However, the packaging system does 
need to provide some guarantees about how to install and update data at 
installation (and post-installation) time so that databases of plugin metadata 
may be kept up to date.  Basically, packaging's job is constructing explicitly 
declared parallels between your development environment and your deployment 
environment.

Some such databases are outside of Python entirely (for example, you might 
think of /etc/init.d as such a database), so even if you don't care about the 
future of Twisted's weirdo plugin system, it would be nice for this to be 
supported.

In other words, packaging should have a meta-plugin system: a way for a plugin 
system to register itself and provide an API for things to install their 
metadata, and a way to query the packaging module about the way that a Python 
package is installed so that it can put things near to it in an appropriate 
way.  (Keep in mind that "near to it" may mean in a filesystem directory, or a 
zip file, or stuffed inside a bundle or executable.)
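
To make that concrete, here's one purely hypothetical shape such an API could
take; none of these names exist in packaging, they're only meant to
illustrate the division of labour:

    # Hypothetical sketch only: nothing below is a real packaging API.
    class PluginDatabase:
        """Implemented by a plugin system (Twisted plugins, entry points,
        or even something external like /etc/init.d)."""

        def record_installed(self, distribution, files):
            """Called by the installer after `distribution` has installed `files`."""

        def record_removed(self, distribution, files):
            """Called by the installer before `files` are uninstalled."""

    _databases = []

    def register_plugin_database(db):
        # A plugin system registers itself once; the installer then invokes
        # the hooks above at post-install and pre-uninstall time.
        _databases.append(db)

    def storage_location(distribution):
        """Answer 'where is "near" this installed distribution?' so a plugin
        system can put its metadata in an appropriate place -- a directory,
        a zip file, or inside a bundle, depending on how the install was done."""
        raise NotImplementedError  # installer-specific in this sketch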

In my design of Twisted's plugin system, we used PEP 302 as this sort of 
meta-standard, and (modulo certain bugs in easy_install and pip, most of which 
are apparently getting fixed in pip pretty soon) it worked out reasonably well. 
 The big missing pieces are post-install and post-uninstall hooks.  If we had 
those, translating to "native" packages for Twisted (and for things that use 
it) could be made totally automatic.
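
For context, consuming a Twisted plugin looks roughly like this (simplified;
the greeter interface is invented for the example). Discovery walks importers
via the PEP 302 APIs and keeps a cached index of the plugin modules, which is
exactly the kind of data a post-install hook could regenerate:

    # Contents of a module that would be dropped into the twisted/plugins/
    # directory:
    from zope.interface import Interface, implementer
    from twisted.plugin import IPlugin

    class IGreeter(Interface):
        """A made-up plugin interface."""

    @implementer(IPlugin, IGreeter)
    class EnglishGreeter(object):
        name = "english"

    greeter = EnglishGreeter()   # plugins are module-level instances

    # Consumer, anywhere else in the program: discovery walks sys.path with
    # the PEP 302 importer machinery.
    from twisted.plugin import getPlugins

    for plugin in getPlugins(IGreeter):
        print(plugin.name)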

-glyph


[Python-Dev] Status of Mac buildbots

2012-01-24 Thread Nadeem Vawda
Hi all,

I've noticed that most of the Mac buildbots have been offline for a while:

* http://www.python.org/dev/buildbot/all/buildslaves/parc-snowleopard-1
* http://www.python.org/dev/buildbot/all/buildslaves/parc-tiger-1
* http://www.python.org/dev/buildbot/all/buildslaves/parc-leopard-1

Does anyone know what the status of these bots is? Are they
permanently down, or just temporarily inaccessible?

Cheers,
Nadeem


Re: [Python-Dev] Counting collisions w/ no need for a fatal exception

2012-01-24 Thread Gregory P. Smith
On Sun, Jan 22, 2012 at 10:41 PM, Tim Delaney wrote:
> On 23 January 2012 16:49, Lennart Regebro  wrote:
>>
>> On Mon, Jan 23, 2012 at 06:02, Paul McMillan  wrote:
>> >> We may use a different salt per dictionary.
>> >
>> > If we're willing to re-hash everything on a per-dictionary basis. That
>> > doesn't seem reasonable given our existing usage.
>>
>> Well, if we get crazy amounts of collisions, re-hashing with a new
>> salt to get rid of those collisions seems quite reasonable to me...
>
>
> Actually, this looks like it has the seed of a solution in it. I haven't
> scrutinised the following beyond "it sounds like it could work" - it could
> well contain nasty flaws.
>
> Assumption: We only get an excessive number of collisions during an attack
> (directly or indirectly).
> Assumption: Introducing a salt into hashes will change those hashes
> sufficiently to mitigate the attack (all discussion of randomising hashes
> makes this assumption).
>
> 1. Keep the current hashing (for all dictionaries) i.e. just using
> hash(key).
>
> 2. Count collisions.
>
> 3. If any key hits X collisions change that dictionary to use a random salt
> for hashes (at least for str and unicode keys). This salt would be
> remembered for the dictionary.
>
> Consequence: The dictionary would need to be rebuilt when an attack was
> detected.
> Consequence: Hash caching would no longer occur for this dictionary, making
> most operations more expensive.
> Consequence: Anything relying on the iteration order of a dictionary which
> has suffered excessive conflicts would fail.

+1

I like this!  The dictionary would still be O(n) but the constant cost
in front of that just went up.  When you are dealing with keys coming
in from outside of the process, those are unlikely to already have any
hash values so the constant cost at insertion time has really not
changed at all because they would need hashing anyways. Their cost at
non-iteration lookup time will be a constant factor greater but I do
not see that as being a problem given that known keys being looked up
in a

This approach also allows for the dictionary hashing mode switch to
occur after a lower number of collisions than the previous
investigations into raising a MemoryError or similar were asking for
(because they wanted to avoid false hard failures).  It prevents that
case from breaking in favor of a brief performance hiccup.

I would *combine* this with a per process/interpreter-instance seed in
3.3 and later for added impact (less need for this code path to ever
be triggered).  For the purposes of backporting as a security fix,
that part would be disabled by default but #1-3 would be enabled by
default.

Question A: Does the dictionary get rebuilt -again- with a new
dict-salt if a large number of collisions occurs after a dict-salt has
already been established?

Question B: Is there a size of dictionary at which we refuse to
rebuild & rehash it because it would simply be too costly?  Obviously
if we lack the RAM to malloc a new table; when else?  Ever?

Suggestion: Would there be any benefit to making the number of
collisions threshold on when to rebuild & rehash a log function of the
dictionary's current size rather than a constant for all dicts?
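
A rough sketch of how #1-3 plus the log-scaled threshold might fit together
(illustrative only, not dict code; it linear-probes, never resizes, and
re-salts at most once):

    import math, os

    class CollisionCountingTable:
        def __init__(self, size=64):
            self.slots = [None] * size
            self.salt = 0                  # 0 == plain hash(key), the cached fast path

        def _threshold(self):
            # Scale the collision limit with the table size instead of using
            # one constant for every dict.
            return 8 + int(math.log(len(self.slots), 2))

        def _index(self, key):
            return (hash(key) ^ self.salt) % len(self.slots)

        def insert(self, key, value):
            collisions = 0
            i = self._index(key)
            while self.slots[i] is not None and self.slots[i][0] != key:
                collisions += 1
                if self.salt == 0 and collisions > self._threshold():
                    self._resalt()                   # switch to salted hashing...
                    return self.insert(key, value)   # ...and retry the insertion
                i = (i + 1) % len(self.slots)        # linear probing, for simplicity
            self.slots[i] = (key, value)

        def _resalt(self):
            # The rebuild step: rehash every existing key with a random salt.
            self.salt = int.from_bytes(os.urandom(8), "little") or 1
            old = [s for s in self.slots if s is not None]
            self.slots = [None] * len(self.slots)
            for k, v in old:
                self.insert(k, v)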

>
> 4. (Optional) in 3.3, provide a way to get a dictionary with random salt
> (i.e. not wait for attack).

I don't like #4 as a documented public API, as I'm not sure how well
that'd play with other VMs (I suppose they could ignore it), but it
would be useful for dict implementation testing purposes and easier
study of the behavior.  If this is added it should be a method on
the dict such as ._set_hash_salt() or something, and for testing
purposes it would be good to allow a dictionary to be queried to see
whether it is using its own salt or not (perhaps just
._get_hash_salt() returning non-zero means it is?)

-gps


[Python-Dev] io module types

2012-01-24 Thread Matt Joiner
Can calls to the C types in the io module be made into module lookups,
more akin to how it would work were it written in Python? The C
implementation of io_open invokes the C type objects for FileIO and
friends instead of looking them up on the io or _io modules. This
makes it difficult to subclass and/or modify the behaviour of those
classes from Python.

http://hg.python.org/cpython/file/0bec943f6778/Modules/_io/_iomodule.c#l413
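
A tiny script showing the problem (an illustration of the behaviour described
above; the subclass name is made up):

    import io

    class LoggingFileIO(io.FileIO):
        def read(self, *args):
            print("read called")
            return super().read(*args)

    io.FileIO = LoggingFileIO        # rebind the Python-visible name...

    f = io.open(__file__, "rb")
    print(type(f.raw))               # ...yet open() still built the original C
                                     # FileIO, because it holds a direct reference
                                     # to the C type instead of looking "FileIO"
                                     # up on the module.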


Re: [Python-Dev] Coroutines and PEP 380

2012-01-24 Thread Matt Joiner
After much consideration, and after playing with PEP 380, I've changed my
stance on this. Full-blown coroutines are the proper way forward.
greenlet doesn't cut it because the Python interpreter isn't aware of
the context switches: profiling, debugging and tracebacks are
completely broken by them. Stackless would need to be merged, and
that's clearly not going to happen.

I built a basic scheduler and had a go at "enhancing" the stdlib using
PEP 380; here are some examples making use of this style:
https://bitbucket.org/anacrolix/green380/src/8f7fdc20a8ce/examples
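
Not the linked code, but a minimal trampoline in the same PEP 380 style (so
it needs 3.3's yield from), to give the flavour without following the link;
the helper names are made up:

    from collections import deque

    def pause():
        # Hand control back to the scheduler; a real scheduler would accept
        # a reason (sleep time, awaited fd, ...) instead of a bare marker.
        yield "reschedule"

    def worker(name, steps):
        for n in range(steps):
            print(name, "step", n)
            yield from pause()          # PEP 380: delegate to a sub-coroutine

    class Scheduler:
        def __init__(self):
            self.ready = deque()

        def spawn(self, coro):
            self.ready.append(coro)

        def run(self):
            while self.ready:
                coro = self.ready.popleft()
                try:
                    coro.send(None)     # run the coroutine up to its next yield
                except StopIteration:
                    continue            # finished
                self.ready.append(coro) # round-robin rescheduling

    sched = Scheduler()
    sched.spawn(worker("a", 2))
    sched.spawn(worker("b", 3))
    sched.run()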

After realising it was a dead end, I read up on Mark's ideas; there's
some really good stuff in there:
http://www.dcs.gla.ac.uk/~marks/
http://hotpy.blogspot.com/

If someone can explain what's stopping real coroutines from getting into
Python (3.3), that would be great.