dicts,instances,containers, slotted instances, et cetera.

2009-01-28 Thread ocschwar
Hi, all.

I have an application that creates, manipulates, and finally
archives on disk 10^6 instances of an object that in CS/DB terms is
best described as a relation.

It has 8 members, all of them common Python datatypes. 6 of these are
set once and then not modified. 2 are modified around 4 times before
the instance's archiving. Large collections (of small lists) of these
objects are created, iterated through, and sorted using any and all of
the 8 members as sorting keys.

It neither has nor needs custom methods.

I used a simple dictionary to create the application prototype. Now I
need to speed things up.
I first tried changing to a new style class, with __slots__, __init__,
__getstate__ & __setstate__ (for pickling) and was shocked to see
things SLOW down compared to dictionaries.
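
A minimal sketch of the kind of slotted class described above. The class
name Relation and the field names a through h are placeholders, since the
post does not name the actual members, and the code is written for a
current Python 3 interpreter rather than the 2009-era one:

```python
import pickle


class Relation(object):
    """Hypothetical 8-member record; the field names are placeholders."""

    __slots__ = ("a", "b", "c", "d", "e", "f", "g", "h")

    def __init__(self, a, b, c, d, e, f, g, h):
        self.a = a
        self.b = b
        self.c = c
        self.d = d
        self.e = e
        self.f = f
        self.g = g
        self.h = h

    def __getstate__(self):
        # Pickle the slot values as a plain tuple, in slot order.
        return tuple(getattr(self, name) for name in self.__slots__)

    def __setstate__(self, state):
        # Restore each slot from the tuple produced by __getstate__.
        for name, value in zip(self.__slots__, state):
            setattr(self, name, value)


rec = Relation(1, "x", 2.5, None, (1, 2), "y", 0, 0)
rec.g += 1  # one of the two members that change after creation
restored = pickle.loads(pickle.dumps(rec, pickle.HIGHEST_PROTOCOL))
```

One possible contributor to the slowdown is that every instance now goes
through a Python-level __init__ call, which a dict literal avoids. That is
a guess to be checked by measurement, not a diagnosis.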

So of these options, where should I go first to satisfy my need for
speed?

0. Back to dict
1. old style class
2. new style class
3. new style class, with __slots__, with or without some nuance I'm
missing.
4. tuple, with constants to mark the indices
5. namedtuple
6. other...
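
For comparison, options 4 and 5 could look roughly like this. It is a
sketch only: the three fields (name, size, status) are made up for
illustration and stand in for the 8 real members.

```python
from collections import namedtuple

# Option 4: plain tuple, with module-level constants naming the indices.
# NAME, SIZE and STATUS are hypothetical field names.
NAME, SIZE, STATUS = 0, 1, 2

row = ("alpha", 10, "new")
# Tuples are immutable, so "modifying" a member means building a new tuple.
row = row[:STATUS] + ("done",) + row[STATUS + 1:]

# Option 5: namedtuple gives the same storage plus attribute-style access.
Record = namedtuple("Record", ["name", "size", "status"])

rec = Record("alpha", 10, "new")
# _replace returns a new Record with the given field changed.
rec = rec._replace(status="done")
```

Because both forms are immutable, the 2 members that change about 4 times
each cost a new tuple per change. Whether that outweighs cheaper reads
depends on the workload.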
--
http://mail.python.org/mailman/listinfo/python-list


Re: dicts,instances,containers, slotted instances, et cetera.

2009-01-28 Thread ocschwar
On Jan 28, 4:50 pm, Aaron Brady  wrote:
> On Jan 28, 2:38 pm, [email protected] wrote:
>
> Hello, quoting myself from another thread today:
>
> There is the 'shelve' module.  You could create a shelf that tells you
> the filename of the 5 other ones.  A million keys should be no
> problem, I guess.  (It's standard library.)  All your keys have to be
> strings, though, and all your values have to be pickleable.  If that's
> a problem, yes you will need ZODB or Django (I understand), or another
> relational DB.
>
> There is currently no way to store live objects.


The problem is NOT archiving these objects. That works fine.

It's the computations I'm using these things for that are slow, and
that failed to speed up using __slots__.

What I need is something that will speed up getattr() or its
equivalent, and to a lesser degree setattr() or its equivalent.
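
Since sorting on the members is part of the workload, one thing that may
be worth trying is operator.attrgetter / operator.itemgetter as the sort
key. In CPython both are implemented in C, so the key is fetched without a
Python-level function call per element. A sketch, with a hypothetical
two-field record standing in for the real one:

```python
from operator import attrgetter, itemgetter


class Rec(object):
    """Hypothetical slotted record with two of the members."""

    __slots__ = ("name", "size")

    def __init__(self, name, size):
        self.name = name
        self.size = size


objs = [Rec("b", 2), Rec("a", 3), Rec("c", 1)]

# Sort instances on one attribute, then on a pair of attributes.
by_size = sorted(objs, key=attrgetter("size"))
by_name_then_size = sorted(objs, key=attrgetter("name", "size"))

# The dict-based prototype can use itemgetter in the same way.
dicts = [{"name": "b", "size": 2}, {"name": "a", "size": 3}]
dicts_by_name = sorted(dicts, key=itemgetter("name"))
```

Whether this helps here is an assumption; it only addresses attribute
reads that happen inside sort keys.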


Re: dicts,instances,containers, slotted instances, et cetera.

2009-01-28 Thread ocschwar
On Jan 28, 5:21 pm, "Diez B. Roggisch"  wrote:
> [email protected] schrieb:
>
>
>
> > Hi, all.
>
> > I have an application that creates, manipulates, and finally
> > archives on disk 10^6 instances of an object that in CS/DB terms is
> > best described as a relation.
>
> > It has 8 members, all of them common Python datatypes. 6 of these are
> > set once and then not modified. 2 are modified around 4 times before
> > the instance's archiving. Large collections (of small lists) of these
> > objects are created, iterated through, and sorted using any and all of
> > the 8 members as sorting keys.
>
> > It neither has nor needs custom methods.
>
> > I used a simple dictionary to create the application prototype. Now I
> > need to speed things up.
> > I first tried changing to a new style class, with __slots__, __init__,
> > __getstate__ & __setstate__ (for pickling) and was shocked to see
> > things SLOW down compared to dictionaries.
>
> > So of these options, where should I go first to satisfy my need for
> > speed?
>
> > 0. Back to dict
> > 1. old style class
> > 2. new style class
> > 3. new style class, with __slots__, with or without some nuance I'm
> > missing.
> > 4. tuple, with constants to mark the indices
> > 5. namedtuple
> > 6. other...
>
> Use a database? Or *maybe* a C-extension wrapped by ctypes.
>
> Diez

I can't port the entire app to be a stored database procedure.

ctypes, maybe. I just find it odd that there's no quick answer on the
fastest way in Python to implement a mapping in this context.
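
Absent a quick answer, the candidates can at least be timed directly. A
rough harness using the standard timeit module follows; the two-field
records and the labels are illustrative only, and it measures a single
member read, not the whole application:

```python
import timeit
from collections import namedtuple


class Plain(object):
    def __init__(self, x, y):
        self.x = x
        self.y = y


class Slotted(object):
    __slots__ = ("x", "y")

    def __init__(self, x, y):
        self.x = x
        self.y = y


NT = namedtuple("NT", ["x", "y"])
X = 0  # index constant for the plain-tuple variant

# label -> (sample record, statement that reads one member from it)
candidates = {
    "dict": ({"x": 1, "y": 2}, "r['x']"),
    "plain class": (Plain(1, 2), "r.x"),
    "slotted class": (Slotted(1, 2), "r.x"),
    "tuple + index constant": ((1, 2), "r[X]"),
    "namedtuple": (NT(1, 2), "r.x"),
}


def measure(number=100000):
    """Return {label: seconds} for `number` reads of one member."""
    results = {}
    for label, (record, stmt) in candidates.items():
        results[label] = timeit.timeit(
            stmt, globals={"r": record, "X": X}, number=number
        )
    return results


if __name__ == "__main__":
    for label, seconds in sorted(measure().items(), key=lambda kv: kv[1]):
        print("%-24s %.4f s" % (label, seconds))
```

The ranking can differ between interpreter versions, so the numbers are
only meaningful for the Python that will actually run the application.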