Re: count section of data in list

2009-02-23 Thread S Arrowsmith
In article <3ed253bb-d6ec-4f47-af08-ad193e9c4...@h16g2000yqj.googlegroups.com>,
odeits   wrote:
>def count_consecutive(rows):
>    switch = 0
>    count = 0
>    for r in rows:
>        if r[-1] == switch:
>            count += 1
>        else:
>            switch = not switch
>            if count != 0:
>                yield count
>            count = 0
>    if count != 0:
>        yield count
>
>rows = [ ... ]
>
>for cnt in count_consecutive(rows):
>    print cnt

import itertools, operator

for k, g in itertools.groupby(rows, operator.itemgetter(3)):
    print len(list(g))
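
For anyone wanting to try the groupby approach on concrete data, here is a
minimal sketch. The sample rows and the helper name run_lengths are made up
for illustration; column 3 is assumed to hold the flag, as in the
itemgetter(3) call above.

```python
import itertools
import operator

# Hypothetical sample data: the flag being grouped on lives in column 3.
rows = [
    ("a", 1, 2, 0),
    ("b", 1, 2, 0),
    ("c", 1, 2, 1),
    ("d", 1, 2, 0),
    ("e", 1, 2, 0),
    ("f", 1, 2, 0),
]

def run_lengths(rows, column=3):
    """Return (value, run length) pairs for consecutive rows that share
    the same value in the given column."""
    key = operator.itemgetter(column)
    return [(k, len(list(g))) for k, g in itertools.groupby(rows, key)]

runs = run_lengths(rows)
print(runs)  # [(0, 2), (1, 1), (0, 3)]
```

Note that groupby only merges adjacent rows, so the flag value 0 shows up
twice here, once per run.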

-- 
\S

   under construction

--
http://mail.python.org/mailman/listinfo/python-list


Re: String Identity Test

2009-03-04 Thread S Arrowsmith
Avetis KAZARIAN   wrote:
>It seems that any strict ASCII alpha-numeric string is instantiated as
>an unique object, like a "singleton" ( a = "x" and b = "x" => a is b )
>and that any non strict ASCII alpha-numeric string is instantiated as
>a new object every time with a new id.

What no-one appears to have mentioned so far is that the purpose
of this implementation detail is to ensure that there is a single
instance of strings which are valid identifiers, so that you don't
go around creating and destroying string instances just to do an
attribute look-up on an object. A few strings which are not valid
as identifiers get swept up into this system:

>>> a = "1"
>>> b = "1"
>>> a is b
True

"Small" integers get a similar treatment:

>>> a = 256
>>> b = 256
>>> a is b
True
>>> a = 257
>>> b = 257
>>> a is b
False

But as has hopefully been made clear, all this is purely an
implementation detail. (Indeed, the range of "interned" integers
changed from [0, 99] to [-5, 256] a few versions ago.) So don't,
under any circumstances, rely on it, even when you understand
what's going on.
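
To make the advice concrete, here is a small sketch of the safe
alternatives: compare values with ==, and if identity really is wanted,
ask for it explicitly. It assumes Python 3, where the interning function
is sys.intern; the variable names are illustrative only.

```python
import sys

# Build two equal strings at run time; neither is a valid identifier.
parts = ["not an", "identifier!"]
a = " ".join(parts)
b = " ".join(parts)

# Equality is the portable test and does not depend on interning.
print(a == b)  # True

# Whether "a is b" holds here is up to the implementation, so it is
# deliberately not tested. Explicit interning is what guarantees a
# single shared instance.
ia = sys.intern(a)
ib = sys.intern(b)
print(ia is ib)  # True
```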



Re: speeding up reading files (possibly with cython)

2009-03-09 Thread S Arrowsmith
Carl Banks   wrote:
>When building a very large structure like you're doing, the cyclic
>garbage collector can be a bottleneck.  Try disabling the cyclic
>garbage collector before building the large dictionary, and re-
>enabling it afterwards.
>
>import gc
>gc.disable()
>try:
>    for line in file:
>        split_values = line.strip().split('\t')
>        # do stuff with split_values
>finally:
>    gc.enable()

Completely untested, but if you find yourself doing that a lot,
might:

import gc
from contextlib import contextmanager

@contextmanager
def no_gc():
    gc.disable()
    try:
        yield
    finally:
        # re-enable even if the body raises, as in the try/finally above
        gc.enable()

with no_gc():
    for line in file:
        # ... etc.

be worth considering?
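
A slightly more defensive variant, offered as a sketch rather than a
tested recipe: it remembers whether the collector was enabled on entry,
so nesting it (or using it where GC is already off) does not switch the
collector back on by surprise. The names gc_paused and build_table are
hypothetical, and the tab-separated two-column input is an assumption.

```python
import gc
from contextlib import contextmanager

@contextmanager
def gc_paused():
    """Disable the cyclic collector for the block, then restore its
    previous state, even if the block raises."""
    was_enabled = gc.isenabled()
    gc.disable()
    try:
        yield
    finally:
        if was_enabled:
            gc.enable()

def build_table(lines):
    """Build a dict from tab-separated key/value lines with GC paused."""
    table = {}
    with gc_paused():
        for line in lines:
            key, value = line.strip().split("\t")
            table[key] = value
    return table

table = build_table(["a\t1\n", "b\t2\n"])
print(table)  # {'a': '1', 'b': '2'}
```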



Re: Why is lambda allowed as a key in a dict?

2009-03-10 Thread S Arrowsmith
Iain King   wrote:
>Sort of tangentially; is there any real difference between the outcome
>of the two following pieces of code?
>
>a = lambda x: x+2
>
>def a(x):
>    return x+2

a.__name__

As for why that matters, try a(None) and see which gives the more
informative traceback.
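
A quick way to see both effects side by side, as a sketch: the two
functions are given different names here (add_two_lambda, add_two_def,
both made up) so they can coexist, and the helper failing_frame_name is
likewise hypothetical. It assumes Python 3 for exc.__traceback__.

```python
import traceback

add_two_lambda = lambda x: x + 2

def add_two_def(x):
    return x + 2

print(add_two_lambda.__name__)  # <lambda>
print(add_two_def.__name__)     # add_two_def

def failing_frame_name(func):
    """Call func(None) and return the function name recorded in the
    innermost traceback frame."""
    try:
        func(None)
    except TypeError as exc:
        return traceback.extract_tb(exc.__traceback__)[-1].name

# The name shown in the traceback is the one from __name__.
lambda_frame = failing_frame_name(add_two_lambda)
def_frame = failing_frame_name(add_two_def)
print(lambda_frame)  # <lambda>
print(def_frame)     # add_two_def
```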



Re: String to sequence

2009-03-16 Thread S Arrowsmith
Peter Otten wrote:
>assert s.startswith("[")
>assert s.endswith("]")
>s = s[1:-1]

s.strip('[]')

(I suppose it all depends on how much you can trust the consistency
of the input.)
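
To illustrate where the two approaches part company, a short sketch with
made-up inputs: slicing removes exactly one character from each end,
while strip('[]') removes every leading and trailing bracket it finds,
and removes nothing if there are none.

```python
flat = "[1, 2, 3]"
nested = "[[1, 2], [3, 4]]"
bare = "1, 2, 3"

# On well-formed, non-nested input the two agree.
print(flat[1:-1])         # 1, 2, 3
print(flat.strip("[]"))   # 1, 2, 3

# On nested input strip() eats the inner brackets at the ends too.
print(nested[1:-1])        # [1, 2], [3, 4]
print(nested.strip("[]"))  # 1, 2], [3, 4

# On input with no brackets, slicing silently drops real characters
# (hence the asserts in the quoted version); strip() leaves it alone.
print(bare[1:-1])         # , 2, 
print(bare.strip("[]"))   # 1, 2, 3
```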



Re: Creating 50K text files in python

2009-03-18 Thread S Arrowsmith
[email protected]  wrote:
>FName = "TextFile"+c+"_"+d+"_"+p+".txt"
>l = 1
>for l in range(1 , 11):
>    os.system ("\"echo "+FName+" >> "+FName+"\"")
>    l = l +1

1. os.system spawns a new process, which on Windows (I'm guessing
you're on Windows given the full path names you've given) is a
particularly expensive operation.

2. You really do have a problem with loops. Not how many your
task has landed you, but the way you're writing them. You
don't need to pre-assign the loop variable, you don't need to
increment it, and it's generally clearer if you use a 0-based
range and add 1 when you use the variable if you need it to be
1-based.

So rewrite the above as:

fname = "TextFile%s_%s_%s.txt" % (c, d, p)
f = open(fname, 'a')
for l in range(10):
    f.write(fname + "\n")  # echo ends each line with a newline
f.close()

Also:

3. All the logging with m.write is likely to be slowing things
down. It might be useful to see just how much of an impact that's
having.

4. Lose the globals and pass them as function arguments.
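
For point 4, something along these lines might do. It is only a sketch:
the function name write_text_file, its parameters, and the use of a
temporary directory are all assumptions made for the example, not taken
from the original script.

```python
import os
import tempfile

def write_text_file(directory, c, d, p, repeats=10):
    """Append the file's own name to it `repeats` times and return the
    path. Everything the function needs arrives as an argument."""
    fname = "TextFile%s_%s_%s.txt" % (c, d, p)
    path = os.path.join(directory, fname)
    with open(path, "a") as f:
        for _ in range(repeats):
            f.write(fname + "\n")
    return path

# Usage: write into a scratch directory instead of a hard-coded path.
out_dir = tempfile.mkdtemp()
path = write_text_file(out_dir, "1", "2", "3")
print(os.path.basename(path))  # TextFile1_2_3.txt
```

Because nothing is read from module-level state, the nested loops over
c, d and p can simply call the function with the current values.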


Re: How to do this in Python? - A "gotcha"

2009-03-18 Thread S Arrowsmith
Jim Garrison   wrote:
>It's a shame the iter(o,sentinel) builtin does the
>comparison itself, instead of being defined as iter(callable,callable)
>where the second argument implements the termination test and returns a
>boolean.  This would seem to add much more generality... is
>it worthy of a PEP?

class sentinel:
    def __eq__(self, other):
        return termination_test()

for x in iter(callable, sentinel()):
    ...

Writing a sensible sentinel.__init__ is left as an exercise.
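
One possible answer to the exercise, as a sketch only: let __init__ take
the termination test as a predicate and apply it to each value in
__eq__. The class name Until and the sample data are made up, and the
trick assumes iter() ends up calling the sentinel's __eq__ with the
produced value, which it does for ordinary built-in values in CPython.

```python
class Until(object):
    """Sentinel for iter(callable, sentinel) that compares equal to any
    value for which the supplied predicate is true."""

    def __init__(self, predicate):
        self.predicate = predicate

    def __eq__(self, other):
        return bool(self.predicate(other))

# Hypothetical source of values, wrapped as a zero-argument callable.
values = iter([3, 1, 4, 1, 5, 9, 2, 6])

def next_value():
    return next(values)

# Iteration stops at the first value greater than 4, which is consumed
# but not yielded.
taken = list(iter(next_value, Until(lambda v: v > 4)))
print(taken)  # [3, 1, 4, 1]
```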
