Regex for unicode letter characters

2009-01-10 Thread schickb
I need a regex that will match strings containing only unicode letter
characters (not including numeric or the _ character). I was surprised
to find the 're' module does not include a special character class for
this already (python 2.6). Or did I miss something?

It seems like this would be a very common need. Is the following the
only option to generate the character class (based on an old post by
Martin v. Löwis )?

import unicodedata, sys

def letters():
    start = end = None
    result = []
    for index in xrange(sys.maxunicode + 1):
        c = unichr(index)
        if unicodedata.category(c)[0] == 'L':
            if start is None:
                start = end = c
            else:
                end = c
        elif start:
            if start == end:
                result.append(start)
            else:
                result.append(start + "-" + end)
            start = None
    return u'[' + u''.join(result) + u']'

Seems rather cumbersome.
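
For comparison, here is a minimal sketch of two shorter checks. It is written for Python 3 rather than the 2.6 used above, and the names is_all_letters and LETTERS_APPROX are made up for illustration. The first check asks unicodedata for each character's category directly. The second uses the character class [^\W\d_] ("word characters minus digits minus underscore"), which should be treated as an approximation only: \w may admit some numeric characters that \d does not remove.

```python
import re
import unicodedata


def is_all_letters(text):
    """Return True if text is non-empty and every character is in a
    Unicode letter category (Lu, Ll, Lt, Lm, Lo)."""
    return bool(text) and all(
        unicodedata.category(ch).startswith("L") for ch in text
    )


# Approximation: word characters, excluding decimal digits and underscore.
# Hypothetical name; not an exact equivalent of the category check above.
LETTERS_APPROX = re.compile(r"[^\W\d_]+")

examples = ["Löwis", "abc1", "a_b", ""]
category_results = [is_all_letters(s) for s in examples]
regex_results = [LETTERS_APPROX.fullmatch(s) is not None for s in examples]
```

The category check avoids building a huge character class at all, at the cost of not being usable inside a larger regex.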

-Brad
--
http://mail.python.org/mailman/listinfo/python-list


new.instancemethod questions

2009-01-29 Thread schickb
I'd like to add bound functions to instances, and found the
instancemethod function in the new module. A few questions:

1. Why is instancemethod even needed? It's counter-intuitive (to me at
least) that assigning a function to a class results in bound functions
on its instances, while assigning directly to instances does not create
a bound function. So why doesn't assigning a function to an instance
attribute result in a function bound to that instance?

2. The 2.6 docs say the new module is deprecated and refer to the
types module instead. But I haven't found a way to create bound
functions using the types module. Am I just missing something?

Thanks,
-Brad


Re: new.instancemethod questions

2009-01-29 Thread schickb
On Jan 29, 7:38 pm, Mel wrote:
> schickb wrote:
> > I'd like to add bound functions to instances, and found the
> > instancemethod function in the new module. A few questions:
>
> > 1. Why is instancemethod even needed? It's counter-intuitive (to me at
> > least) that assigning a function to a class results in bound functions
> > on its instances, while assigning directly to instances does not create a
> > bound function. So why doesn't assigning a function to an instance
> > attribute result in a function bound to that instance?
>
> If I understand you correctly, rebinding to the instance would break code
> like:
>
> myfakefile.write = sys.stdout.write
>
> where the intent would be to redirect any output through myfakefile straight
> to sys.stdout.  The code for the sys.stdout.write function would never find
> the attributes it needed in the instance of myfakefile.  To do this,
> methods have to stay bound to their proper instances.
>

1. I'm thinking about assigning free non-bound functions. Like:

class A(object):
   pass

def func(self):
   print repr(self)

a = A()
a.func = func  # Why doesn't this automatically create a bound
               # function (aka method)?


2. And what is the preferred way to do this if the "new" module and
its instancemethod function are deprecated?
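
For reference, a small sketch of the difference, written for Python 3 (an assumption; the thread itself is about 2.6). Functions are descriptors: when an attribute is found on the class, lookup calls the function's __get__ and hands back a bound method, whereas a function stored in the instance's own __dict__ is returned unchanged. The attribute names plain, bound and also_bound are only illustrative.

```python
import types


class A:
    pass


def func(self):
    return self


a = A()

# Plain assignment stores the function in a.__dict__; no binding happens,
# so calling it without an argument fails.
a.plain = func
try:
    a.plain()
    plain_call_failed = False
except TypeError:
    plain_call_failed = True

# Explicit binding with types.MethodType.
a.bound = types.MethodType(func, a)

# The same binding done through the descriptor protocol, which is the
# step that attribute lookup on the class performs implicitly.
a.also_bound = func.__get__(a, A)

bound_result = a.bound()
also_bound_result = a.also_bound()
```

Both explicit forms return a method whose self is a, so a.bound() and a.also_bound() each return the instance.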


-Brad


Re: new.instancemethod questions

2009-01-29 Thread schickb
On Jan 29, 8:51 pm, Brian Allen Vanderburg II
 wrote:
> You can also create a bound method and manually bind it to the
> instance.  This is easier
>
> import types
> a.f2 = types.MethodType(f1, a)
>
> a.f2() # prints object a

Ah thanks, that is what I was looking for. I missed that because
following types.MethodType in the docs is:

types.UnboundMethodType
An alternate name for MethodType

Which made me think it was a type for UnboundMethods (aka functions).
This:
>>> help(types.UnboundMethodType)

clears it up for me, but the docs are rather confusing.


> These may work for most uses, but both have a problem that happens if
> you need to make a copy of the instance.  When you copy it, the copy's
> 'f1' will still call the function but using the old object
>
> a.f1() # prints object a
> b = copy.copy(a)
> b.f1() # still prints a
>

Ugh, that is a problem. I guess that means pickling won't work
either.

Nope, "TypeError: can't pickle instancemethod objects". So does this
mean there is no way to create a method on an instance at runtime that
behaves just like a method that originated from the instance's class?
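
One possible workaround, sketched here for Python 3 and not taken from the thread: instead of storing a bound method on the instance, give the instance its own dynamically created subclass that carries the function. The helper name add_method is hypothetical. Because the function then lives on the class, a shallow copy binds it to the copy rather than to the original object.

```python
import copy
import types


class A:
    pass


def whoami(self):
    return self


# The problem from the quoted reply: the bound method keeps pointing at 'a'.
a = A()
a.f1 = types.MethodType(whoami, a)
b = copy.copy(a)
copy_calls_original = b.f1() is a


def add_method(obj, name, func):
    """Hypothetical helper: move obj onto a one-off subclass of its
    class that has func as a regular method."""
    cls = obj.__class__
    obj.__class__ = type(cls.__name__, (cls,), {name: func})


c = A()
add_method(c, "f1", whoami)
d = copy.copy(c)
copy_calls_itself = d.f1() is d
still_an_A = isinstance(c, A)
```

The trade-off is that every patched instance gets its own class, and such a class is not importable by name, so pickling would presumably still need extra work.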

-Brad


Popen pipe hang

2008-05-12 Thread schickb
I'm trying to pipe data that starts life in an array('B') object
through several processes. The code below is a simplified example. The
data makes it through, but the wait() always hangs. Is there a better
way to indicate src.stdin has reached EOF?

from subprocess import Popen, PIPE
from array import array

arr = array('B')
arr.fromstring("hello\n")

src = Popen( ["cat"], stdin=PIPE, stdout=PIPE)
dst = Popen( ["cat"], stdin=src.stdout)
arr.tofile(src.stdin)
src.stdin.close()
dst.wait()


Re: Popen pipe hang

2008-05-12 Thread schickb
On May 12, 7:35 pm, Jean-Paul Calderone wrote:
> >from subprocess import Popen, PIPE
> >from array import array
>
> >arr = array('B')
> >arr.fromstring("hello\n")
>
> >src = Popen( ["cat"], stdin=PIPE, stdout=PIPE)
> >dst = Popen( ["cat"], stdin=src.stdout)
> >arr.tofile(src.stdin)
> >src.stdin.close()
> >dst.wait()
>
> Alas, you haven't actually closed src's standard input.  Though you did
> close src.stdin, you didn't close the copy of it which was inherited by
> dst!  So though the file descriptor is no longer open in your main process,
> it remains open due to the reference dst has to it.  You can fix this by
> having Popen close all file descriptors except 0, 1, and 2 before it execs
> cat - pass close_fds=True to the 2nd Popen call and you should get the
> behavior you want.
>

Thanks, that did the trick. Although I assume that by passing
close_fds=True the second Popen is actually closing src.stdout (rather
than src.stdin as mentioned)? With close_fds=True, is Python buffering
the data even though bufsize=0 is the default? Otherwise I don't see
how it could close the fd before executing the process.
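
A sketch of the fixed pipeline, written for Python 3 (where close_fds already defaults to True on POSIX, so passing it is only for emphasis). As far as I understand it, the descriptors are closed in the child process after the fork and before the exec, so no buffering in the parent is involved. To keep the example self-contained, CAT is a made-up stand-in that runs a tiny Python copy loop in place of the real cat command.

```python
import sys
from array import array
from subprocess import Popen, PIPE

# Stand-in for ["cat"]: copy stdin to stdout unchanged (assumption made
# so the sketch does not depend on an external cat binary).
CAT = [sys.executable, "-c",
       "import sys; sys.stdout.buffer.write(sys.stdin.buffer.read())"]

arr = array('B', b"hello\n")

src = Popen(CAT, stdin=PIPE, stdout=PIPE)
# close_fds=True keeps dst from inheriting the write end of src's stdin.
dst = Popen(CAT, stdin=src.stdout, stdout=PIPE, close_fds=True)

# The parent no longer needs its copy of the pipe between src and dst.
src.stdout.close()

src.stdin.write(arr.tobytes())
src.stdin.close()  # src now sees EOF because no other copy stays open

output, _ = dst.communicate()  # read dst's output and wait for it
src.wait()
```

With every stray copy of the pipe closed, EOF propagates from src to dst and both waits return.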


Sequence iterators with __index__

2008-06-24 Thread schickb
I think it would be useful if iterators on sequences had the __index__
method so that they could be used to slice sequences. I was writing a
class and wanted to return a list iterator to callers.  I then wanted
to let callers slice from an iterator's position, but that isn't
supported without creating a custom iterator class.

Are there reasons for not supporting this generally? I realize not all
iterators would have the __index__ method, but that seems ok.

In Python 3, maybe this could be called a SequenceIterator.

-Brad


Re: Sequence iterators with __index__

2008-06-24 Thread schickb
On Jun 24, 3:45 pm, Matimus wrote:
>
> > I think it would be useful if iterators on sequences had the __index__
> > method so that they could be used to slice sequences. I was writing a
> > class and wanted to return a list iterator to callers.  I then wanted
> > to let callers slice from an iterator's position, but that isn't
> > supported without creating a custom iterator class.
>
> Could you post an example of what you are talking about? I'm not
> getting it.

Interactive mock-up:

>>> a = ['x','y','z']
>>> it = iter(a)
>>> a[it:]
['x', 'y', 'z']
>>> it.next()
'x'
>>> a[it:]
['y', 'z']
>>> a[:it]
['x']
>>> it.next()
'y'
>>> a[it:]
['z']

This lets you use sequence iterators as more general position
indicators. Currently, if you want to track a position and slice from
it, you must do so manually with an integer index. It's not difficult,
but given that sequence iterators already track a position, it seems
redundant (and of course more error prone).
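
The custom iterator class mentioned earlier could look roughly like this. It is a sketch for Python 3 (so next(it) replaces it.next()), and it is a user-level class, not a change to the built-in list iterator. The class name follows the SequenceIterator suggestion from the first post; the attribute names are made up.

```python
class SequenceIterator:
    """Iterator over a sequence that also reports its position
    through __index__, so it can be used in slices and subscripts."""

    def __init__(self, seq):
        self._seq = seq
        self._pos = 0

    def __iter__(self):
        return self

    def __next__(self):
        if self._pos >= len(self._seq):
            raise StopIteration
        item = self._seq[self._pos]
        self._pos += 1
        return item

    def __index__(self):
        # Position of the item that the next call to next() would return.
        return self._pos


a = ['x', 'y', 'z']
it = SequenceIterator(a)
before = a[it:]          # nothing consumed yet
first = next(it)
tail = a[it:]            # everything not yet consumed
head = a[:it]            # everything already consumed
```

This reproduces the mock-up session, at the price of the redundant index bookkeeping the proposal is trying to avoid.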


> In any case, the first step is writing a PEP. http://www.python.org/dev/peps/
>

Ok thanks, but I do want some idea of interest level before spending a
bunch of time on this.

-Brad


Re: Sequence iterators with __index__

2008-06-25 Thread schickb
On Jun 24, 5:46 pm, Terry Reedy wrote:
>
> Wanting to slice while iterating is a *very* specialized usage.

I disagree because iterators mark positions, which for sequences are
just offsets. And slicing is all about offsets. Here is a quote from
the already implemented PEP 357:

"Currently integers and long integers play a special role in slicing
in that they are the only objects allowed in slice syntax. In other
words, if X is an object implementing the sequence protocol, then
X[obj1:obj2] is only valid if obj1 and obj2 are both integers or long
integers.  There is no way for obj1 and obj2 to tell Python that they
could be reasonably used as indexes into a sequence.  This is an
unnecessary limitation."

But this isn't just about slicing. I'd like sequence iterators to be
usable as simple indexes as well; like a[it] (which __index__ would
also provide).

> In any case:
> A. If the iterator uses an incrementing index to iterate, you want access.
> B. Using an iterator as an integer will strike most people as
> conceptually bizarre; it will never be accepted.

It's not meant to be used as an integer. It's meant to be used as a
position in the sequence, which iterators already are. The fact that
the position is represented as an integer is not that important
(except to python). I'll grant you that it is conceptually strange
that you could use an iterator on one sequence as an index into
another.

> C. Doing so is unnecessary since the internal index can just as easily
> be exposed as an integer attribute called 'index' or, more generally,
> 'count'.
> a[it.count:] looks *much* better.
> D. You can easily add .index or .count to any iterator you write.  The
> iterator protocol is a minimum rather than maximum specification.

Following that line of reasoning, the __index__ special method
shouldn't really exist at all. Your arguments would suggest that NumPy
shouldn't use __index__ either because:
a[ushort.index] "looks *much* better".

> E. You can easily wrap any iterable/iterator in an iterator class that
> provides .count for *any* iteration process.

Sure, and that is why I mentioned this in my original post. But the
idea is to avoid redundant code and data in the case of sequences, and
make it a standard feature.

> F. Even this should be unnecessary for most usages.  Built-in function
> enumerate(iterable) generates count,item pairs in much the same manner:

I am not aware of a way to get the current position out of an
enumerate object without advancing it (or creating a custom wrapper).
If the special __index__ method was added it might be interesting ;)
But iterators are already a clean and abstract position marker, and
for sequences it seems surprising to me that they can't really be used
as such.
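
For comparison, the wrapper described in points D and E might look roughly like this in Python 3. The class name CountingIterator is made up; the count attribute is the name suggested in the quoted reply. Unlike enumerate, it lets you read the current position without advancing.

```python
class CountingIterator:
    """Wrap any iterable and expose how many items have been consumed."""

    def __init__(self, iterable):
        self._it = iter(iterable)
        self.count = 0

    def __iter__(self):
        return self

    def __next__(self):
        item = next(self._it)  # StopIteration propagates unchanged
        self.count += 1
        return item


a = ['x', 'y', 'z']
it = CountingIterator(a)
start_slice = a[it.count:]   # position read without advancing
first = next(it)
rest_slice = a[it.count:]
```

It works for any iterable, but for sequences it duplicates a position that the underlying iterator already tracks, which is the redundancy I'm objecting to.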

-Brad


Re: Sequence iterators with __index__

2008-06-25 Thread schickb
On Jun 25, 12:11 am, schickb wrote:
>
> But this isn't just about slicing. I'd like sequence iterators to be
> usable as simple indexes as well; like a[it] (which __index__ would
> also provide).

It occurred to me that this wouldn't need to be limited to sequence
iterators. Although somewhat of a misnomer, __index__ could just as
well return the current key for mapping iterators. Type checking would
then be specific to the context, rather than hard-coded to integer.
Perhaps __position__ would have been a better name.

The generalized idea here is that iterators identify positions in
collections, so why shouldn't they be usable where collections accept
such identifiers?

>>> m = {'a':1, 'b':2}
>>> it = iter(m)
>>> m[it]
1
>>> it.next()
'a'
>>> m[it]
2

These are trivial examples, but there are lots of uses for abstract
position indicators. From what I've seen, iterators are currently used
almost exclusively as temporary objects in loops. But perhaps if they
had a bit more functionality they could serve a wider purpose.

-Brad


Sequence splitting

2009-07-02 Thread schickb
I have fairly often found the need to split a sequence into two groups
based on a function result. Much like the existing filter function,
but returning a tuple of true, false sequences. In Python, something
like:

def split(seq, func=None):
    if func is None:
        func = bool
    t, f = [], []
    for item in seq:
        if func(item):
            t.append(item)
        else:
            f.append(item)
    return (t, f)

The discussion linked to below has various approaches for doing this
now, but most traverse the sequence twice and many don't apply a
function to split the sequence.
http://stackoverflow.com/questions/949098/python-split-a-list-based-on-a-condition
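
To illustrate why the double traversal matters, here is a small Python 3 sketch. The names split_two_pass, split_one_pass and is_even are made up for the comparison, and func defaults to bool directly instead of None. The two-pass version calls func twice per item and silently loses data when given a one-shot iterator.

```python
def split_two_pass(seq, func=bool):
    """Two list comprehensions: traverses seq twice."""
    return [x for x in seq if func(x)], [x for x in seq if not func(x)]


def split_one_pass(seq, func=bool):
    """Single traversal; works for any iterable, including generators."""
    t, f = [], []
    for item in seq:
        (t if func(item) else f).append(item)
    return t, f


def is_even(n):
    return n % 2 == 0


numbers = [1, 2, 3, 4, 5]
from_list = split_two_pass(numbers, is_even)
from_iter_one = split_one_pass(iter(numbers), is_even)
# The first comprehension exhausts the iterator, so the second sees nothing.
from_iter_two = split_two_pass(iter(numbers), is_even)
```

On a list both versions agree, but on an iterator only the single-pass version returns the full false group.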

Is there any interest in a C implementation of this? Seems too trivial
to write a PEP, so I'm just trying to measure interest before diving
in. This wouldn't really belong in itertools. Would it be best
implemented as a top-level built-in?

-Brad