Regex for unicode letter characters
I need a regex that will match strings containing only unicode letter
characters (not including numeric or the _ character). I was surprised
to find the 're' module does not include a special character class for
this already (python 2.6). Or did I miss something? It seems like this
would be a very common need.

Is the following the only option to generate the character class (based
on an old post by Martin v. Löwis)?

import unicodedata, sys

def letters():
    start = end = None
    result = []
    for index in xrange(sys.maxunicode + 1):
        c = unichr(index)
        if unicodedata.category(c)[0] == 'L':
            if start is None:
                start = end = c
            else:
                end = c
        elif start:
            if start == end:
                result.append(start)
            else:
                result.append(start + "-" + end)
            start = None
    return u'[' + u''.join(result) + u']'

Seems rather cumbersome.

-Brad
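P.S. For context, here is a rough sketch of how I'd use the generated
class (the names are mine, and the pattern just checks that a string is
letters only):

import re

LETTERS = letters()  # the big character class built above
ONLY_LETTERS = re.compile(u'^' + LETTERS + u'+$')

print bool(ONLY_LETTERS.match(u'abc'))    # True
print bool(ONLY_LETTERS.match(u'abc_1'))  # False: contains '_' and a digit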
new.instancemethod questions
I'd like to add bound functions to instances, and found the
instancemethod function in the new module. A few questions:

1. Why is instancemethod even needed? It's counter-intuitive (to me at
least) that assigning a function to a class results in bound functions
on its instances, while assigning directly to instances does not create
a bound function. So why doesn't assigning a function to an instance
attribute result in a function bound to that instance?

2. The 2.6 docs say the new module is deprecated and refer to the types
module instead. But I haven't found a way to create bound functions
using the types module. Am I just missing something?

Thanks,
-Brad
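P.S. Here is a rough sketch of what I'm doing now with the new module,
just to make the question concrete (the class and function names are
made up):

import new

class A(object):
    pass

def func(self):
    print repr(self)

a = A()
a.method = new.instancemethod(func, a, A)  # explicitly bind func to this instance
a.method()  # prints something like <__main__.A object at 0x...>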
Re: new.instancemethod questions
On Jan 29, 7:38 pm, Mel wrote:
> schickb wrote:
> > I'd like to add bound functions to instances, and found the
> > instancemethod function in the new module. A few questions:
>
> > 1. Why is instancemethod even needed? It's counter-intuitive (to me at
> > least) that assigning a function to a class results in bound functions
> > on its instances, while assigning directly to instances does not create
> > a bound function. So why doesn't assigning a function to an instance
> > attribute result in a function bound to that instance?
>
> If I understand you correctly, rebinding to the instance would break code
> like:
>
>     myfakefile.write = sys.stdout.write
>
> where the intent would be to redirect any output through myfakefile
> straight to sys.stdout. The code for the sys.stdout.write function would
> never find the attributes it needed in the instance of myfakefile. To do
> this, methods have to stay bound to their proper instances.

1. I'm thinking about assigning free non-bound functions. Like:

class A(object):
    pass

def func(self):
    print repr(self)

a = A()
a.func = func  # Why doesn't this automatically create a bound function (aka method)?

2. And what is the preferred way to do this if the "new" module and its
instancemethod function are deprecated?

-Brad
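P.S. In case it helps, this is roughly the behavior I'm describing with
the plain assignment above (error text paraphrased from Python 2.6):

a.func(a)   # works, but the caller has to pass the instance explicitly
a.func()    # TypeError: func() takes exactly 1 argument (0 given)

A.other = func  # the same function assigned to the *class*...
a.other()       # ...is looked up as a bound method and just works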
Re: new.instancemethod questions
On Jan 29, 8:51 pm, Brian Allen Vanderburg II wrote:
> You can also create a bound method and manually bind it to the
> instance. This is easier:
>
> import types
> a.f2 = types.MethodType(f1, a)
>
> a.f2()  # prints object a

Ah thanks, that is what I was looking for. I missed that because
following types.MethodType in the docs is:

    types.UnboundMethodType
        An alternate name for MethodType

which made me think it was a type for unbound methods (aka functions).
This:

>>> help(types.UnboundMethodType)

clears it up for me, but the docs are rather confusing.

> These may work for most uses, but both have a problem that happens if
> you need to make a copy of the instance. When you copy it, the copy's
> 'f1' will still call the function but using the old object:
>
> a.f1()  # prints object a
> b = copy.copy(a)
> b.f1()  # still prints a

Ugh, that is a problem. I guess that means pickling won't work either.
Nope, "TypeError: can't pickle instancemethod objects".

So does this mean there is no way to create a method on an instance at
runtime that behaves just like a method that originated from the
instance's class?

-Brad
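P.S. One workaround sketch I've been toying with (add_method and _funcs
are names I made up): store the plain function on the instance and bind
it with the descriptor protocol at lookup time, so a copy re-binds to
itself:

import copy

class A(object):
    def __getattr__(self, name):
        # Only called when normal lookup fails; bind any stored plain
        # function to *this* instance at access time.
        funcs = self.__dict__.get('_funcs', {})
        if name in funcs:
            return funcs[name].__get__(self, type(self))
        raise AttributeError(name)

    def add_method(self, name, func):
        self.__dict__.setdefault('_funcs', {})[name] = func

def func(self):
    print repr(self)

a = A()
a.add_method('func', func)
a.func()           # bound to a
b = copy.copy(a)
b.func()           # bound to the copy, not to a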
Popen pipe hang
I'm trying to pipe data that starts life in an array('B') object
through several processes. The code below is a simplified example. The
data makes it through, but the wait() always hangs. Is there a better
way to indicate src.stdin has reached EOF?
from subprocess import Popen, PIPE
from array import array
arr = array('B')
arr.fromstring("hello\n")
src = Popen( ["cat"], stdin=PIPE, stdout=PIPE)
dst = Popen( ["cat"], stdin=src.stdout)
arr.tofile(src.stdin)
src.stdin.close()
dst.wait()
Re: Popen pipe hang
On May 12, 7:35 pm, Jean-Paul Calderone <[EMAIL PROTECTED]> wrote:
> >from subprocess import Popen, PIPE
> >from array import array
>
> >arr = array('B')
> >arr.fromstring("hello\n")
>
> >src = Popen( ["cat"], stdin=PIPE, stdout=PIPE)
> >dst = Popen( ["cat"], stdin=src.stdout)
> >arr.tofile(src.stdin)
> >src.stdin.close()
> >dst.wait()
>
> Alas, you haven't actually closed src's standard input. Though you did
> close src.stdin, you didn't close the copy of it which was inherited by
> dst! So though the file descriptor is no longer open in your main process,
> it remains open due to the reference dst has to it. You can fix this by
> having Popen close all file descriptors except 0, 1, and 2 before it execs
> cat - pass close_fds=True to the 2nd Popen call and you should get the
> behavior you want.
>
Thanks, that did the trick. Although I assume that by passing
close_fds=True the second Popen is actually closing src.stdout (rather
than src.stdin as mentioned)? With close_fds=True, is Python buffering
the data even though bufsize=0 is the default? Otherwise I don't see
how it could close the fd before executing the process.
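For the record, roughly the version that now works for me (the only
change is close_fds=True on the second Popen, as suggested):

from subprocess import Popen, PIPE
from array import array

arr = array('B')
arr.fromstring("hello\n")

src = Popen(["cat"], stdin=PIPE, stdout=PIPE)
dst = Popen(["cat"], stdin=src.stdout, close_fds=True)
arr.tofile(src.stdin)
src.stdin.close()
dst.wait()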
Sequence iterators with __index__
I think it would be useful if iterators on sequences had the __index__
method so that they could be used to slice sequences. I was writing a
class and wanted to return a list iterator to callers. I then wanted to
let callers slice from an iterator's position, but that isn't supported
without creating a custom iterator class.

Are there reasons for not supporting this generally? I realize not all
iterators would have the __index__ method, but that seems ok. In Python
3, maybe this could be called a SequenceIterator.

-Brad
Re: Sequence iterators with __index__
On Jun 24, 3:45 pm, Matimus <[EMAIL PROTECTED]> wrote:
> > I think it would be useful if iterators on sequences had the __index__
> > method so that they could be used to slice sequences. I was writing a
> > class and wanted to return a list iterator to callers. I then wanted
> > to let callers slice from an iterator's position, but that isn't
> > supported without creating a custom iterator class.
>
> Could you post an example of what you are talking about? I'm not
> getting it.

Interactive mock-up:

>>> a = ['x','y','z']
>>> it = iter(a)
>>> a[it:]
['x', 'y', 'z']
>>> it.next()
'x'
>>> a[it:]
['y', 'z']
>>> a[:it]
['x']
>>> it.next()
'y'
>>> a[it:]
['z']

This lets you use sequence iterators as more general position
indicators. Currently if you want to track a position and slice from a
tracked position you must do it manually with an integer index. It's
not difficult, but given that sequence iterators already do that, it
seems redundant (and of course more error prone).

> In any case, the first step is writing a PEP.
> http://www.python.org/dev/peps/

Ok thanks, but I do want some idea of interest level before spending a
bunch of time on this.

-Brad
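P.S. For reference, a rough sketch of the custom iterator class I end
up writing today (IndexedIter is my own name). Since __index__ already
exists via PEP 357, an iterator that implements it can be used in
slices right now:

class IndexedIter(object):
    """Iterate over a sequence while exposing the position via __index__."""
    def __init__(self, seq):
        self._seq = seq
        self._pos = 0
    def __iter__(self):
        return self
    def next(self):  # Python 2 iterator protocol
        if self._pos >= len(self._seq):
            raise StopIteration
        item = self._seq[self._pos]
        self._pos += 1
        return item
    def __index__(self):
        return self._pos

a = ['x', 'y', 'z']
it = IndexedIter(a)
it.next()   # 'x'
a[it:]      # ['y', 'z'] -- the slice calls __index__ on the iterator
a[:it]      # ['x']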
Re: Sequence iterators with __index__
On Jun 24, 5:46 pm, Terry Reedy <[EMAIL PROTECTED]> wrote:
>
> Wanting to slice while iterating is a *very* specialized usage.

I disagree, because iterators mark positions, which for sequences are
just offsets. And slicing is all about offsets. Here is a quote from
the already implemented PEP 357:

"Currently integers and long integers play a special role in slicing in
that they are the only objects allowed in slice syntax. In other words,
if X is an object implementing the sequence protocol, then X[obj1:obj2]
is only valid if obj1 and obj2 are both integers or long integers.
There is no way for obj1 and obj2 to tell Python that they could be
reasonably used as indexes into a sequence. This is an unnecessary
limitation."

But this isn't just about slicing. I'd like sequence iterators to be
usable as simple indexes as well, like a[it] (which __index__ would
also provide).

> In any case:
> A. If the iterator uses an incrementing index to iterate, you want access.
> B. Using an iterator as an integer will strike most people as
> conceptually bizarre; it will never be accepted.

It's not meant to be used as an integer. It's meant to be used as a
position in the sequence, which iterators already are. The fact that
the position is represented as an integer is not that important (except
to Python). I'll grant you that it is conceptually strange that you
could use an iterator on one sequence as an index into another.

> C. Doing so is unnecessary since the internal index can just as easily
> be exposed as an integer attribute called 'index' or, more generally,
> 'count'.
> a[it.count:] looks *much* better.
> D. You can easily add .index or .count to any iterator you write. The
> iterator protocol is a minimum rather than maximum specification.

Following that line of reasoning, the __index__ special method
shouldn't really exist at all. Your arguments would suggest that NumPy
shouldn't use __index__ either, because a[ushort.index] "looks *much*
better".

> E. You can easily wrap any iterable/iterator in an iterator class that
> provides .count for *any* iteration process.

Sure, and that is why I mentioned this in my original post. But the
idea is to avoid redundant code and data in the case of sequences, and
make it a standard feature.

> F. Even this should be unnecessary for most usages. Built-in function
> enumerate(iterable) generates count,item pairs in much the same manner:

I am not aware of a way to get the current position out of an enumerate
object without advancing it (or creating a custom wrapper). If the
special __index__ method were added it might be interesting ;) But
iterators are already a clean and abstract position marker, and for
sequences it seems surprising to me that they can't really be used as
such.

-Brad
Re: Sequence iterators with __index__
On Jun 25, 12:11 am, schickb <[EMAIL PROTECTED]> wrote:
>
> But this isn't just about slicing. I'd like sequence iterators to be
> usable as simple indexes as well; like a[it] (which __index__ would
> also provide).
It occurred to me that this wouldn't need to be limited to sequence
iterators. Although somewhat of a misnomer, __index__ could just as
well return the current key for mapping iterators. Type checking would
then be specific to the context, rather than hard-coded to integer.
Perhaps __position__ would have been a better name.
The generalized idea here is that iterators identify positions in
collections, so why shouldn't they be usable where collections accept
such identifiers?
>>> m = {'a':1, 'b':2}
>>> it = iter(m)
>>> m[it]
1
>>> it.next()
>>> m[it]
2
These are trivial examples, but there are lots of uses for abstract
position indicators. From what I've seen, iterators are currently used
almost exclusively as temporary objects in loops. But perhaps if they
had a bit more functionality they could serve a wider purpose.
-Brad
Sequence splitting
I have fairly often found the need to split a sequence into two groups
based on a function result. Much like the existing filter function, but
returning a tuple of (true, false) sequences. In Python, something
like:

def split(seq, func=None):
    if func is None:
        func = bool
    t, f = [], []
    for item in seq:
        if func(item):
            t.append(item)
        else:
            f.append(item)
    return (t, f)

The discussion linked to below has various approaches for doing this
now, but most traverse the sequence twice and many don't apply a
function to split the sequence.

http://stackoverflow.com/questions/949098/python-split-a-list-based-on-a-condition

Is there any interest in a C implementation of this? It seems too
trivial to write a PEP, so I'm just trying to measure interest before
diving in. This wouldn't really belong in itertools. Would it be best
implemented as a top-level built-in?

-Brad
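P.S. A quick usage example of the pure Python version above (the data
and predicates are just for illustration):

nums = [1, 2, 3, 4, 5, 6]
evens, odds = split(nums, lambda n: n % 2 == 0)
# evens == [2, 4, 6], odds == [1, 3, 5]

nonempty, empty = split(['a', '', 'b', ''])  # default predicate is bool
# nonempty == ['a', 'b'], empty == ['', '']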
