Re: [Numpy-discussion] Fwd: Advanced selection, duplicate indices, and augmented assignment

A. M. Archibald Wed, 10 Jan 2007 20:18:15 -0800

On 10/01/07, Charles R Harris <[EMAIL PROTECTED]> wrote:

> You're the second or third person to recently come up with a useful
application for using inplace ops with repeated indices only to be stymied
by that fact that they don't actually work that way. I've been wondering if
we should try to come up with a way to support this usage. Although it might
be feasible to make += work, the implementation is horrible, and there are
all sorts of inefficiencies built into that method of doing it.
>
> The natural approach would seem to be to hang another method off the
ufuncs. That way you get support for all of the various operations. I was
thinking something like:
>
>
> unfunc.inplace (dest, indices, source)

Yeah, something like that. A general indirect addressing primitive of some
sort would be useful for these situations. The indices are sort of pointers
into the destination, the problem is how to bind this to operators. It
should almost be some sort lvalue analogous to **p = whatever, or in this
case dest[index[j]] {assignment} src[j]. So maybe indirect(dest, indices,
source, op). Then there is all the madness of fancy indexing (which I think
I could have done without). Because the current applications are limited and
could be met with the simplest 1D version, it would probably be best to
start with the most basic functionality and then extend it if the need
arose. Or maybe even some simple c extension that did the whole basic +=
thing for one dimension.


I disagree that it should be hung off ufuncs. That doesn't allow, for
example, averaging of normal vectors (without some highly clever and
hard-to-follow axis specification).

I implemented one possible version in pure python, just to think about
the UI (and to see if it could be made fast using sorting; I don't
think it will actually help much, and I also note that whether it even
has any hope depends on whether there are more objects to accumulate
than destination slots, or fewer). The basic design is

def accumulate(dest, indices, values, combine=N.add);

indices should be a one-dimensional list of indices, though since dest
can be an n-dimensional array of r-dimensional values, it's actually a
two-dimensional array; values is a list of r-dimensional values, and
combine should take two r-dimensional values and yield another.

The reason for the r-dimensional values is to allow, for example,
averaging of normals, or combination in some other way of colours, or
summing of vectors (my original application).

This implementation does not allow multidimensional arrays of indices,
partly because it would be cumbersome to iterate over them, partly
because a reshape (and possible copy) is not terribly offensive, and
partly because it would make the calling convention (more) confusing.

I am not seriously suggesting this implementation for use, but for
consideration; if people find the primitive does what they want, one
could reasonably simply implement it in C.

A. M. Archibald

import numpy as N

def accumulate(dest, indices, values, combine=N.add):
    """Accumulate values based on an index array

    dest should be an n+1-dimensional array, whose last dimension is l
    indices should be a 2-dimensional array, of size m by n, 
    values should be a 1-dimensional array, of size m by l, 
    combine should be a function for combining two vectors of length l to give a third

    Iterate over values, doing (approximately)
	dest[tuple(indices[i,:]),:]=combine(dest[tuple(indices[i,:]),:],values[i,:])

    Currently inefficient.
    """

    if N.shape(dest)[-1]!=N.shape(values)[-1]:
	raise ValueError, "Incompatible element lengths"
    (m,n) = N.shape(indices)
    (m2,l) = N.shape(values)
    if m2!=m:
	raise ValueError, "Indices and value are not compatible"
    if n!=len(N.shape(dest))-1:
	# FIXME: consider adding a new axis if l=1?
	raise ValueError, "Incorrect number of indices for dest"


    for i in xrange(m):
	t = tuple(indices[i,:])+(slice(None),)
	dest[t]=combine(dest[t],values[i])

    
# for use in speeding things up
def arglexsort(a):
    """argsort a lexicographically

    a should be two-dimensional array; it is sorted along the first axis
    """
    a = N.asarray(a)
    (m,n) = N.shape(a)
    perm = N.arange(m)
    for i in range(n)[::-1]:
	values = a[perm,i]
	s = N.argsort(values)
	print perm, s
	perm = perm[s]

    return perm

_______________________________________________
Numpy-discussion mailing list
Numpy-discussion@scipy.org
http://projects.scipy.org/mailman/listinfo/numpy-discussion

Re: [Numpy-discussion] Fwd: Advanced selection, duplicate indices, and augmented assignment

Reply via email to