Hi all,
I've read some discussions about adding labeled axes, and even ticks, to numpy
arrays (such as in Luis' dataarray).
I have recently found that the ability to label axes would be very helpful to
me, but I'd like to keep the implementation as lightweight as possible.
I would find this useful because I am writing an ndarray subclass
that loads image/volume file formats into numpy arrays. Some of these files
may contain multiple images/volumes, which I'll call channels, and may also have
an additional dimension for vectors associated with each pixel/voxel, such as
color. The array would then have at most 5 dimensions.
Example: data = ndarray([1023,128,128,128,3]) might mean (channels,z,y,x,rgb)
for one array. I want to keep as much of numpy's fancy indexing capability
as I can, but I am finding it difficult to track the removal of axes
that can occur from indexing. For example, data[2,2] would return an array of
shape (128,128,3), the third z-slice of the third channel in the dataset,
but the returned array has lost the meaning associated with its axes, so saving
it back out would require manual relabeling of the axes. I'd like to be able
to track the axes as metadata and retain all the fancy numpy indexing.
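To make the problem concrete, here is a small sketch of how integer indices drop axes while length-1 slices keep them. The reduced shape and the variable names are assumptions chosen only to keep the example light:

```python
import numpy as np

# Smaller stand-in for the (channels, z, y, x, rgb) array described above.
data = np.zeros((4, 8, 8, 8, 3))

sub = data[2, 2]        # integer indices remove the first two axes
kept = data[2:3, 2:3]   # length-1 slices keep them as degenerate axes

print(sub.shape)    # (8, 8, 3)
print(kept.shape)   # (1, 1, 8, 8, 3)
```

Nothing in sub records that its remaining axes are (y, x, rgb), which is the bookkeeping I would like to automate.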
There are two ways I could accomplish this with minimal code on the Python side.
One would be if indexing the array always returned an array of the same
dimensionality, that is, if data[2,2] returned an array of shape (1,1,128,128,3).
I could then delete the labels of the degenerate axes from the metadata and return
the compressed array, resulting in the same output:
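    class Data(np.ndarray):
        def __getitem__(self, indices):
            # donotcompress is a hypothetical keyword (numpy has no such argument);
            # it stands for "keep every indexed axis as a length-1 axis".
            data = np.ndarray.__getitem__(self, indices, donotcompress=True)
            # as an example
            data.axeslabels = [label for label, dim in
                               zip(self.axeslabels, data.shape) if dim > 1]
            # squeeze() drops the length-1 axes
            return data.squeeze()
        def __getslice__(self, s1, s2, step):
            # trivial case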
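For plain integers and slices, the same-dimensionality idea can be approximated with numpy as it exists today by turning each integer into a length-1 slice. This is only a rough sketch under that assumption: the constructor, the _is_int helper, and the restriction to integers and slices are illustrative choices, and fancy indexing with lists or arrays is deliberately not handled:

```python
import numbers
import numpy as np


def _is_int(i):
    # Python bools are Integral too, but numpy treats them as masks, so exclude them.
    return isinstance(i, numbers.Integral) and not isinstance(i, bool)


class Data(np.ndarray):
    """Hypothetical sketch of an ndarray subclass carrying axis labels."""

    def __new__(cls, array, axeslabels):
        obj = np.asarray(array).view(cls)
        obj.axeslabels = tuple(axeslabels)
        return obj

    def __array_finalize__(self, obj):
        # New views start with the parent's labels (or none at all).
        self.axeslabels = getattr(obj, "axeslabels", ())

    def __getitem__(self, indices):
        if not isinstance(indices, tuple):
            indices = (indices,)
        if not all(_is_int(i) or isinstance(i, slice) for i in indices):
            raise NotImplementedError("this sketch handles integers and slices only")
        # Replace each integer i by the slice i:i+1 so that no axis is dropped.
        # `i + 1 or None` turns the stop for i == -1 into None instead of 0.
        keep = tuple(slice(i, i + 1 or None) if _is_int(i) else i for i in indices)
        full = np.ndarray.__getitem__(self, keep)
        # Axes that came from an integer index are the ones to remove.
        dropped = tuple(ax for ax, i in enumerate(indices) if _is_int(i))
        out = full.squeeze(axis=dropped) if dropped else full
        out.axeslabels = tuple(
            label for ax, label in enumerate(self.axeslabels) if ax not in dropped
        )
        return out


data = Data(np.zeros((4, 8, 8, 8, 3)), ("channels", "z", "y", "x", "rgb"))
sub = data[2, 2]
print(sub.shape, sub.axeslabels)   # (8, 8, 3) ('y', 'x', 'rgb')
```

Because only the axes that were indexed with an integer are squeezed, an axis that genuinely has length 1 keeps its label, which a test like dim > 1 would not guarantee.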
Another approach would be if there is some function in the numpy internals that
I could use to get the needed information before calling the ndarray's
__getitem__ function:
    class Data(np.ndarray):
        def __getitem__(self, indices):
            # uniqueIndicesPerDimension is hypothetical; it would return the
            # number of indices selected along each axis
            unique = np.uniqueIndicesPerDimension(indices)
            data = np.ndarray.__getitem__(self, indices)
            data.axeslabels = [label for label, dim in
                               zip(self.axeslabels, unique) if dim > 1]
            return data
Finally, I could implement my own parser for the passed indices to figure this
out myself. This would be bad since I'd have to recreate a lot of the same
code that must go on inside numpy, and it would be slower, error-prone, etc.:
    class Data(np.ndarray):
        def __getitem__(self, indices):
            indices = self.uniqueDimensionIndices(indices)
            data = np.ndarray.__getitem__(self, indices)
            data.axeslabels = [label for label, dim in
                               zip(self.axeslabels, indices) if dim > 1]
            return data
        def uniqueDimensionIndices(self, indices):
            if isinstance(indices, int):
                indices = (indices,)
            if isinstance(indices, tuple):
                ....
            elif isinstance(indices, list):
                ...
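To give a sense of what such a parser involves, here is a rough sketch that covers basic indexing only (integers, slices, Ellipsis, and None/np.newaxis). The function name surviving_labels and the new_label parameter are hypothetical, and index lists or arrays are deliberately left out, since those are the hard part:

```python
import numbers


def surviving_labels(labels, indices, new_label=None):
    """Hypothetical helper: labels of the axes left after basic indexing."""
    if not isinstance(indices, tuple):
        indices = (indices,)
    # Number of existing axes consumed by explicit indices (None adds an axis).
    n_explicit = sum(1 for i in indices if i is not None and i is not Ellipsis)
    out = []
    axis = 0
    for i in indices:
        if i is Ellipsis:
            # Ellipsis stands for all axes not indexed explicitly.
            n_fill = len(labels) - n_explicit
            out.extend(labels[axis:axis + n_fill])
            axis += n_fill
        elif i is None:
            out.append(new_label)       # np.newaxis inserts an unlabeled axis
        elif isinstance(i, slice):
            out.append(labels[axis])    # slices keep the axis
            axis += 1
        elif isinstance(i, numbers.Integral) and not isinstance(i, bool):
            axis += 1                   # integers remove the axis
        else:
            raise NotImplementedError("advanced indexing is not covered here")
    # Axes not mentioned at all are kept unchanged.
    out.extend(labels[axis:])
    return tuple(out)


labels = ("channels", "z", "y", "x", "rgb")
print(surviving_labels(labels, (2, 2)))          # ('y', 'x', 'rgb')
print(surviving_labels(labels, (Ellipsis, 0)))   # ('channels', 'z', 'y', 'x')
```

Even this restricted version has to special-case Ellipsis and None, which is why I would rather not reimplement the full indexing rules myself.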
Is there anything in the numpy internals already that would allow me to do #1
or #2? I don't think #3 is a very good option.
Thanks!
_______________________________________________
NumPy-Discussion mailing list
http://mail.scipy.org/mailman/listinfo/numpy-discussion