On 05/03/2012 06:27 AM, Travis Oliphant wrote: > > On May 2, 2012, at 10:03 PM, Stéfan van der Walt wrote: > >> On Wed, May 2, 2012 at 6:25 PM, Travis Oliphant<[email protected]> wrote: >>> The only new principle (which is not strictly new --- but new to NumPy's >>> world-view) is using one (or more) fields of a structured array as >>> "synthetic dimensions" which replace 1 or more of the raw table dimensions. >> >> Ah, thanks--that's the detail I was missing. I wonder if the >> contiguity requirement will hamper us here, though. E.g., I could >> imagine that some tree structure might be more suitable to storing and >> organizing indices, and for large arrays we wouldn't like to make a >> copy for each operation. I guess we can't wait for discontiguous >> arrays to come along, though :) > > Actually, it's better to keep the actual data together as much as possible, I > think, and simulate a tree structure with a layer on top --- i.e. an index. > > Different algorithms will prefer different orderings of the underlying data > just as today different algorithms prefer different striding patterns on the > standard, strided view of a dense array.
Two examples: 1) If you know the distribution of your indices in N dimensions ahead of time (e.g., roughly evenly particles in 3D space), you better fill that volume using a fractal, and use indices along that fractal, so that you get at least some spatial locality in all N dimensions. (This is standard procedure in parallelization codes) 2) For standard dense linear algebra code, it is better to store the data blocked than either contiguous ordering -- to the point that LAPACK/BLAS repacks to a blocked ordering internally on each operation. Dag _______________________________________________ NumPy-Discussion mailing list [email protected] http://mail.scipy.org/mailman/listinfo/numpy-discussion
