Robert Kern wrote:
> Tim Hochberg wrote:
>> Robert Kern wrote:
>>
>>> One possibility is to check if the object is an ndarray (or subclass) and use
>>> .copy() if so; otherwise, use the current implementation and hope that you
>>> didn't pass it a Numeric or numarray array (or some other view-based object).
>>>
>> I think I would invert this test and instead check if the object is a
>> Python list and *not* copy in that case. Otherwise, use copy.copy to
>> copy the object whatever it is. This looks like it would be more robust
>> in that it would work in all sensible case, and just be a tad slower in
>> some of them.
>
> I don't want to assume that the only two sequence types are lists and arrays.
> The problem with using copy.copy() on non-arrays is that it, well, makes copies
> of the elements. The objects in the shuffled sequence are not the same objects
> before and after the shuffling. I consider that to be a violation of the spec.
>
> Views are rare outside of numpy/Numeric/numarray, partially because Guido
> considers them to be evil. I'm beginning to see why.
>
>> Another possible refinement / complication would be to special case 1D
>> arrays so that they run fastish.
>>
>> A third possibility involves rewriting this in this form:
>>
>> indices = arange(len(x))
>> _shuffle_core(indices)  # This just does what current shuffle now does
>> x[:] = take(x, indices, 0)
>
> That's problematic since the elements all turn into numpy scalar objects:
>
> In [1]: from numpy import *
>
> In [2]: a = range(9,-1,-1)
>
> In [3]: idx = arange(len(a))
>
> In [4]: a[:] = take(a, idx, 0)
>
> In [5]: a
> Out[5]: [9, 8, 7, 6, 5, 4, 3, 2, 1, 0]
>
> In [6]: type(a[0])
> Out[6]: <type 'numpy.int32'>
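Something like this untested sketch, then (the function name is mine, and numpy.random.shuffle just stands in for Tim's hypothetical _shuffle_core):

from numpy import arange, asarray, take
from numpy.random import shuffle as _shuffle_core  # stand-in for the real core

def shuffle_keeping_objects(x):
    # Permute an index array instead of x itself, then reorder x in
    # place. Casting to an object array first keeps the original
    # Python objects, avoiding the scalar leakage shown in In [6].
    idx = arange(len(x))
    _shuffle_core(idx)
    x[:] = take(asarray(x, object), idx, 0)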
In short:

a[:] = take(asarray(a, object), idx, 0) ?

This also works correctly with ndarrays, even if I haven't dug into the reason why... all elements will probably be re-cast twice. I think the take method on shuffled indices is basically the right and natural approach for a numpy shuffler.

The example is possibly just another vote against the default behavior of letting numpy scalar types out of arrays that were set up with a "harmless" type:

>>> array([1,2,3],float)
array([ 1.,  2.,  3.])
>>> type(_[0])
<type 'numpy.float64'>

is just ill as it is, I think. In (Guido's) Python, objects should come out of collections with the same type they went in with. Currently numpy scalars just "infect" the whole app almost like a virus (and kill performance, pickles, etc.).

Of course views are essential for an efficient array type, but type-altering possibly is not. For the rare cases in generalized algorithms (I have to think hard to find even one example) where the array interface is needed on the elements (and an array(obj) cast is too uncomfortable), there would still be the alternative possibility:

>>> array([1,2,3],numpy.float64)

Then it is natural that numpy.float64, numpy.int32, ... come out, as the programmer would expect. Thus maybe for array types:

* float != numpy.float64 (though maybe with a common base class, or 'float' itself)
* int != numpy.intXX
* complex != numpy.complex128
* the default array type is (Python) float
* the default array type from a list of ints is (Python) int
* the default array type from a list of complex is (Python) complex
* the default array type of other lists is always <object>

Currently this is also problematic:

>>> array([1,2,"3",[]])
array(['1', '2', '3', '[]'], dtype='|S4')

and even

>>> array([1,2,"3ef",'wefwfewoiwjefo iwjef'])
array(['1', '2', '3ef', 'wefwfewoiwjefo iwjef'], dtype='|S20')
>>> _[0]='woeifjwo woie pwioef wliuefh lwieufh wleifuh welfiu '
>>> _
array(['woeifjwo woie pwioef', '2', '3ef', 'wefwfewoiwjefo iwjef'],
      dtype='|S20')

is rarely what a Pythoneer would expect: the assigned string is silently truncated to the field width. I'd guess fixed-size string arrays should only be created explicitly.

Robert
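P.S. A rough interpreter sketch of what the proposed <object> default would give, spelled with today's explicit object dtype (output details may vary between versions):

>>> a = array([1, 2, "3ef", 'wefwfewoiwjefo iwjef'], object)
>>> a[0] = 'woeifjwo woie pwioef wliuefh lwieufh wleifuh welfiu '
>>> a[0]
'woeifjwo woie pwioef wliuefh lwieufh wleifuh welfiu '
>>> type(a[1])
<type 'int'>

No silent truncation, and the elements keep their Python types.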