On 2014-11-04 19:44, Charles R Harris wrote: > On Tue, Nov 4, 2014 at 11:19 AM, Sebastian <se...@sebix.at> wrote: > >> On 2014-11-04 15:06, Todd wrote: >>> On Tue, Nov 4, 2014 at 2:50 PM, Sebastian Wagner <se...@sebix.at >> >>> <mailto:se...@sebix.at>> wrote: >>> >>> Hello, >>> >>> I want to bring up Issue #2522 'numpy.diff fails on unsigned >> integers >>> (Trac #1929)' [1], as it was resonsible for an error in one >> of our >>> programs. Short explanation of the bug: np.diff performs a >> subtraction >>> on the input array. If this is of type uint and the data >> contains >>> falling data, it results in an artihmetic underflow. >>> >>> >>> np.diff(np.array([0,1,0], dtype=np.uint8)) >>> array([ 1, 255], dtype=uint8) >>> >>> @charris proposed either >>> - a note to the doc string and maybe an example to clarify >> things >>> - or raise a warning >>> but with a discussion on the list. >>> >>> I would like to start it now, as it is an error which is not >> easily >>> detectable (no errors or warnings are thrown). In our case >> the >>> type of a >>> data sequence, with only zeros and ones, had type f8 as also >> every >>> other >>> one, has been changed to u4. As the programs looked for >> values ==1 and >>> ==-1, it broke silently. >>> In my opinion, a note in the docs is not enough and does not >> help >>> if the >>> type changed or set after the program has been written. >>> I'd go for automatic upcasting of uints by default and an >> option >>> to turn >>> it off, if this behavior is explicitly wanted. This wouldn't >> be >>> correct >>> from the point of view of a programmer, but as most of the >> users >>> have a >>> scientific background who excpect it 'to work', instead of >> sth is >>> theoretically correct but not convenient. (I count myself to >> the first >>> group) >>> >>> >>> >>> When you say "automatic upcasting", that would be, for example >> uint8 >>> to int16? What about for uint64? There is no int128. >> The upcast should go to the next bigger, otherwise it would again >> result >> in wrong values. uint64 we can't do that, so it has to stay. >>> Also, when you say "by default", is this only when an overflow is >>> detected, or always? >> I don't know how I could detect an overflow in the diff-function. >> In >> subtraction it should be possible, but that's very deep in the >> numpy-internals. >>> How would the option to turn it off be implemented? An argument >> to >>> np.diff or some sort of global option? >> I thought of a parameter upcast_int=True for the function. > > Could check for non-decreasing sequence in the unsigned case. Note > that differences of signed integers can also overflow. One way to > check in general is to determine the expected sign using comparisons.
I think you mean a decreasing/non-increasing instead of non-decreasing sequence? It's also the same check as checking for a sorted sequence. But I currently don't know how I could do that efficiently without np.diff in Python, in Cython it should be easily possible. np.gradient has the same problem: >>> np.random.seed(89) >>> d = np.random.randint(0,2,size=10).astype(np.uint8); d array([1, 0, 0, 1, 0, 1, 1, 0, 0, 0], dtype=uint8) >>> np.diff(d) array([255, 0, 1, 255, 1, 0, 255, 0, 0], dtype=uint8) >>> np.gradient(d) array([ 255. , 127.5, 0.5, 0. , 0. , 0.5, 127.5, 127.5, 0. , 0. ]) --- gpg --keyserver keys.gnupg.net --recv-key DC9B463B _______________________________________________ NumPy-Discussion mailing list NumPy-Discussion@scipy.org http://mail.scipy.org/mailman/listinfo/numpy-discussion