On Fri, Aug 19, 2011 at 3:05 PM, Mark Wiebe <mwwi...@gmail.com> wrote:
> On Fri, Aug 19, 2011 at 11:44 AM, Charles R Harris
> <charlesr.har...@gmail.com> wrote:
>>
>>
>> On Fri, Aug 19, 2011 at 12:37 PM, Bruce Southey <bsout...@gmail.com>
>> wrote:
>>>
>>> Hi,
>>> Just some immediate minor observations that are really about trying to
>>> be consistent:
>>>
>>> 1) Could you keep the display of the NA dtype the same as the array?
>>> For example, NA dtype is displayed as '<f8' but should be displayed as
>>> 'float64' as that is the array dtype.
>>> >>> a=np.array([[1,2,3,np.NA], [3,4,np.nan,5]])
>>> >>> a
>>> array([[ 1., 2., 3., NA],
>>>        [ 3., 4., nan, 5.]])
>>> >>> a.dtype
>>> dtype('float64')
>>> >>> a.sum()
>>> NA(dtype='<f8')
>>>
>>> 2) Can the 'skipna' flag be added to the methods?
>>> >>> a.sum(skipna=True)
>>> Traceback (most recent call last):
>>>   File "<stdin>", line 1, in <module>
>>> TypeError: 'skipna' is an invalid keyword argument for this function
>>> >>> np.sum(a,skipna=True)
>>> nan
>>>
>>> 3) Can the skipna flag be extended to exclude other non-finite cases like
>>> NaN?
>>>
>>> 4) Assigning a np.NA needs a better error message but the Integer
>>> array case is more informative:
>>> >>> b=np.array([1,2,3,4], dtype=np.float128)
>>> >>> b[0]=np.NA
>>> Traceback (most recent call last):
>>>   File "<stdin>", line 1, in <module>
>>> TypeError: float() argument must be a string or a number
>>>
>>> >>> j=np.array([1,2,3])
>>> >>> j
>>> array([1, 2, 3])
>>> >>> j[0]=ina
>>> Traceback (most recent call last):
>>>   File "<stdin>", line 1, in <module>
>>> TypeError: int() argument must be a string or a number, not
>>> 'numpy.NAType'
>>>
>>> But it is nice that np.NA 'adjusts' to the insertion array:
>>> >>> b.flags.maskna = True
>>> >>> ana
>>> NA(dtype='<f8')
>>> >>> b[0]=ana
>>> >>> b[0]
>>> NA(dtype='<f16')
>>>
>>> 5) Different display depending on masked state. That is, I think that
>>> 'maskna=True' should be displayed always when flags.maskna is True:
>>> >>> j=np.array([1,2,3], dtype=np.int8)
>>> >>> j
>>> array([1, 2, 3], dtype=int8)
>>> >>> j.flags.maskna=True
>>> >>> j
>>> array([1, 2, 3], maskna=True, dtype=int8)
>>> >>> j[0]=np.NA
>>> >>> j
>>> array([NA, 2, 3], dtype=int8) # I think it should still display
>>> 'maskna=True'.
>>>
>>
>> My main peeve is that NA is upper case ;) I suppose that could use some
>> discussion.
>
> There is some proliferation of cases in the NaN case:
> >>> np.nan
> nan
> >>> np.NAN
> nan
> >>> np.NaN
> nan
> The pros I see for NA over na are:
> * less confusion of NA vs nan (should this carry over to the np.isna
> function, should it be np.isNA according to this point?)
> * more comfortable for switching between NumPy and R when people have to use
> both at the same time
> The main con is:
> * Inconsistent with current nan, inf printing. Here's a hackish workaround:
> >>> np.na = np.NA
> >>> np.set_printoptions(nastr='na')
> >>> np.array([np.na, 2.0])
> array([na, 2.])
> What's your list of pros and cons?
> -Mark
>
>>
>> Chuck
>>
In part I sort of like having NA and nan, since the different case avoids
problems when poor eyesight, typing, or editing drops the last 'n'.

Regarding nan/NAN, do you mean something like my ticket 1051?
http://projects.scipy.org/numpy/ticket/1051

I do not care that much about the case (mixed case is not good), provided
that there is only one way to specify these.

Also, should np.isfinite() return False for np.NA?
>>> np.isfinite([1,2,np.NA,4])
array([ True, True, NA, True], dtype=bool)

Anyhow, many thanks for the replies to my observations and your amazing
effort in getting this done.

Bruce
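
The skipna and isfinite questions raised in this thread can be made concrete with a small pure-Python sketch. It does not use NumPy or the np.NA machinery under discussion. The names NAType, NA, na_sum, na_isfinite, and the skipnan and na_as_false flags are all hypothetical, introduced only to show the possible semantics: NA propagating through a sum, skipna leaving NaN in place (point 3 above), and the two candidate answers for a finiteness test on NA.

```python
import math


class NAType:
    """Hypothetical missing-value sentinel (a stand-in, not NumPy's np.NA)."""

    def __repr__(self):
        return "NA"


NA = NAType()  # one shared sentinel, always compared with `is`


def na_sum(values, skipna=False, skipnan=False):
    """Sum values; NA propagates unless skipna=True.

    skipnan is an assumed extra flag sketching the idea of also
    excluding NaN from the reduction.
    """
    total = 0.0
    for v in values:
        if v is NA:
            if skipna:
                continue
            return NA  # a single missing value makes the whole sum missing
        if skipnan and math.isnan(v):
            continue
        total += v
    return total


def na_isfinite(values, na_as_false=False):
    """Elementwise finiteness test with two possible treatments of NA."""
    out = []
    for v in values:
        if v is NA:
            # Either propagate NA or report it as "not finite".
            out.append(False if na_as_false else NA)
        else:
            out.append(math.isfinite(v))
    return out


data = [1.0, 2.0, NA, float("nan"), 4.0]
print(na_sum(data))                             # NA
print(na_sum(data, skipna=True))                # nan
print(na_sum(data, skipna=True, skipnan=True))  # 7.0
print(na_isfinite([1, 2, NA, 4]))               # [True, True, NA, True]
print(na_isfinite([1, 2, NA, 4], na_as_false=True))
```

With skipna alone the result is still nan, which matches the np.sum(a, skipna=True) output quoted above: skipping NA does not remove a NaN that is already in the data.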