Hi Ralf,

> So I think the relevant choices are:
> 1. Change nothing to the current status quo (and possibly direct end users
>    who need more than what we offer now to `marray`)
> 2. Add a keyword to reductions
> 3. Add a single factory function that turns regular reductions into
>    nan-aware ones (as in
>    https://github.com/data-apis/array-api/issues/621#issuecomment-1553481118)
>
> I think (1) is also a very reasonable outcome if we don't like any of the
> alternatives.

I am fine with (1), continue to dislike (2), and like (3).

On (1) [status quo], you mentioned that nanptp was rejected earlier as a
new addition to nanfunctions. If this was because we didn't want to expand
the main numpy namespace (reasonable!), might a sub-option be to allow
expansion in nanfunctions for any regular function in the numpy namespace,
but only expose them in nanfunctions itself? An advantage would be that,
effectively, those who like to omit NaN could just do
"import numpy.lib.nanfunctions as np". Of course, at that point perhaps one
should just bite the bullet and move nanfunctions out to its own package...

On (2) [keyword argument], I continue to dislike the idea of adding new
keyword arguments for the ufunc reductions -- ufuncs are one of the few
bits of numpy API that are really nicely clean and consistent between many
functions. We have been very careful about extending it, and keeping it
light. They already allow `np.sum(data, where=~np.isnan(data))`, so it is
not obvious why we would add another option to do the same thing.
Obviously, one could argue that np.sum != np.add.reduce, so their
signatures can diverge, but I'd personally like to move in the opposite
direction (if only for speed for small arrays).

On (3) [factory function], I think a side benefit is that it is the
lightest possible way to make useful what is required anyway: creating
wrappers/implementations for functions not yet covered by nanfunctions. My
suggestion of a nan-as-omit Array API compatible wrapper class would need
them, and so would extending nanfunctions to cover more cases. Indeed, it
would even help the keyword-argument case as it would provide working
implementations.

Let me also mention again another option, of a wrapper data type which
translates floats with NaN to floats with NaN replaced by an appropriate
constant (the identity of the reduction by default). To opt in, one would
do something like,

    function(array.astype(NaNOmittingFloat), ...)

But really one could initialize arrays like that and just keep working
with them. Of course, this would rely completely on Sebastian's custom
dtype mechanism, which has already proven its worth in StringDType, but
which would likely not be recognized by other array classes. For that, a
custom array class would be best (though given marray that may actually
not be much work at all -- just need to have the mask always inferred
instead of kept as a separate array).

All the best,

Marten

p.s. I liked the little summary of what other languages do in
https://github.com/data-apis/array-api/issues/621#issuecomment-1569485778
Julia's seemed a nice functional approach -- it seems a very interesting
language in general, from which it is probably worth getting more ideas...
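
To make options (2) and (3) a bit more concrete, here is a minimal sketch of what a factory function could look like if it were built on the existing `where=` argument. The name `nan_omitting` is hypothetical and is not an existing numpy function, and this is not necessarily the design proposed in the linked issue. It assumes the wrapped reduction accepts `where=`, as `np.sum`, `np.mean` and `np.max` do.

```python
import numpy as np


def nan_omitting(reduction):
    """Hypothetical factory: return a version of `reduction` that skips NaN.

    Assumes `reduction` accepts a `where=` keyword argument.
    """
    def wrapper(a, *args, **kwargs):
        a = np.asarray(a)
        # Only non-NaN elements take part in the reduction.
        return reduction(a, *args, where=~np.isnan(a), **kwargs)

    return wrapper


# Illustrative names only; they shadow nothing in numpy itself.
nansum = nan_omitting(np.sum)
nanmean = nan_omitting(np.mean)
nanmax = nan_omitting(np.max)

data = np.array([1.0, np.nan, 3.0])
total = nansum(data)      # equivalent to np.sum(data, where=~np.isnan(data))
average = nanmean(data)   # the mean is taken over the non-NaN elements only
# min/max have no identity, so `initial` must be given together with `where`.
largest = nanmax(data, initial=-np.inf)
```

The need to pass `initial` for `np.max` shows where such a thin wrapper stops being enough: reductions without an identity would need their own default, which is part of the implementation work mentioned under (3).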
