On Wed, Mar 5, 2014 at 9:11 PM, Nathaniel Smith <[email protected]> wrote:
> On Mon, Mar 3, 2014 at 7:20 PM, Julian Taylor
> <[email protected]> wrote:
> > hi,
> >
> > as the numpy gsoc topic page is a little short on options I was thinking
> > about adding two topics for interested students. But as I have no
> > experience with gsoc or mentoring and the ideas are not very fleshed out
> > yet I'd like to ask if it might make sense at all:
> >
> > 1. configurable algorithm precision
> [...]
> > with np.precmode(default="fast"):
> >     np.abs(complex_array)
> >
> > or fast everything except sum and hypot
> >
> > with np.precmode(default="fast", sum="kahan", hypot="standard"):
> >     np.sum(d)
> [...]
>
> Not a big fan of this one -- it seems like the biggest bulk of the
> effort would be in figuring out a non-horrible API for exposing these
> things and getting consensus around it, which is not a good fit to the
> SoC structure.
>
> I'm pretty nervous about the datetime proposal that's currently on the
> wiki, for similar reasons -- I'm not sure it's actually doable in the
> SoC context.
>
> > 2. vector math library integration
>
> This is a great suggestion -- clear scope, clear benefit.
>
> Two more ideas:
>
> 3. Using Cython in the numpy core
>
> The numpy core contains tons of complicated C code implementing
> elaborate operations like indexing, casting, ufunc dispatch, etc. It
> would be really nice if we could use Cython to write some of these
> things. However, there is a practical problem: Cython assumes that
> each .pyx file generates a single compiled module with its own
> Cython-defined API. Numpy, however, contains a large number of .c
> files which are all compiled together into a single module, with its
> own home-brewed system for defining the public API. And we can't
> rewrite the whole thing. So for this to be viable, we would need some
> way to compile a bunch of .c *and .pyx* files together into a single
> module, and allow the .c and .pyx files to call each other.
> This might involve changes to Cython, some sort of clever
> post-processing or glue code to get existing cython-generated source
> code to play nicely with the rest of numpy, or something else.
>
> So this project would have the following goals, depending on how
> practical this turns out to be: (1) produce a hacky proof-of-concept
> system for doing the above, (2) turn the hacky proof-of-concept into
> something actually viable for use in real life (possibly this would
> require getting changes upstream into Cython, etc.), (3) use this
> system to actually port some interesting numpy code into cython.

Having to synchronise two projects may be hard for a GSoC, no?
Otherwise, I am a bit worried about Cython being used on the current C
code as is, because the core and the Python C API are so intertwined
(especially multiarray). Maybe one could use Cython on the non-core
numpy parts that are still in C? It is not as sexy a project, though.

> 4. Pythonic dtypes
>
> The current dtype system is klugey. It basically defines its own class
> system, in parallel to Python's, and unsurprisingly, this new class
> system is not as good. In particular, it has limitations around the
> storage of instance-specific data which rule out a large variety of
> interesting user-defined dtypes, and causes us to need some truly
> nasty hacks to support the built-in dtypes we do have. And it makes
> defining a new dtype much more complicated than defining a new Python
> class.
>
> This project would be to implement a new dtype system for numpy, in
> which np.dtype becomes a near-empty base class, different dtypes
> (e.g., float64, float32) are simply different subclasses of np.dtype,
> and dtype objects are simply instances of these classes. Further
> enhancements would be to make it possible to define new dtypes in pure
> Python by subclassing np.dtype and implementing special methods for
> the various dtype operations, and to make it possible for ufunc loops
> to see the dtype objects.
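
To make the shape of idea 4 a bit more concrete, here is a rough
pure-Python sketch of "dtypes as ordinary classes". Everything in it is
hypothetical: DType, Float64, Float32, Quantity and the pack/unpack
methods are illustrative names only, not numpy API and not a proposed
design. It only shows how instance-specific data (here a unit string)
comes for free once dtype objects are plain instances.

```python
import struct


class DType:
    """Hypothetical near-empty base class standing in for np.dtype."""

    itemsize = 0

    def pack(self, value):
        raise NotImplementedError

    def unpack(self, raw):
        raise NotImplementedError


class Float64(DType):
    itemsize = 8

    def pack(self, value):
        # Little-endian IEEE 754 double.
        return struct.pack("<d", value)

    def unpack(self, raw):
        return struct.unpack("<d", raw)[0]


class Float32(DType):
    itemsize = 4

    def pack(self, value):
        # Little-endian IEEE 754 single.
        return struct.pack("<f", value)

    def unpack(self, raw):
        return struct.unpack("<f", raw)[0]


class Quantity(DType):
    """Hypothetical user-defined dtype with per-instance data (a unit)."""

    def __init__(self, unit, storage=None):
        self.unit = unit
        # Delegate the raw storage format to another dtype instance.
        self.storage = storage if storage is not None else Float64()
        self.itemsize = self.storage.itemsize

    def pack(self, value):
        return self.storage.pack(value)

    def unpack(self, raw):
        return (self.storage.unpack(raw), self.unit)


metres = Quantity("m")
raw = metres.pack(1.5)
value, unit = metres.unpack(raw)
```

The hard part of the project is of course everything this leaves out:
casting, ufunc loop lookup, and backwards compatibility with the
existing C-level dtype struct.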
>
> This project would provide the key enabling piece for a wide variety
> of interesting new features: missing value support, better handling of
> strings and categorical data, unit handling, automatic
> differentiation, and probably a bunch more I'm forgetting right now.
>
> If we get someone who's up to handling the dtype thing then I can
> mentor or co-mentor.
>
> What do y'all think?
>
> (I don't think I have access to update that wiki page -- or maybe I'm
> just not clever enough to figure out how -- so it would be helpful if
> someone who can, could?)
>
> --
> Nathaniel J. Smith
> Postdoctoral researcher - Informatics - University of Edinburgh
> http://vorpus.org
> _______________________________________________
> NumPy-Discussion mailing list
> [email protected]
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
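
On idea 1 quoted at the top: the API question is the hard part, but the
mechanics of a scoped precision switch are small enough to prototype
outside numpy. The sketch below is only an assumption about how such a
thing might behave. precmode here is a standalone context manager, not
np.precmode (which does not exist), and current_mode and precise_sum
are made-up helper names. A hand-written loop is used for the plain sum
because recent Python versions already compensate in the built-in
sum().

```python
import contextlib
import contextvars

# Hypothetical registry of per-function precision modes; "default"
# applies to any function without an explicit override.
_modes = contextvars.ContextVar(
    "precision_modes", default={"default": "standard"}
)


@contextlib.contextmanager
def precmode(**overrides):
    """Temporarily override modes, e.g. precmode(default="fast", sum="kahan")."""
    token = _modes.set({**_modes.get(), **overrides})
    try:
        yield
    finally:
        # Restore whatever was active before entering the block.
        _modes.reset(token)


def current_mode(func_name):
    modes = _modes.get()
    return modes.get(func_name, modes["default"])


def _naive_sum(values):
    total = 0.0
    for x in values:
        total += x
    return total


def _kahan_sum(values):
    total = 0.0
    compensation = 0.0  # running estimate of the lost low-order bits
    for x in values:
        y = x - compensation
        t = total + y
        compensation = (t - total) - y
        total = t
    return total


def precise_sum(values):
    # Dispatch on the mode that is active for "sum" right now.
    if current_mode("sum") == "kahan":
        return _kahan_sum(values)
    return _naive_sum(values)


d = [1.0] + [1e-16] * 10
with precmode(default="fast", sum="kahan"):
    compensated = precise_sum(d)  # keeps the small terms
plain = precise_sum(d)  # outside the block: small terms are rounded away
```

Whether nested blocks should merge overrides (as here) or replace them
is exactly the kind of API decision that would need consensus first.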
