On Mon, Sep 1, 2014 at 8:49 AM, Eelco Hoogendoorn
<hoogendoorn.ee...@gmail.com> wrote:
> Sure, I'd like to hash things out, but I would also like some
> preliminary feedback as to whether this is going in a direction anyone else
> sees the point of, if it conflicts with other plans, and indeed if we can
> agree that numpy is the right place for it; a point which I would very much
> like to defend. If there is some obvious no-go that I'm missing, I can do
> without the drudgery of writing proper documentation ;).
>
> As for whether this belongs in numpy: yes, I would say so. There are the
> extensions of functionality to functions already in numpy, which are a
> no-brainer (it need not cost anything performance-wise, and I've needed
> unique graph edges many, many times), and there is the grouping
> functionality, which is the main novelty.
>
> However, note that the grouping functionality itself is a very small
> addition, just a few hundred lines of pure python, given that the indexing
> logic has been factored out of the classic arraysetops. At least from a
> developer's perspective, it very much feels like a logical extension of the
> same 'thing'.
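
To make the two pieces mentioned above concrete (unique graph edges and grouping), here is a minimal sketch using only calls that exist in numpy today. It is not the proposed API: the data are made up, the variable names are illustrative, and the `axis` argument to `np.unique` postdates this thread.

```python
import numpy as np

# Hypothetical data: undirected graph edges as (u, v) pairs, with repeats
# and with some pairs stored in the opposite order.
edges = np.array([[0, 1], [1, 0], [2, 3], [0, 1], [3, 2], [1, 2]])

# Sort within each row so (1, 0) and (0, 1) compare equal,
# then keep one copy of each distinct row.
canonical = np.sort(edges, axis=1)
unique_edges = np.unique(canonical, axis=0)

# Hypothetical data for grouping: one key and one value per record.
keys = np.array(["a", "b", "a", "c", "b", "a"])
values = np.array([1.0, 2.0, 3.0, 4.0, 5.0, 6.0])

# return_inverse maps every record to the index of its key in unique_keys;
# that index array is the "indexing logic" a group-by can be built on.
unique_keys, inverse = np.unique(keys, return_inverse=True)

# Reduce per group: sum the values and count the records sharing each key.
group_sums = np.bincount(inverse, weights=values)
group_counts = np.bincount(inverse)
group_means = group_sums / group_counts

print(unique_edges)
print(dict(zip(unique_keys.tolist(), group_means.tolist())))
```

Even in this small form there are API choices to make (how keys are canonicalized, which reductions are offered, what order groups come back in), which is the kind of decision that tends to need real use to settle.
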
My 2 cents: I definitely agree that this is very useful fundamental
functionality, and it would be great if numpy had a solution for it
out of the box. My main concern is that this is a fairly complicated
set of functionality and there are a lot of small decisions to be made
in setting up the API for it. IME it's very hard to just read through
an API like this and reason out the best way to do it by pure logic;
usually it needs to get banged on for a bit in real uses before it
becomes clear what the right set of trade-offs is. And numpy itself is
not a great environment for these kinds of iterations. So, IMO the main
challenge is: how do we get the functionality into a state where we
can convince ourselves that it'll be supportable in numpy
indefinitely, and not need to be replaced in a year or two?

Some things that might help with this convincing:
- releasing it as a small standalone package on pypi and getting some
  real users to bang on it
- any real code written against the APIs
- feedback from the pandas community since they've spent a lot of time
  working on these issues
- ...?

-n

-- 
Nathaniel J. Smith
Postdoctoral researcher - Informatics - University of Edinburgh
http://vorpus.org