On Thu, Jun 5, 2014 at 2:51 PM, Charles R Harris <[email protected]> wrote:
>
> On Thu, Jun 5, 2014 at 6:40 AM, David Cournapeau <[email protected]> wrote:
>>
>> On Thu, Jun 5, 2014 at 3:36 AM, Charles R Harris <[email protected]> wrote:
>>>
>>> On Wed, Jun 4, 2014 at 7:29 PM, Travis Oliphant <[email protected]> wrote:
>>>>
>>>> Believe me, I'm all for incremental changes if it is actually possible
>>>> and doesn't actually cost more. It's also why I've been silent until now
>>>> about anything we are doing being a candidate for a NumPy 2.0. I
>>>> understand the challenges of getting people to change. But features and
>>>> solid improvements *will* get people to change --- especially if their new
>>>> library can be used along with the old library and the transition can be
>>>> done gradually. Python 3's struggle is the lack of features.
>>>>
>>>> At some point there *will* be a NumPy 2.0. What features go into
>>>> NumPy 2.0, how much backward compatibility is provided, and how much
>>>> porting is needed to move your code from NumPy 1.X to NumPy 2.X is the
>>>> real user question --- not whether it is characterized as "incremental"
>>>> change or "re-write". What I call a re-write and what you call an
>>>> "incremental change" are two points on a spectrum and likely overlap
>>>> significantly if we really compared what we are thinking about.
>>>>
>>>> One huge benefit that came out of the numeric / numarray / numpy
>>>> transition that we mustn't forget about was the extended buffer
>>>> protocol and memoryview objects. These really do allow multiple array
>>>> objects to co-exist, and libraries to use the object that they prefer, in
>>>> a way that did not exist when numarray / numeric / numpy came out. So we
>>>> shouldn't be afraid of that world.
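[The buffer-protocol interop described above can be sketched with the stdlib alone: two independent containers sharing one block of memory with zero copies via memoryview. A NumPy array participates in the same way through PEP 3118; names below are just illustrative.]

```python
# Minimal sketch of buffer-protocol interop: `array.array` owns the
# storage, `memoryview` exposes it, and a second consumer can read and
# write the same memory without copying.
from array import array

data = array('d', [1.0, 2.0, 3.0, 4.0])  # a C-contiguous double buffer
view = memoryview(data)                   # exposes the buffer protocol

half = view[2:]        # zero-copy slice of the underlying buffer
half[0] = 99.0         # writes through to the original storage

print(data.tolist())               # -> [1.0, 2.0, 99.0, 4.0]
print(view.format, view.itemsize)  # -> d 8
```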
>>>> The existence of easy package managers
>>>> to update environments to try out new features, and to have applications
>>>> on a single system that use multiple versions of the same library, is
>>>> also something that didn't exist before and that will make any transition
>>>> easier for users.
>>>>
>>>> One thing I regret about my working on NumPy originally is that I
>>>> didn't have the foresight, skill, and understanding to work more on an
>>>> extended and better-designed multiple-dispatch system so that multiple
>>>> array objects could participate together in an expression flow. The
>>>> __numpy_ufunc__ mechanism gives enough capability in that direction that
>>>> it may be better now.
>>>>
>>>> Ultimately, I don't disagree that NumPy can continue to exist in
>>>> "incremental" change mode (though if you are swapping out whole swaths of
>>>> C code for Cython code --- it sounds a lot like a "re-write") as long as
>>>> there are people willing to put the effort into changing it. I think this
>>>> is actually helped by the existence of other array objects that are
>>>> pushing the feature envelope without the constraints --- in much the same
>>>> way that the Python standard library benefits from many versions of
>>>> different capabilities being tried out before moving into the standard
>>>> library.
>>>>
>>>> I remain optimistic that things will continue to improve in multiple
>>>> ways --- if a little "messier" than any of us would conceive individually.
>>>> It *is* great to see all the PRs coming from multiple people on NumPy
>>>> and all the new energy around improving things, whether great or small.
>>>
>>> @nathaniel IIRC, one of the objections to the missing-values work was
>>> that it changed the underlying array object by adding a couple of
>>> variables to the structure. I'm willing to do that sort of thing, but it
>>> would be good to have general agreement that that is acceptable.
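[The dispatch idea behind __numpy_ufunc__ (which later shipped as __array_ufunc__ in NumPy 1.13) can be sketched as a toy in pure Python: before a ufunc-like function computes, it asks each operand whether it wants to handle the call itself, which is what lets foreign array objects participate in an expression flow. All names below are hypothetical, not NumPy's actual API.]

```python
import operator

def toy_ufunc(name, op, *operands):
    """Apply `op` elementwise, but first offer the call to any operand
    that implements the (made-up) __toy_ufunc__ dispatch hook."""
    for i, operand in enumerate(operands):
        hook = getattr(type(operand), '__toy_ufunc__', None)
        if hook is not None:
            # Operand takes over the whole call, mimicking the protocol.
            return hook(operand, name, op, i, operands)
    # Fallback: plain elementwise application on list-like inputs.
    return [op(*vals) for vals in zip(*operands)]

class LoggedArray:
    """A 'foreign' array type that intercepts ufunc-style calls."""
    def __init__(self, data):
        self.data = list(data)
    def __toy_ufunc__(self, name, op, index, operands):
        plain = [o.data if isinstance(o, LoggedArray) else o
                 for o in operands]
        return LoggedArray(op(*vals) for vals in zip(*plain))

result = toy_ufunc("add", operator.add, LoggedArray([1, 2]), [10, 20])
print(result.data)  # -> [11, 22]
```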
>>
>> I think changing the ABI for some versions of numpy (2.0, whatever) is
>> acceptable. There is little doubt that the ABI will need to change to
>> accommodate a better and more flexible architecture.
>>
>> Changing the C API is trickier: I am not up to date on the changes
>> from the last 2-3 years, but at that time most things could have been
>> changed internally without breaking much, though I did not go far enough
>> to estimate what the performance impact could be (if any).
>>
>>> As to blaze/dynd, I'd like to steal bits here and there, and maybe in
>>> the long term base numpy on top of it with a compatibility layer. There
>>> is a lot of thought and effort that has gone into those projects and we
>>> should use what we can. As is, I think numpy is good for another five to
>>> ten years and will probably hang on for fifteen, but it will be outdated
>>> by the end of that period. Like great whites, we need to keep swimming
>>> just to have oxygen. Software projects tend to be obligate ram
>>> ventilators.
>>>
>>> The Python 3 experience is definitely something we want to avoid. And
>>> while blaze does big data and offers some nice features, I don't know
>>> that it offers compelling reasons for the more ordinary user to upgrade
>>> at this time, so I'd like to sort of slip it into numpy if possible.
>>>
>>> If we do start moving numpy forward in more radical steps, we should
>>> try to have some agreement beforehand as to what sort of changes are
>>> acceptable. For instance, to maintain backward compatibility, is it
>>> sufficient that a recompile will do the job, or do we require forward
>>> compatibility for extensions compiled against earlier releases? Do we
>>> stay with C, or should we support C++ code with its advantages of smart
>>> pointers, exception handling, and templates? We will need a certain
>>> amount of flexibility going forward, and we should decide, or at least
>>> discuss, such issues up front.
>>
>> Last time the C++ discussion was brought up, no consensus could be
>> reached. I think quite a few radical changes can be made without that
>> consensus already, though others may disagree there.
>>
>> IMO, what is needed most is refactoring the internals to extract the
>> low-level code from the Python C API, as I think that's the main
>> bottleneck to getting more contributors (or getting new core features
>> more quickly).
>>
>
> What do you mean by "extract the Python C API"?
>

Poor choice of words: I meant extracting the lower-level part of
array/ufunc/etc. from its wrapping in the Python C API (with the idea that
the latter could be done in Cython, modulo improvements in Cython to manage
the binary/code-size explosion). IOW, split numpy into core and core-py (I
think dynd benefits a lot from that, on top of its feature set).

David

> Chuck
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion
