On Sat, May 19, 2012 at 10:00 AM, David Cournapeau <courn...@gmail.com> wrote:

> On Sat, May 19, 2012 at 3:17 PM, Charles R Harris <charlesr.har...@gmail.com> wrote:
>
>> On Fri, May 18, 2012 at 3:47 PM, Travis Oliphant <tra...@continuum.io> wrote:
>>
>>> Hey all,
>>>
>>> After reading all the discussion around masked arrays and getting input
>>> from as many people as possible, it is clear that there is still
>>> disagreement about what to do, but there have been some fruitful
>>> discussions that ensued.
>>>
>>> This isn't really new as there was significant disagreement about what
>>> to do when the masked array code was initially checked in to master. So,
>>> in order to move forward, Mark and I are going to work together with
>>> whomever else is willing to help with an effort that is in the spirit of my
>>> third proposal but has a few adjustments.
>>>
>>> The idea will be fleshed out in more detail as it progresses, but the
>>> basic concept is to create an (experimental) ndmasked object in NumPy 1.7
>>> and leave the actual ndarray object unchanged. While the details need to
>>> be worked out here, a goal is to have the C-API work with both ndmasked
>>> arrays and arrayobjects (possibly by defining a base-class C-level
>>> structure that both ndarrays inherit from). This might also be a good
>>> way for Dag to experiment with his ideas as well but that is not an
>>> explicit goal.
>>>
>>> One way this could work, for example is to have PyArrayObject * be the
>>> base-class array (essentially the same C-structure we have now with a
>>> HASMASK flag). Then, the ndmasked object could inherit from PyArrayObject *
>>> as well but add more members to the C-structure. I think this is the
>>> easiest thing to do and requires the least amount of code-change. It
>>> is also possible to define an abstract base-class PyArrayObject * that both
>>> ndarray and ndmasked inherit from. That way ndarray and ndmasked are
>>> siblings even though the ndarray would essentially *be* the PyArrayObject *
>>> --- just with a different type-hierarchy on the python side.
>>>
>>> This work will take some time and, therefore, I don't expect 1.7 to be
>>> released prior to SciPy Austin with an end of June target date. The
>>> timing will largely depend on what time is available from people interested
>>> in resolving the situation. Mark and I will have some availability for
>>> this work in June but not a great deal (about 2 man-weeks total between
>>> us). If there are others who can step in and help, it will help
>>> accelerate the process.
>>>
>>
>> This will be a difficult thing for others to help with since the concept
>> is vague, the design decisions seem to be in your and Mark's hands, and you
>> say you don't have much time. It looks to me like 1.7 will keep slipping
>> and I don't think that is a good thing. Why not go for option 2, which will
>> get 1.7 out there and push the new masked array work in to 1.8? Breaking
>> the flow of development and release has consequences, few of them good.
>>
>
> Agreed. 1.6.0 was released one year ago already, let's focus on polishing
> what's in there *now*. I have not followed closely what the decision was
> for a LTS release, but if 1.7 is supposed to be it, that's another argument
> about changing anything there for 1.7.
>

The motivation behind splitting the mask out into a separate ndmasked is
primarily so that pre-existing code will not silently function on NA-masked
arrays and produce incorrect results. This centres around using PyArray_DATA
to get at the data after manually checking flags, instead of calling
PyArray_FromAny. Maybe a reasonable solution is to tweak the behavior of
PyArray_DATA? It could work as follows:

- If an ndarray has no mask, PyArray_DATA returns the data pointer as it
  does currently.
- If the ndarray has an NA-mask, PyArray_DATA sets an exception and returns
  NULL
- Create a new accessor, PyArray_DATAPTR or PyArray_RAWDATA, which returns
  the array data under all circumstances.

This way, code which currently uses the data pointer through PyArray_DATA
will fail instead of silently working with the wrong interpretation of the
data. What do people feel about this idea?

-Mark

> David
>
> _______________________________________________
> NumPy-Discussion mailing list
> NumPy-Discussion@scipy.org
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
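
To make the three rules for PyArray_DATA above concrete, here is a rough,
self-contained sketch in plain C. It deliberately does not use the real NumPy
headers: MockArray, MOCK_HASMASK, mock_error, mock_array_data and
mock_array_rawdata are all hypothetical stand-ins invented for illustration,
and the global error string only imitates what setting a Python exception
would do in the actual C-API.

```c
#include <stddef.h>

/* Hypothetical stand-ins for illustration; not the real NumPy definitions. */
#define MOCK_HASMASK 0x1u

typedef struct {
    char *data;          /* raw data buffer */
    unsigned int flags;  /* MOCK_HASMASK is set when an NA-mask is attached */
} MockArray;

/* Imitates the Python error indicator; real code would set an exception. */
const char *mock_error = NULL;

/* Sketch of the proposed PyArray_DATA behavior:
 * unmasked arrays return the data pointer as before,
 * NA-masked arrays record an error and return NULL. */
char *mock_array_data(const MockArray *arr)
{
    if (arr->flags & MOCK_HASMASK) {
        mock_error = "array has an NA-mask; use the raw accessor instead";
        return NULL;
    }
    return arr->data;
}

/* Sketch of the proposed new accessor (PyArray_DATAPTR or PyArray_RAWDATA
 * in the mail): returns the data pointer under all circumstances. */
char *mock_array_rawdata(const MockArray *arr)
{
    return arr->data;
}
```

Under these assumptions, old code that calls the checked accessor on a masked
array gets NULL back and has to handle the failure, while mask-aware code opts
in explicitly by calling the raw accessor.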