On Fri, 2022-08-19 at 23:56 +, lpsm...@uw.edu wrote:
> Thanks for the information! I've had to work on other project in the
> meantime, but was able to get back to this again.
>
> In an effort to wrap my head around the project's code, I realized
> that I did not have a line like:
>
> #define NPY_NO_DEPRECATED_API NPY_1_7_API_VERSION
>
> in it. So, I added the line, fixed the errors the resulted, and
> recompiled. And immediately got segmentation faults.
>
With it, the fields are hidden completely which is the intention. But
that also means the size is wrong for subclassing. You would have to
use `PyArrayObject_fields` although that basically circumvents the
deprecation, it somehwat makes sense, you should just only use it in
that one place I guess (not for actual access to strides).
Overall, I am not sure if this will ever help us much, but the solution
seems simple here. There should be no fundamental changes with the
exception of the size of `PyArrayObject_fields`.
- Sebastian
> Investigating, I discovered that the problem was that the pointers to
> my new member variables change mysteriously to invalid values. The
> basic flow is:
>
> 1) I call PyArray_New with a pointer to my new NamedArrayObject
> subclass.
> 2) It does stuff
> 3) It calls the NamedArrayObject_alloc function we wrote, which
> creates the new 'rowNames' PyList.
> 4) It does more stuff
> 5) It calls the NamedArrayObject_Finalize function we wrote. The
> 'rowNames' object now has a different, and incorrect, pointer, and at
> this point if I ever call PyList_Size(self->rowNames), I get a
> segmentation fault.
>
> If, however, I re-comment out the '#define NPY_NO_DEPRECATED_API
> NPY_1_7_API_VERSION" line, everything works fine! Part 4 above
> doesn't change the 'rowNames' pointer to anything, and it remains
> valid for the rest of the program.
>
> (If I change 1_7_ to 1_23_, I get identical behavior.)
>
> Normally, I'd be happy to just say 'well, I guess I have to use the
> deprecated API', but this issue combined with the fact that I'm
> getting the "C subclassed NumPy array, void scalar, or allocated in a
> non-standard way" error message, makes me think that something has
> changed in a fundamental way since the code was first written in
> 2014, and that I should do things the new way. I just... don't know
> what that new way might be ;-)
>
> The upshot is that I have two questions:
> 1) Does this look like a bug I should file?
> 2) What areas should I start looking into to change this old code to
> work with modern NumPy?
>
> and possibly:
>
> 3) Is there a worked example of a C-extended NumPy array somewhere I
> could steal?
>
> The full code is at
>
> https://github.com/sys-bio/roadrunner/blob/develop/wrappers/Python/roadrunner/PyUtils.cpp
>
> if that helps anyone. The branch where I'm trying to change things
> is at
>
> https://github.com/sys-bio/roadrunner/blob/update-numpy/wrappers/Python/roadrunner/PyUtils.cpp
>
> which has a bunch of print statements added, since I was working
> without a debugger on Windows. The relevant output for a simple
> Python script that calls this was (for the broken #define version):
> ```
> NA_N 0
> Debug: PyObject* rr::NamedArrayObject_alloc(PyTypeObject*,
> Py_ssize_t)
> rownames new ref 0x77362180
> rownames added to object 0x7747e750
> rownames size 0
> rownames ref 0x77362180
> rownames size 0
> Debug: namedArrayObject allocated: 0x7747e750
> Debug: namedArrayObject returned obj: 0x7747e750
> Debug: Done
>
> Debug: PyObject* rr::NamedArrayObject_Finalize(rr::NamedArrayObject*,
> PyObject*)
> rownames ref 0x55f4f830
> Debug: finalizing object self: 0x7747e750; args 0x55aca3e0
> rownames ref 0x55f4f830
> Debug: NamedArrayObject initialized from constructor. 'None' path
> taken
> Debug: PyObject*
> rr::NamedArrayObject_Finalize_FromConstructor(rr::NamedArrayObject*)
> rownames ref 0x55f4f830
> rownames ref 0x55f4f830
> Debug: Done
> ```
>
> and for the working non-#define version:
>
> ```
> NA_N 0
> Debug: PyObject* rr::NamedArrayObject_alloc(PyTypeObject*,
> Py_ssize_t)
> rownames new ref 0x77362140
> rownames added to object 0x7752e1e0
> rownames size 0
> rownames ref 0x77362140
> rownames size 0
> Debug: namedArrayObject allocated: 0x7752e1e0
> Debug: namedArrayObject returned obj: 0x7752e1e0
> Debug: Done
>
> Debug: PyObject* rr::NamedArrayObject_Finalize(rr::NamedArrayObject*,
> PyObject*)
> rownames ref 0x77362140
> Debug: finalizing object self: 0x7752e1e0; args 0x55aca3e0
> rownames ref 0x77362140
> Debug: NamedArrayObject initialized from constructor. 'None' path
> taken
> Debug: PyObject*
> rr::NamedArrayObject_Finalize_FromConstructor(rr::NamedArrayObject*)
> rownames ref 0x77362140
> rownames ref 0x77362140
> Debug: Done
> ```
>
> Thanks for bearing with me though this long message! And particular
> thanks to Sebastien for answering my initial questi