[Numpy-discussion] Re: Tricky ufunc implementation question
Do the JVM-based Pythons solve any threading issues? Plain parallel Java seems indispensable.

Bill
--
Phobrain.com

On 2025-07-03 05:05, Nathan via NumPy-Discussion wrote:
> If a NumPy array is shared between two threads, NumPy doesn't do anything to
> synchronize array access. This is true in all Python versions and build
> configurations - since NumPy releases the GIL during most array operations,
> whether or not you're using free-threaded Python doesn't change much except
> for e.g. object arrays, which do hold the GIL.
>
> See:
> https://numpy.org/doc/stable/reference/thread_safety.html
>
> IMO you probably shouldn't try to enforce more strict thread safety than
> NumPy itself does.
>
> We didn't add any locking to support free-threaded Python because it's always
> worked like this, and introducing locking might lead to performance
> bottlenecks in read-only multithreaded applications and would substantially
> increase NumPy's internal complexity.
>
> Long-term, I'd like to see more effort put towards adding stronger guarantees
> around freezing arrays. I also want to look closer at adding runtime checks
> to detect races and report them. One example: you could imagine each array
> having an internal "version" counter that is incremented every time the array
> is mutated. Doing an atomic read on the version before and after a mutation
> should hopefully have a small overhead compared with the rest of NumPy, and
> we could report runtime errors when arrays are mutated "underneath" a thread
> doing an operation.
>
> The devil is in the details though - there are *a lot* of ways to mutate
> NumPy arrays. This also doesn't consider the buffer protocol or accessing
> arrays via third-party C extensions. See e.g. Alex Gaynor's blog post on this
> from the perspective of Rust and PyO3:
>
> https://alexgaynor.net/2022/oct/23/buffers-on-the-edge/
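
The "version" counter idea in Nathan's quoted message can be sketched in pure Python. This is only an illustration under stated assumptions, not anything NumPy provides: `VersionedArray`, `mutate`, and `checked_read` are hypothetical names, a `threading.Lock` stands in for an atomic counter, and only mutations routed through `mutate` are seen. Writes through views, the buffer protocol, or C extensions bypass it entirely, which is exactly the "devil in the details" the message warns about.

```python
import threading

import numpy as np


class VersionedArray:
    """Hypothetical wrapper pairing an ndarray with a mutation counter."""

    def __init__(self, data):
        self._data = np.asarray(data)
        self._version = 0
        # The lock guards the counter only; it does NOT serialize data access.
        self._lock = threading.Lock()

    @property
    def version(self):
        with self._lock:
            return self._version

    def mutate(self, func):
        """Bump the counter, then apply an in-place update to the array."""
        with self._lock:
            self._version += 1
        func(self._data)

    def checked_read(self, func):
        """Run func(data) and raise if the counter moved in the meantime."""
        before = self.version
        result = func(self._data)
        if self.version != before:
            raise RuntimeError("array was mutated during the operation")
        return result


va = VersionedArray(np.zeros(4))
total = va.checked_read(np.sum)      # no mutation in flight, so this succeeds
va.mutate(lambda a: a.fill(1.0))     # counter goes from 0 to 1
```

A read that starts after the counter has been bumped but before the write finishes is not detected by this sketch, so it reports some races rather than all of them.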
[Numpy-discussion] Re: Tricky ufunc implementation question
If a NumPy array is shared between two threads, NumPy doesn’t do anything to synchronize array access. This is true in all Python versions and build configurations - since NumPy releases the GIL during most array operations, whether or not you’re using free-threaded Python doesn’t change much except for e.g. object arrays, which do hold the GIL.

See:
https://numpy.org/doc/stable/reference/thread_safety.html

IMO you probably shouldn’t try to enforce more strict thread safety than NumPy itself does.

We didn’t add any locking to support free-threaded Python because it’s always worked like this, and introducing locking might lead to performance bottlenecks in read-only multithreaded applications and would substantially increase NumPy’s internal complexity.

Long-term, I’d like to see more effort put towards adding stronger guarantees around freezing arrays. I also want to look closer at adding runtime checks to detect races and report them. One example: you could imagine each array having an internal “version” counter that is incremented every time the array is mutated. Doing an atomic read on the version before and after a mutation should hopefully have a small overhead compared with the rest of NumPy, and we could report runtime errors when arrays are mutated “underneath” a thread doing an operation.

The devil is in the details though - there are *a lot* of ways to mutate NumPy arrays. This also doesn’t consider the buffer protocol or accessing arrays via third-party C extensions. See e.g. Alex Gaynor’s blog post on this from the perspective of Rust and PyO3:

https://alexgaynor.net/2022/oct/23/buffers-on-the-edge/

On Thu, Jul 3, 2025 at 5:50 AM Benjamin Root via NumPy-Discussion <numpy-discussion@python.org> wrote:

> On a related note, does numpy's gufunc mechanism provide any thread
> safety, or is the responsibility on the extension writer to do that?
> For simple numpy array inputs, I would think that I don't have to worry about
> free-threaded Python messing things up (unless I have a global state), but I'm
> wondering if something like dask array inputs could mess up calls to a
> thread-unsafe function.
>
> If it is on the extension writer, are there any examples on how to do
> that? Are there other guarantees (or lack thereof) that a gufunc writer
> should be aware of? How about reorderability? gufuncs operate on
> subarrays, so wouldn't dask inputs that are chunked potentially operate on
> the chunks in any order they like?
>
> Thanks,
> Ben Root
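
On the quoted question about calling a thread-unsafe function: one common pattern, sketched here at the Python level, is to serialize access to the shared state with a lock and keep the per-element computation independent of call order. Everything named below (`unsafe_core`, `safe_core`, `process_chunk`, the `_scratch` buffer) is hypothetical and only stands in for a routine with global state. The thread pool stands in for any scheduler that hands chunks to worker threads; it is not dask's API. In a C or C++ extension the equivalent would be a mutex around the shared state.

```python
import threading
from concurrent.futures import ThreadPoolExecutor

import numpy as np

# Shared global state: this is what makes the core routine thread-unsafe.
_scratch = np.empty(3)
_scratch_lock = threading.Lock()


def unsafe_core(a, b):
    """Hypothetical routine that writes its result through a shared buffer."""
    _scratch[:] = (a, b, a + b)
    return _scratch.copy()


def safe_core(a, b):
    """Only one thread at a time may touch the shared buffer."""
    with _scratch_lock:
        return unsafe_core(a, b)


def process_chunk(chunk):
    """Apply the core routine to every element pair of one chunk."""
    a, b = chunk
    return np.array([safe_core(x, y) for x, y in zip(a, b)])


a = np.arange(8.0)
b = np.arange(8.0) * 10
chunks = [(a[i:i + 2], b[i:i + 2]) for i in range(0, 8, 2)]

# Chunks may run in any order; map() still returns results in input order.
with ThreadPoolExecutor(max_workers=4) as pool:
    parts = list(pool.map(process_chunk, chunks))

result = np.concatenate(parts)
expected = np.stack([a, b, a + b], axis=1)
```

Because each output row depends only on its own inputs, the order in which chunks execute does not affect `result`. The lock protects the routine's own global state; it does not add synchronization for the input or output arrays themselves.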
[Numpy-discussion] Re: Tricky ufunc implementation question
On a related note, does numpy's gufunc mechanism provide any thread safety, or is the responsibility on the extension writer to do that? For simple numpy array inputs, I would think that I don't have to worry about free-threaded Python messing things up (unless I have a global state), but I'm wondering if something like dask array inputs could mess up calls to a thread-unsafe function.

If it is on the extension writer, are there any examples on how to do that? Are there other guarantees (or lack thereof) that a gufunc writer should be aware of? How about reorderability? gufuncs operate on subarrays, so wouldn't dask inputs that are chunked potentially operate on the chunks in any order they like?

Thanks,
Ben Root

On Tue, Jul 1, 2025 at 4:26 PM Benjamin Root wrote:

> Warren,
>
> The examples in ufunclab helped clear up a few things and I was able to
> experiment and get a working gufunc! Thank you for your help!
>
> Ben Root
>
> On Fri, Jun 27, 2025 at 8:54 PM Benjamin Root wrote:
>
>> Warren,
>>
>> I'm fine with implementing it in C. I just didn't think gufuncs were for
>> me. I couldn't tell from the description if it would be for my use case
>> since I wasn't looping over subarrays, and I didn't see any good examples.
>> Maybe the documentation could be clearer. I'll have a look at your examples.
>>
>> I did try that signature with np.vectorize() with the signature keyword
>> argument, but it didn't seem to work. Maybe it didn't work for the reasons
>> in that open issue.
>>
>> Thank you,
>> Ben Root
>>
>> On Fri, Jun 27, 2025 at 8:03 PM Warren Weckesser via NumPy-Discussion <numpy-discussion@python.org> wrote:
>>
>>> On Fri, Jun 27, 2025 at 5:29 PM Benjamin Root via NumPy-Discussion wrote:
>>> >
>>> > I'm looking at a situation where I'd like to wrap a C++ function that
>>> > takes two doubles as inputs, and returns an error code, a position vector,
>>> > and a velocity vector so that I essentially would have a function signature
>>> > of (N), (N) -> (N), (N, 3), (N, 3). When I try to use np.vectorize() or
>>> > np.frompyfunc() on the python version of this function, I keep running into
>>> > issues where it wants to make the outputs into object arrays of tuples. And
>>> > looking at utilizing PyUFunc_FromFuncAndData, it isn't clear to me how I
>>> > can tell it to expect those two output arrays to have a size 3 outer
>>> > dimension.
>>> >
>>> > Are ufuncs the wrong thing here? How should I go about this? Is it
>>> > even possible?
>>>
>>> Ben,
>>>
>>> It looks like the simplest signature for your core operation would be
>>> (),()->(),(3),(3), with broadcasting taking care of higher-dimensional
>>> inputs. Because not all the core shapes are scalars, that would
>>> require a *generalized* ufunc (gufunc). There is an open issue
>>> (https://github.com/numpy/numpy/issues/14020) with a request for a
>>> function to generate a gufunc from a Python function.
>>>
>>> numba has the @guvectorize decorator, but I haven't used it much, and
>>> in my few quick attempts just now, it appeared to not accept fixed
>>> integer sizes in the output shape. But wait to see if any numba gurus
>>> respond with a definitive answer about whether or not it can handle
>>> the shape signature (),()->(),(3),(3).
>>>
>>> You could implement the gufunc in a C or C++ extension module, if you
>>> don't mind the additional development effort and packaging hassle. I
>>> know that works--I've implemented quite a few gufuncs in ufunclab
>>> (https://github.com/WarrenWeckesser/ufunclab).
>>>
>>> Warren
>>>
>>> >
>>> > Thanks in advance,
>>> > Ben Root

___
NumPy-Discussion mailing list -- numpy-discussion@python.org
To unsubscribe send an email to numpy-discussion-le...@python.org
https://mail.python.org/mailman3//lists/numpy-discussion.python.org
Member address: arch...@mail-archive.com
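
To make the core signature discussed in this thread concrete, here is a small sketch of the shape contract using `np.vectorize` with its `signature` argument. Some caveats: the thread reports that `np.vectorize` with a signature did not seem to work for Ben and does not say why, so treat this as something to try rather than a confirmed fix. It uses a named core dimension `n` (whose size 3 is inferred from the first result) instead of the fixed `3` in Warren's signature, it returns real ndarrays for the vector outputs, and `state_at` is a hypothetical stand-in for the wrapped C++ routine. `np.vectorize` still loops in Python, so this shows the broadcasting behaviour and is not a replacement for a compiled gufunc.

```python
import numpy as np


def state_at(a, b):
    """Hypothetical core operation: two scalars in; error code, position, velocity out."""
    err = 0 if a >= 0 else 1
    pos = np.array([a, b, a + b], dtype=float)
    vel = np.array([a - b, a * b, 0.0], dtype=float)
    return err, pos, vel


# Core shapes: scalar, scalar -> scalar, length-n vector, length-n vector.
state_at_vec = np.vectorize(
    state_at,
    signature="(),()->(),(n),(n)",
    otypes=[np.int64, np.float64, np.float64],
)

a = np.array([1.0, 2.0, -3.0, 4.0])
b = np.array([0.5, 0.5, 0.5, 0.5])
err, pos, vel = state_at_vec(a, b)

# Loop dimensions come from broadcasting the inputs; core dimensions are appended.
print(err.shape, pos.shape, vel.shape)
```

With inputs of shape (N,), the three outputs have shapes (N,), (N, 3), and (N, 3), which matches the (N), (N) -> (N), (N, 3), (N, 3) layout from the original question.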