Thanks for this; every little helps.
One more thing to mention on this topic.
Above a certain array size, a dot product with a vector of ones becomes
faster than sum (due to parallelisation, I guess?).
E.g.
import numpy as np

def dotsum(arr):
    # View the flat array as 1000 rows of 100, then sum each row via a dot product.
    a = arr.reshape(1000, 100)
    return a.dot(np.ones(100)).sum()

a = np.ones(100000)
In [45]: %timeit np.add.reduce(a, axis=None)
42.8 µs ± 2.44 µs per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
In [43]: %timeit dotsum(a)
26.1 µs ± 718 ns per loop (mean ± std. dev. of 7 runs, 10,000 loops each)
But theoretically, sum should be faster than the dot product by a fair bit.
Isn’t parallelisation implemented for it?
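In case anyone wants to reproduce this outside IPython, here is a rough standalone sketch of the comparison. The helper best_time_us, the ncols parameter and the sizes in the loop are just my illustrative choices, and reshape(-1, ncols) generalises the fixed (1000, 100) shape above. I would expect the crossover point, if there is one, to depend on the machine and on the BLAS library NumPy is linked against.

```python
import timeit

import numpy as np


def dotsum(arr, ncols=100):
    """Sum a 1-D array via a matrix-vector product with a vector of ones."""
    # Assumes arr.size is divisible by ncols.
    a = arr.reshape(-1, ncols)
    return a.dot(np.ones(ncols)).sum()


def best_time_us(func, arr, number=50, repeat=3):
    """Best per-call time in microseconds over `repeat` timing runs."""
    timer = timeit.Timer(lambda: func(arr))
    return min(timer.repeat(repeat=repeat, number=number)) / number * 1e6


if __name__ == "__main__":
    for n in (1_000, 100_000, 1_000_000):
        a = np.ones(n)
        # Both approaches should agree on the result before we time them.
        assert np.isclose(np.add.reduce(a, axis=None), dotsum(a))
        t_reduce = best_time_us(np.add.reduce, a)
        t_dot = best_time_us(dotsum, a)
        print(f"n={n:>9}: add.reduce {t_reduce:8.1f} us, dotsum {t_dot:8.1f} us")
```

Taking the minimum over several repeats is meant to reduce noise from other processes; it is not the same statistic as the mean ± std. dev. that %timeit reports.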
Regards,
DG
> On 16 Feb 2024, at 01:37, Marten van Kerkwijk <[email protected]> wrote:
>
> It is more that np.sum checks if there is a .sum() method and if so
> calls that. And then `ndarray.sum()` calls `np.add.reduce(array)`.
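
For what it is worth, the dispatch described above can be seen with a small sketch. Recorder is a made-up class for illustration, not anything in NumPy; the only assumption is that np.sum hands a non-ndarray object over to that object's own .sum() method, as stated in the quoted text.

```python
import numpy as np


class Recorder:
    """Hypothetical non-ndarray object that exposes a .sum() method."""

    def __init__(self):
        self.calls = []

    def sum(self, axis=None, dtype=None, out=None, **kwargs):
        # Record the arguments so we can see that np.sum delegated to us.
        self.calls.append({"axis": axis, "dtype": dtype, "out": out, **kwargs})
        return "Recorder.sum was called"


rec = Recorder()
result = np.sum(rec)  # np.sum finds rec.sum and calls it
print(result, rec.calls)

# For a plain ndarray, all three spellings give the same value.
a = np.arange(10.0)
print(np.sum(a), a.sum(), np.add.reduce(a))
```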