Hi Kevin
As I wrote in my original mail, BLAS has no routine for this operation;
you need at least two BLAS (or NumPy) operations, and I'd like to avoid
the drawbacks of those solutions because this is a very
performance-critical and memory-bound operation (X can be very large).
Implementing a GEMM-like sandwich product that exploits the symmetry,
SIMD, and thread parallelism is a difficult undertaking, but doable, as
demonstrated, e.g., here:
https://github.com/Quantco/tabmat/blob/main/src/tabmat/ext/dense_helpers-tmpl.cpp
The question remains: Is this something that numpy could provide?
Best
Christian
On 04.09.2025 14:19, Kevin Sheppard wrote:
A lighter alternative would be for a downstream project to use
Cython + SciPy to create a small binary module that mimics the numba
code but uses LAPACK directly. It isn't particularly difficult to
write the equivalent code, and we use a lot of this in statsmodels.
You then would only need to distribute a relatively small binary, and
the rest would be already included in the install.
Kevin
On Tue, Sep 2, 2025 at 9:02 PM Christian Lorentzen
<[email protected]> wrote:
As nice as numba is, it’s a very heavy dependency that many high
profile libraries want to avoid.
Best
Christian
Am 18.08.2025 um 14:25 schrieb Kevin Sheppard
<[email protected]>:
This is a good place for numba if performance and memory use are
a concern.
Good NumPy
125 ms ± 2.42 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
Naive numba (memory efficient)
386 ms ± 1.83 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
Parallel numba (memory efficient, CPU-bound)
46.8 ms ± 1.12 ms per loop (mean ± std. dev. of 7 runs, 10 loops
each)
Here is a gist that produced the timings:
https://gist.github.com/bashtage/39c50b223580b9bc3f763a3b1be3e478
All were on a 12 core Ryzen.
Kevin
On Mon, Aug 18, 2025 at 12:35 PM Christian Lorentzen via
NumPy-Discussion <[email protected]> wrote:
If W is just a vector, i.e. my main use case, then what you
write is what I wanted to imply and is actually how I
implemented it in several cases.
The point is that X.T * weights[None, :] still doubles the memory
(the intermediate is the same size as X), and this solution doesn't
exploit the symmetry of the result.
Best
Christian
On 17.08.2025 22:13, Kevin Jacobs wrote:
Why not:
XTWX = (X.T * weights[None, :]) @ X
if `weights` is a vector? Allocating `diag(weights)` is the
big memory and CPU hog at least for my generally long and
skinny X matrices.
-Kevin
On Fri, Aug 15, 2025 at 10:08 AM Christian Lorentzen via
NumPy-Discussion <[email protected]> wrote:
Dear numpy community
I would like to propose a new feature in np.linalg to
compute a weighted Gram matrix (sandwich product), see
https://github.com/numpy/numpy/issues/29559.
Proposed new feature or change:
The solvers for many important statistical models like
Generalized Linear Models often need to compute `X.T @ W
@ X`, where `X` is a 2-dimensional array of features
(features in columns, observations in rows) and `W` is
very often a diagonal array of (current) weights. This
computation is usually the main computational bottleneck.
It would be great if numpy could provide an efficient
implementation of it, e.g.
`np.linalg.sandwich_product(X, weight=w)`.
Why numpy? It seems even more unrealistic to me to get
it into BLAS implementations. NumPy has support for SIMD
via Google Highway.
Computational alternatives
Drawbacks of `(X.T * diag(W)) @ X`:
* Additional memory allocation for the intermediate
`X.T * diag(W)`, same size as the (usually large) `X`.
* The result is symmetric, but this fact is not used,
so a speed-up by at least a factor of 2 is possible.
Note that without weights, `X.T @ X` uses BLAS syrk.
But the weights are crucial.
Drawbacks of `Z = np.sqrt(diag(W))[:, None] * X` followed by
`Z.T @ Z`:
* Additional memory allocation for `Z`, same size as the
(usually large) `X`.
* Taking square roots of the possibly large `diag(W)`.
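For what it's worth, the square-root variant can at least recover the syrk symmetry advantage today via SciPy's BLAS bindings (a sketch under the stated drawbacks: it still allocates Z and takes the square roots, and it assumes non-negative weights):

```python
import numpy as np
from scipy.linalg.blas import dsyrk


def weighted_gram(X, w):
    """Compute X.T @ diag(w) @ X via syrk, assuming w >= 0."""
    # Z = sqrt(W) @ X, so that X.T @ W @ X == Z.T @ Z
    Z = np.sqrt(w)[:, None] * X
    # syrk computes only the upper triangle of Z.T @ Z
    # (roughly half the flops of a general gemm)
    C = dsyrk(1.0, Z, trans=1, lower=0)
    # mirror the strict upper triangle into the lower one
    return np.triu(C) + np.triu(C, 1).T
```

This halves the multiply count of the final product, but the extra full-size copy Z remains, which is exactly the allocation the proposal wants to avoid.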
Additional information
https://github.com/Quantco/tabmat has an implementation
of it with XSIMD.
Best
Christian Lorentzen
_______________________________________________
NumPy-Discussion mailing list -- [email protected]
To unsubscribe send an email to
[email protected]
https://mail.python.org/mailman3//lists/numpy-discussion.python.org
Member address: [email protected]