Right now I am using np.sum(A * B.T, axis=1) for dense data and I have
implemented a Cython routine for sparse data.
I haven't benched np.sum(A * B.T, axis=1) vs. np.einsum("ij,ji->i", A, B)
yet since I am mostly interested in the sparse case right now.

When A and B are C-style and Fortran-style, the optimal algorithm should be
computing the inner products along the diagonal using BLAS.
If not, I guess this will need some benchmarking.

Another use for this is to compute the row-wise L2 norms: np.diagdot(A,
A.T).

Mathieu

On Fri, May 22, 2015 at 5:58 PM, David Cournapeau <[email protected]>
wrote:

>
>
> On Fri, May 22, 2015 at 5:39 PM, Mathieu Blondel <[email protected]>
> wrote:
>
>> Hi,
>>
>> I often need to compute the equivalent of
>>
>> np.diag(np.dot(A, B)).
>>
>> Computing np.dot(A, B) is highly inefficient if you only need the
>> diagonal entries. Two more efficient ways of computing the same thing are
>>
>> np.sum(A * B.T, axis=1)
>>
>> and
>>
>> np.einsum("ij,ji->i", A, B).
>>
>> The first can allocate quite a lot of temporary memory.
>> The second can be quite cryptic for someone not familiar with einsum.
>> I assume that einsum does not compute np.dot(A, B), but I haven't
>> verified.
>>
>> Since this is is quite a recurrent pattern, I was wondering if it would
>> be worth adding a dedicated function to NumPy and SciPy's sparse module. A
>> possible name would be "diagdot". The best performance would be obtained
>> when A is C-style and B fortran-style.
>>
>
> Does your implementation use BLAS, or is just a a wrapper around einsum ?
>
> David
>
>
> _______________________________________________
> NumPy-Discussion mailing list
> [email protected]
> http://mail.scipy.org/mailman/listinfo/numpy-discussion
>
>
_______________________________________________
NumPy-Discussion mailing list
[email protected]
http://mail.scipy.org/mailman/listinfo/numpy-discussion

Reply via email to