@kira-lin You are right, the two lines of code should be switched:
for (int elem_idx = 0; elem_idx <
(((int*)adj_indptr_placeholder)[((((((int)blockIdx.x) * 2) + row_inner) + 1))]
- ((int*)adj_indptr_placeholder)[(((((int)blockIdx.x) * 2) + row_inner))]);
++elem_idx) {
if (int
Hi all,
I am trying to build an SpMM kernel as follows:
```python
import tvm
from tvm import te
import scipy
import scipy.sparse
feat_len = 128
num_rows = num_cols = 253
num_threads_per_block = 64
num_cuda_blocks = 127
SrcFeat = te.placeholder((num_cols, feat_len))
adj_scipy_csr = scipy