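@kira-lin You are right. The following two lines of generated code should be switched, so that the bounds check runs before the loop header reads `adj_indptr_placeholder`: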

    for (int elem_idx = 0;
         elem_idx < (((int*)adj_indptr_placeholder)[((((((int)blockIdx.x) * 2) + row_inner) + 1))]
                     - ((int*)adj_indptr_placeholder)[(((((int)blockIdx.x) * 2) + row_inner))]);
         ++elem_idx) {
        if (((((int)blockIdx.x) * 2) + row_inner) < 253) {
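
For reference, the intended order can be sketched on the host in plain C++. This is not the TVM-generated kernel: the function name `count_row_nnz_guarded`, the `block_idx` loop standing in for `blockIdx.x`, and the parameter names are all assumptions made for illustration. The point is only that the bounds check runs first, so `adj_indptr` is read for valid rows only.

```cpp
#include <cstddef>
#include <vector>

// Host-side sketch: block_idx stands in for blockIdx.x, and rows_per_block
// for the constant 2 in the generated code. All names here are hypothetical.
std::vector<int> count_row_nnz_guarded(const std::vector<int>& adj_indptr,
                                       int num_rows, int num_blocks,
                                       int rows_per_block) {
    std::vector<int> nnz(static_cast<std::size_t>(num_rows), 0);
    for (int block_idx = 0; block_idx < num_blocks; ++block_idx) {
        for (int row_inner = 0; row_inner < rows_per_block; ++row_inner) {
            const int row = block_idx * rows_per_block + row_inner;
            // Bounds check first: adj_indptr has num_rows + 1 entries, so
            // adj_indptr[row + 1] is only valid when row < num_rows.
            if (row < num_rows) {
                const int row_len = adj_indptr[row + 1] - adj_indptr[row];
                for (int elem_idx = 0; elem_idx < row_len; ++elem_idx) {
                    ++nnz[row];  // placeholder for the per-element work
                }
            }
        }
    }
    return nnz;
}
```

In the snippet above, the check `row < 253` sits inside the loop, so the loop header has already read `adj_indptr_placeholder[row + 1]` by the time the check is evaluated.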

A quick workaround is to disable row partitioning, that is, to set
`num_cuda_blocks` to `num_rows`. This schedule gives good performance in
practice.
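
To see why this avoids the out-of-bounds read, here is a small host-side C++ sketch. The helper `count_out_of_range_rows` is hypothetical and not part of TVM. It counts how many row indices a given partitioning produces at or beyond `num_rows`, assuming the grid is sized as the ceiling of `num_rows / rows_per_block`. That assumption is consistent with the `blockIdx.x * 2 + row_inner` indexing above, but the thread does not confirm it.

```cpp
// Hypothetical helper: counts row indices >= num_rows produced by
// block_idx * rows_per_block + row_inner over all blocks.
int count_out_of_range_rows(int num_rows, int rows_per_block) {
    // Assumption: the grid is sized as ceil(num_rows / rows_per_block).
    const int num_blocks = (num_rows + rows_per_block - 1) / rows_per_block;
    int out_of_range = 0;
    for (int block_idx = 0; block_idx < num_blocks; ++block_idx) {
        for (int row_inner = 0; row_inner < rows_per_block; ++row_inner) {
            const int row = block_idx * rows_per_block + row_inner;
            if (row >= num_rows) {
                ++out_of_range;  // this row would need the bounds check
            }
        }
    }
    return out_of_range;
}
```

Under that assumption, 253 rows with 2 rows per block yield one out-of-range index (row 253), which is the row whose `adj_indptr[row + 1]` read goes past the end of the array. With one row per block, which is what setting `num_cuda_blocks` to `num_rows` appears to amount to, no index is out of range.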




