[quote="JosseVanDelm, post:1, topic:9335"]
```
    Ab = tvm.tir.decl_buffer(a.shape, a.dtype, name="A", offset_factor=1, 
strides=[2,1])
    Bb = tvm.tir.decl_buffer(b.shape, b.dtype, name="B", offset_factor=1, 
strides=[2,1])
    Cb = tvm.tir.decl_buffer(c.shape, c.dtype, name="C", offset_factor=1, 
strides=[2,1])
```
[/quote]

Hi, the strides you specified are wrong. The 2x2 tile is a window into a row-major 4x4 buffer, so the row stride is 4, not 2. Try:
```python
Ab = tvm.tir.decl_buffer(a.shape, a.dtype, name="A", offset_factor=1, strides=[4, 1])
Bb = tvm.tir.decl_buffer(b.shape, b.dtype, name="B", offset_factor=1, strides=[4, 1])
Cb = tvm.tir.decl_buffer(c.shape, c.dtype, name="C", offset_factor=1, strides=[4, 1])
```

The original IR is:
```python
primfn(A_1: handle, B_1: handle, C_1: handle) -> ()
  attr = {"global_symbol": "main", "tir.noalias": True}
  buffers = {C: Buffer(C_2: Pointer(float32), float32, [4, 4], []),
             B: Buffer(B_2: Pointer(float32), float32, [4, 4], []),
             A: Buffer(A_2: Pointer(float32), float32, [4, 4], [])}
  buffer_map = {A_1: A, B_1: B, C_1: C} {
  for (i.outer: int32, 0, 2) {
    for (j.outer: int32, 0, 2) {
      for (i.inner: int32, 0, 2) {
        for (j.inner: int32, 0, 2) {
          C_2[((((i.outer*8) + (i.inner*4)) + (j.outer*2)) + j.inner)] = ((float32*)A_2[((((i.outer*8) + (i.inner*4)) + (j.outer*2)) + j.inner)] + (float32*)B_2[((((i.outer*8) + (i.inner*4)) + (j.outer*2)) + j.inner)])
        }
      }
    }
  }
}
```
As you can see, the coefficient of i.inner is 4 and the coefficient of j.inner is 1. These two axes form the tensorize region, so the strides of the declared buffers have to be [4, 1]. They are verified in the buffer binding step of the StorageFlatten pass.




