Sorry for the complicated code. I didn't know how to compile multiple kernels
into one `.so` file, so I ended up cramming three functions (one for the forward
pass and two for the backward pass) into one, with flags to switch between them.
Here are the constants for the forward pass:
```
b = 1  # batch size
n = 4096  # sequence length
h = 12  # number of heads (can be merged with the batch dimension if needed)
m = 768  # hidden dimension
w = 256  # window size on one side
w_upper = 256  # window size to the right of the word; should equal `w` in the non-autoregressive case
padding = 0  # padding value; any constant
transpose_t1 = 0  # `0` for one of the backward functions, `1` for the other; does not matter for the forward pass
t1d3 = 768  # last dimension of t1: `m` for the forward function, `2w+1` (number of diagonals) for the backward
t3d3 = 513  # last dimension of t3: `2w+1` for the forward pass
```
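
To make the relationships between these constants explicit, here is a small sketch that derives the dependent values from `w` and `m`. The helper names `forward_constants` and `backward_t1d3` are made up for illustration and are not part of the kernel code; the sketch only restates the rules in the comments above (non-autoregressive case, so `w_upper == w`).

```python
def forward_constants(w, m):
    """Derive the dependent forward-pass constants from w and m.

    Hypothetical helper: it only encodes the rules stated in the
    comments above, for the non-autoregressive case.
    """
    num_diagonals = 2 * w + 1  # one diagonal per offset in [-w, w]
    return {
        "w_upper": w,           # equals w in the non-autoregressive case
        "t1d3": m,              # last dimension of t1 in the forward pass
        "t3d3": num_diagonals,  # last dimension of t3 in the forward pass
    }


def backward_t1d3(w):
    """Last dimension of t1 for the backward functions: the number of diagonals."""
    return 2 * w + 1


consts = forward_constants(w=256, m=768)
print(consts)
print(backward_t1d3(256))
```

With `w = 256` and `m = 768` this gives `t1d3 = 768` and `t3d3 = 513`, matching the values listed above.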




