If you want one or two specific configurations to work with, they would be:

- batch size: 12 (but `batch_matmul_schedule` didn't require a constant batch size, so maybe this doesn't need to be constant)
- embedding size: 768
- sequence length: 4,096
- window size: 512
- dilation: 0 and 3 (I think a lot of the locality assumptions for caching will break once we start working with non-zero dilation. That's why we need to study both cases: 0 because it is the most common, and 3 because it is representative of the cases where locality breaks)
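To make the dilation point concrete, here is a minimal sketch (not TVM code; `attended_keys` is a hypothetical helper) of which key positions a single query attends to under sliding-window attention, assuming the usual Longformer convention that a window of size `w` covers `w/2` positions on each side and a dilation of `d` leaves gaps of `d` between attended positions:

```python
def attended_keys(q, window, dilation, seq_len=4096):
    # Hypothetical illustration: window//2 positions on each side of q,
    # spaced (dilation + 1) apart, clipped to the sequence bounds.
    half = window // 2
    step = dilation + 1
    keys = [q + step * off for off in range(-half, half + 1)]
    return [k for k in keys if 0 <= k < seq_len]

dense = attended_keys(2048, 512, 0)   # contiguous footprint of ~window keys
sparse = attended_keys(2048, 512, 3)  # same key count, ~4x wider footprint
```

With dilation 0 the attended keys are contiguous, so a cached tile of K/V rows is reused by neighboring queries; with dilation 3 the same number of keys is spread over roughly four times the range, which is why the caching assumptions stop holding.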
---

[Visit Topic](https://discuss.tvm.ai/t/developing-a-faster-schedule-for-longformers-kernel/6367/4) to respond.