What I mean is that the runtime needs to be aware of the memory layout and 
provide ```out[slice] = f(inputs)```. Another possible "obstacle" is that 
TVM's compute kernels require the buffer to be somewhat aligned, so we need to 
generate a special kernel for ```out[slice] = f(inputs)``` with a known offset 
(so that we still benefit from good alignment). This is necessary for OpenCL.
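
To make the idea concrete, here is a rough NumPy sketch (not TVM code) of writing the result of `f` directly into a slice of a pre-allocated output buffer at a known offset. The 64-byte alignment value, the helper names `offset_is_aligned` and `write_into_slice`, and the choice of `f` as element-wise addition are all assumptions for illustration only. A real kernel would have the offset baked in at code generation time.

```python
import numpy as np

ALIGN_BYTES = 64  # assumed alignment requirement, purely illustrative


def offset_is_aligned(offset_elems, dtype, align_bytes=ALIGN_BYTES):
    """Return True if an element offset lands on an align_bytes boundary,
    measured from the start of the buffer."""
    return (offset_elems * np.dtype(dtype).itemsize) % align_bytes == 0


def write_into_slice(out, offset, a, b):
    """Compute f(a, b) = a + b directly into out[offset:offset + a.size],
    without materializing a temporary result buffer."""
    if not offset_is_aligned(offset, out.dtype):
        raise ValueError("offset breaks the assumed alignment")
    # The slice is a view, so the ufunc writes straight into `out`.
    np.add(a, b, out=out[offset:offset + a.size])
    return out


out = np.zeros(32, dtype=np.float32)
a = np.arange(16, dtype=np.float32)
b = np.ones(16, dtype=np.float32)
write_into_slice(out, 16, a, b)  # 16 float32 elements = 64 bytes, aligned
```

Because the offset is known up front, the alignment check can happen once rather than per call, which is roughly the benefit a specialized kernel with a fixed offset would give.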



https://github.com/dmlc/tvm/issues/2975#issuecomment-480441459
