Welcome to the TVM community :)

Mali doesn't really have an equivalent to Nvidia's shared memory; it uses 
system RAM backed by an unconfigurable cache. ("Local" is just OpenCL's term 
for what CUDA calls "shared".) This means that explicit cache reads/writes to 
shared/local memory aren't advised when optimising for Mali.

As for explicitly generating vectorized instructions, that depends on the 
architecture in question. Post-Midgard GPUs (Bifrost and later) should not 
require it, other than perhaps for vectorizing loads/stores.
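
To make that concrete, here is a rough sketch of how the advice above could be written down as per-architecture hints for a schedule search. All of the names (`MaliScheduleHints`, `mali_schedule_hints`, the field names) are made up for illustration and are not part of TVM or Ansor. It also assumes that Midgard is the case where explicit vectorization of compute pays off, which is implied above rather than measured.

```python
from dataclasses import dataclass

# Hypothetical helper, not a TVM/Ansor API: it only encodes the rules of
# thumb from this post so a search could prune options per architecture.
POST_MIDGARD = {"bifrost", "valhall"}


@dataclass(frozen=True)
class MaliScheduleHints:
    cache_to_local: bool             # stage tiles through shared/local memory
    vectorize_compute: bool          # explicitly vectorize arithmetic
    try_vectorized_load_store: bool  # still worth trying wide loads/stores


def mali_schedule_hints(architecture: str) -> MaliScheduleHints:
    arch = architecture.lower()
    if arch != "midgard" and arch not in POST_MIDGARD:
        raise ValueError(f"unknown Mali architecture: {architecture!r}")
    return MaliScheduleHints(
        # Local memory on Mali is ordinary cached system RAM, so explicit
        # staging into it is not advised on any of these architectures.
        cache_to_local=False,
        # Assumption: only Midgard needs explicitly vectorized compute.
        vectorize_compute=(arch == "midgard"),
        # Vectorized loads/stores may still help, so keep them as an option.
        try_vectorized_load_store=True,
    )


if __name__ == "__main__":
    for name in ("midgard", "bifrost"):
        print(name, mali_schedule_hints(name))
```

Whether hints like these belong in the search policy or should just be learned by the cost model is a separate design question; the sketch only shows which knobs differ.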

---
[Visit 
Topic](https://discuss.tvm.ai/t/rfc-ansor-an-auto-scheduler-for-tvm-autotvm-v2-0/7005/28)
 to respond.