@sergei-grechanik that is what people do. The problem with parallel AD is that shared reads become shared writes, but you can pull optimization tricks to turn those shared writes into a single write (for example, https://people.csail.mit.edu/tzumao/gradient_halide/).
I think this is the smarter approach: when their optimization fails, they only pay for write synchronization, but when our optimization fails, we end up with giant tensors.
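
To make the shared-read/shared-write point concrete, here is a minimal NumPy sketch (the function names `grad_gather_scatter` and `grad_gather_as_gather` are illustrative, not TVM or Halide APIs). Differentiating a gather `y[i] = x[idx[i]]` naively yields a scatter-add into `dx`, which needs atomics once the loop over `i` is parallel; the trick is to rewrite it as a gather over the output index so each `dx[j]` has exactly one writer.

```python
import numpy as np

# Forward: a parallel gather. Shared *reads* of x when idx has duplicates.
def gather(x, idx):
    return x[idx]                      # y[i] = x[idx[i]]

# Naive reverse mode: the shared reads become shared *writes* on dx,
# so a parallel loop over i would need atomic adds (write sync).
def grad_gather_scatter(dy, idx, n):
    dx = np.zeros(n)
    for i in range(len(idx)):
        dx[idx[i]] += dy[i]            # scatter-add: many i may hit the same j
    return dx

# Scatter-to-gather rewrite (the kind of trick used in gradient Halide):
# loop over the *output* index j instead, so each dx[j] is written once
# and the loop over j parallelizes without any synchronization.
def grad_gather_as_gather(dy, idx, n):
    dx = np.zeros(n)
    for j in range(n):
        dx[j] = dy[idx == j].sum()     # gather all contributions to j
    return dx

x = np.arange(5.0)
idx = np.array([0, 2, 2, 4])
dy = np.ones(4)
assert np.allclose(grad_gather_scatter(dy, idx, 5),
                   grad_gather_as_gather(dy, idx, 5))
```

When the index mapping cannot be inverted cheaply, the rewrite does not apply and the fallback is the synchronized scatter, which is the "write sync" cost mentioned above.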
