Hi @junrushao1994, an over-simplified example from the industrial context is the following:

```python
...
B0 = tvm.compute((m, n), lambda i, j: A0[i, j] + 2*A1[i, j], name="B0")
C0 = tvm.compute((m, n), lambda i, j: A0[i, j] + 2*A1[i, j], name="C0")
D0 = tvm.compute((m, n), lambda i, j: B0[i, j] + 2*C0[i, j], name="D0")
...
```

The customized TVM will schedule this, using `compute_at` to the extreme, and transform it into something like:

```cpp
...
for (i, 0, m) {
  for (j, 0, n) {
    B0[0] = (A0[((i*stride) + (j*stride))] + (2f*A1[((i*stride) + (j*stride))]))
    C0[0] = (A0[((i*stride) + (j*stride))] + (2f*A1[((i*stride) + (j*stride))]))
    D0[((i*stride) + (j*stride))] = (B0[0] + (2f*C0[0]))
  }
}
...
```
This gives our 'incomplete' CSE and copy propagation a chance to rewrite the assignment to `C0` as a copy of `B0` and to replace the use of `C0` in `D0` with `B0`. After that, `C0` may be dead, or not, depending on whether anything later reads it.

```cpp
...
for (i, 0, m) {
  for (j, 0, n) {
    B0[0] = (A0[((i*stride) + (j*stride))] + (2f*A1[((i*stride) + (j*stride))]))
    C0[0] = B0[0]
    D0[((i*stride) + (j*stride))] = (B0[0] + (2f*B0[0]))
  }
}
...
```

The 'incomplete' CSE and copy propagation pass above can work safely on straight-line code within a small range (without data-flow analysis), but the same is not true for dead code elimination: if we have no liveness information for what happens after this for loop, we cannot simply eliminate the assignment to `C0[0]`.

Generally speaking, dead code can arise after copy propagation, and it arises in TVM in much the same way as it does in LLVM and traditional compiler passes.
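To make the difference concrete, here is a minimal sketch in plain Python. It is not TVM code and not our actual pass: the tuple-based IR and the helper names (`local_cse_copy_prop`, `eliminate_dead_stores`) are made up for illustration, and it assumes every name is a distinct location with no aliasing. The first function needs only the statements of the block itself, while the second also needs a `live_out` set that has to come from analysis outside the block.

```python
# Toy straight-line IR: each statement is (dest, expr). An expr is a name
# (str), a numeric constant, or a tuple (op, operand, ...).
# Assumption: distinct names never alias each other.

def uses(expr):
    """Return the set of names read by expr."""
    if isinstance(expr, str):
        return {expr}
    if isinstance(expr, tuple):
        names = set()
        for operand in expr[1:]:
            names |= uses(operand)
        return names
    return set()  # numeric constant


def substitute(expr, copies):
    """Replace every name in expr by the name it is known to be a copy of."""
    if isinstance(expr, str):
        return copies.get(expr, expr)
    if isinstance(expr, tuple):
        return (expr[0],) + tuple(substitute(e, copies) for e in expr[1:])
    return expr


def local_cse_copy_prop(block):
    """CSE plus copy propagation using only facts local to the block."""
    available = {}  # expression -> name currently holding its value
    copies = {}     # name -> name it is a plain copy of
    out = []
    for dest, expr in block:
        expr = substitute(expr, copies)
        if isinstance(expr, tuple) and expr in available:
            expr = available[expr]  # reuse the earlier result
        # Writing dest invalidates every recorded fact that mentions it.
        available = {e: n for e, n in available.items()
                     if n != dest and dest not in uses(e)}
        copies = {n: s for n, s in copies.items() if n != dest and s != dest}
        if isinstance(expr, str) and expr != dest:
            copies[dest] = expr
        elif isinstance(expr, tuple) and dest not in uses(expr):
            available[expr] = dest
        out.append((dest, expr))
    return out


def eliminate_dead_stores(block, live_out):
    """Backward sweep; live_out must be supplied by an outside analysis."""
    live = set(live_out)
    kept = []
    for dest, expr in reversed(block):
        if dest not in live:
            continue  # the stored value is never read: drop the statement
        live.discard(dest)
        live |= uses(expr)
        kept.append((dest, expr))
    return list(reversed(kept))


rhs = ("add", "A0[i,j]", ("mul", 2.0, "A1[i,j]"))
block = [
    ("B0[0]", rhs),
    ("C0[0]", rhs),
    ("D0[i,j]", ("add", "B0[0]", ("mul", 2.0, "C0[0]"))),
]

optimized = local_cse_copy_prop(block)
# If only D0 is read after the loop, the store to C0[0] can be dropped.
dce_c0_dead = eliminate_dead_stores(optimized, {"D0[i,j]"})
# If C0[0] is (or might be) read later, the store has to stay.
dce_c0_live = eliminate_dead_stores(optimized, {"D0[i,j]", "C0[0]"})
```

In this sketch `optimized` has the same shape as the second listing, and whether the `C0[0]` store survives depends entirely on the `live_out` set that is passed in, which is exactly the information a purely local pass does not have.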