This is an automated email from the ASF dual-hosted git repository.
tlopex pushed a commit to branch main
in repository https://gitbox.apache.org/repos/asf/tvm.git
The following commit(s) were added to refs/heads/main by this push:
new b9ced1a078 [BugFix][Relax] Select target-specific pipeline in tvm.compile when GPU target is provided (#19384)
b9ced1a078 is described below
commit b9ced1a07855c391ab31ca9277dbfe8f5ad49511
Author: Soowon Jeong <[email protected]>
AuthorDate: Sun Apr 12 05:04:30 2026 +0900
[BugFix][Relax] Select target-specific pipeline in tvm.compile when GPU target is provided (#19384)
## Problem
`relax.build()` (exposed as `tvm.compile`) with `relax_pipeline="default"` always resolved to `default_build_pipeline`, regardless of the target.
`default_build_pipeline` does not include DLight scheduling — it is a target-agnostic lowering pipeline. On CUDA, this left TIR functions generated from ops like `Clip`/`ReLU6` without thread bindings, causing `VerifyMemory` to fail:
```
Memory verification failed: Variable `X` is directly accessed by host memory
(it is not contained in a thread environment or in the function arguments).
Did you forget to bind?
```
## Fix
When `relax_pipeline="default"` and the target is a GPU target (`"gpu" in target.keys`), use `relax.get_default_pipeline(target)`, which includes target-aware DLight scheduling. Fall back to `default_build_pipeline` if no target-specific pipeline is registered.
CPU targets (`llvm`, `c`) continue to use `default_build_pipeline` unchanged. The CPU-specific pipeline adds `FuseOps`/`FuseTIR`/`FoldConstant` on top, which can dead-code-eliminate `call_pure_packed` calls whose results are unused — correct per pure semantics, but a separate concern from this fix.
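The fallback behavior described above can be sketched in plain Python. This is a minimal, self-contained illustration: `FakeTarget`, the pipeline-name strings, and the two stub lookup functions are hypothetical stand-ins, not TVM API — the real code calls `relax.get_default_pipeline()` and `relax.get_pipeline()` on actual `tvm.target.Target` objects.

```python
class FakeTarget:
    """Hypothetical stand-in for tvm.target.Target; only exposes .keys."""

    def __init__(self, keys):
        self.keys = keys


def get_default_pipeline(target):
    # Stub: pretend only CUDA has a registered target-specific pipeline.
    if "cuda" in target.keys:
        return "cuda_pipeline_with_dlight"
    raise ValueError("no target-specific pipeline registered")


def get_pipeline(name):
    # Stub for the generic, target-agnostic lookup.
    return "default_build_pipeline"


def select_pipeline(relax_pipeline, target):
    """Mirror of the fix: prefer a target-specific pipeline on GPU,
    falling back to the generic pipeline when none is registered."""
    is_gpu = target is not None and "gpu" in target.keys
    if relax_pipeline == "default" and is_gpu:
        try:
            return get_default_pipeline(target)
        except (ValueError, AttributeError):
            return get_pipeline(relax_pipeline)  # fall back
    return get_pipeline(relax_pipeline)


print(select_pipeline("default", FakeTarget(["cuda", "gpu"])))
# -> cuda_pipeline_with_dlight
print(select_pipeline("default", FakeTarget(["llvm", "cpu"])))
# -> default_build_pipeline
```

Note that a GPU target without a registered pipeline (the `ValueError` path) still compiles via the generic pipeline rather than erroring out, which preserves the pre-fix behavior for such targets.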
---
python/tvm/relax/backend/cpu_generic/pipeline.py | 5 ++++-
python/tvm/relax/vm_build.py | 16 +++++++++++++++-
2 files changed, 19 insertions(+), 2 deletions(-)
diff --git a/python/tvm/relax/backend/cpu_generic/pipeline.py b/python/tvm/relax/backend/cpu_generic/pipeline.py
index dc078ee25d..d0b819cea7 100644
--- a/python/tvm/relax/backend/cpu_generic/pipeline.py
+++ b/python/tvm/relax/backend/cpu_generic/pipeline.py
@@ -22,7 +22,10 @@ from tvm import relax
def library_dispatch_passes(target: tvm.target.Target):  # pylint: disable=unused-argument
"""The default library dispatch passes for CPU backend."""
- return []
+ return [
+ relax.backend.DispatchSampling(),
+ relax.backend.DispatchSortScan(),
+ ]
def legalize_passes(target: tvm.target.Target):  # pylint: disable=unused-argument
diff --git a/python/tvm/relax/vm_build.py b/python/tvm/relax/vm_build.py
index 68592d67f8..adc0f7ad83 100644
--- a/python/tvm/relax/vm_build.py
+++ b/python/tvm/relax/vm_build.py
@@ -248,7 +248,21 @@ def build(
if relax_pipeline is not None:
if isinstance(relax_pipeline, str):
- relax_pipeline = relax.get_pipeline(relax_pipeline)
+ # For GPU targets, prefer the target-specific pipeline which
+ # includes DLight scheduling. Without it, TIR functions generated
+ # from ops like Clip/ReLU6 lack thread bindings and fail
+ # VerifyMemory. CPU targets continue to use the generic pipeline
+ # since the CPU-specific pipeline applies fusion passes that can
+ # incorrectly remove call_pure_packed calls whose results are
+ # unused but whose side effects are relied upon.
+ _is_gpu = target is not None and "gpu" in target.keys
+ if relax_pipeline == "default" and _is_gpu:
+ try:
+ relax_pipeline = relax.get_default_pipeline(target)
+ except (ValueError, AttributeError):
+ relax_pipeline = relax.get_pipeline(relax_pipeline)
+ else:
+ relax_pipeline = relax.get_pipeline(relax_pipeline)
if target is None:
mod = relax_pipeline(mod)
else: