coffezhou opened a new issue, #18603: URL: https://github.com/apache/tvm/issues/18603
### Expected behavior

TVM should compile the model correctly.

### Actual behavior

For the following model,

<img width="574" height="491" alt="Image" src="https://github.com/user-attachments/assets/9f132764-7cb6-45ac-a06c-6f165156a1e1" />

TVM crashes:

```
Traceback (most recent call last):
  File "/home/ubuntu/Documents/test.py", line 45, in test
    ex = tvm.compile(tvm_model, target="llvm")
         ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/Documents/DLCompilers/tvm/python/tvm/driver/build_module.py", line 104, in compile
    return tvm.relax.build(
           ^^^^^^^^^^^^^^^^
  File "/home/ubuntu/Documents/DLCompilers/tvm/python/tvm/relax/vm_build.py", line 263, in build
    return _vmlink(
           ^^^^^^^^
  File "/home/ubuntu/Documents/DLCompilers/tvm/python/tvm/relax/vm_build.py", line 158, in _vmlink
    lib = tvm.tir.build(tir_mod, target=target, pipeline=tir_pipeline)
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/Documents/DLCompilers/tvm/python/tvm/tir/build.py", line 239, in build
    return tir_to_runtime(host_mod, device_mod_dict, target_host)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/Documents/DLCompilers/tvm/python/tvm/tir/build.py", line 149, in tir_to_runtime
    mhost = codegen_build(mhost_all, target_host)
            ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/ubuntu/Documents/DLCompilers/tvm/python/tvm/tir/build.py", line 131, in codegen_build
    return bf(mod, target)
           ^^^^^^^^^^^^^^^
  File "python/tvm_ffi/cython/function.pxi", line 904, in tvm_ffi.core.Function.__call__
  File "<unknown>", line 0, in tvm::codegen::LLVMModuleNode::Init(tvm::IRModule const&, tvm::Target const&)
  File "<unknown>", line 0, in tvm::codegen::CodeGenCPU::Finish()
  File "<unknown>", line 0, in tvm::codegen::CodeGenLLVM::Finish()
  File "<unknown>", line 0, in tvm::codegen::CodeGenLLVM::Verify() const
  File "<unknown>", line 0, in tvm::runtime::detail::LogFatal::Entry::Finalize()
tvm.error.InternalError: LLVM module verification failed with the following errors:
location of #dbg_declare must be a pointer or int
#dbg_declare(float %v_input_red_temp.v0, !205, !DIExpression(), !174)
float %v_input_red_temp.v0
label %if_end
ptr @layer_norm_compute_
location of #dbg_declare must be a pointer or int
#dbg_declare(float %v_input_red_temp.v0, !205, !DIExpression(), !174)
float %v_input_red_temp.v0
label %if_end
ptr @layer_norm_compute_
location of #dbg_declare must be a pointer or int
#dbg_declare(float %v_input_red_temp.v1, !206, !DIExpression(), !174)
float %v_input_red_temp.v1
label %if_end
ptr @layer_norm_compute_
location of #dbg_declare must be a pointer or int
#dbg_declare(float %v_input_red_temp.v1, !206, !DIExpression(), !174)
float %v_input_red_temp.v1
label %if_end
ptr @layer_norm_compute_
```

I have reported two similar issues, #18602 and #18595. I am not sure whether these issues are duplicates, since they involve different operators. I built TVM for all of them using the same conda [environment](https://tvm.apache.org/docs/install/from_source.html):

```
# make sure to start with a fresh environment
conda env remove -n tvm-build-venv
# create the conda environment with build dependency
conda create -n tvm-build-venv -c conda-forge \
    "llvmdev>=15" \
    "cmake>=3.24" \
    git \
    python=3.11
# enter the build environment
conda activate tvm-build-venv
```

There may be a defect in TVM or LLVM. Until these issues are resolved, I will not report further issues of this kind.

### Environment

OS: Ubuntu 20.04
TVM: 0.23.dev0 (https://github.com/apache/tvm/commit/f4e28d3153323ad97a7e74740c9fb22300fd6cd0)
onnxruntime: 1.23.2

### Steps to reproduce

This bug can be reproduced by the following code with the model in the attachment. As shown in the code, the model can be executed by onnxruntime.

```python
from typing import Dict, List, Literal, Optional
import sys
import os

import numpy as np
import onnx
import onnxruntime
from onnx import ModelProto, TensorProto, helper

import tvm
import tvm.testing
from tvm import relax
from tvm.relax.frontend.onnx import from_onnx

import argparse
import pickle


def test(model: ModelProto,) -> None:
    model.ir_version = 8
    model.opset_import[0].version = 22

    with open("inputs.pkl", 'rb') as fp:
        inputs = pickle.load(fp)

    # Run the model through onnx to get the expected result.
    try:
        ort_session = onnxruntime.InferenceSession(
            model.SerializeToString(), providers=["CPUExecutionProvider"]
        )
        ort_output = ort_session.run([], inputs)
    except Exception as e:
        print("This model cannot be executed by onnxruntime!")
        sys.exit(1)

    # Convert the onnx model into relax through the onnx importer.
    tvm_model = from_onnx(model, opset=22, keep_params_in_input=True)
    # Convert operators for inference mode.
    tvm_model = relax.transform.DecomposeOpsForInference()(tvm_model)
    # Legalize any relax ops into tensorir.
    tvm_model = relax.transform.LegalizeOps()(tvm_model)

    # Separate model from parameters.
    tvm_model, params = relax.frontend.detach_params(tvm_model)

    # Compile the relax graph into a VM then run.
    with tvm.transform.PassContext(opt_level=3):
        ex = tvm.compile(tvm_model, target="llvm")


if __name__ == "__main__":
    onnx_model = onnx.load("11.onnx")
    test(onnx_model)
```

[testcase.zip](https://github.com/user-attachments/files/24306486/testcase.zip)

### Triage

Please refer to the list of label tags [here](https://github.com/apache/tvm/wiki/Issue-Triage-Labels) to find the relevant tags and add them below in a bullet format (example below).

* needs-triage
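### Comparing the verifier output across reports

Since it is unclear whether this report duplicates #18602 and #18595, one possible way to compare them is to summarize which operands and functions the LLVM verifier complains about in each traceback. The sketch below is a hypothetical helper, not part of TVM or LLVM; the function name `summarize_verifier_errors` and both regular expressions are assumptions that only fit error text shaped like the output quoted in this report.

```python
import re
from collections import Counter

# Hypothetical helper (not part of TVM): parses only text shaped like the
# "LLVM module verification failed" output quoted in this report.
DBG_DECLARE = re.compile(r"#dbg_declare\((\w+) (%[\w.]+),")
FUNCTION = re.compile(r"ptr @([\w.]+)")


def summarize_verifier_errors(message: str) -> dict:
    """Count offending #dbg_declare operands and the functions named with them."""
    operands = Counter(DBG_DECLARE.findall(message))   # keys: (type, value name)
    functions = Counter(FUNCTION.findall(message))     # keys: function name
    return {"operands": operands, "functions": functions}


# Shortened sample taken from the error text in this report.
sample = """\
location of #dbg_declare must be a pointer or int
#dbg_declare(float %v_input_red_temp.v0, !205, !DIExpression(), !174)
float %v_input_red_temp.v0
label %if_end
ptr @layer_norm_compute_
location of #dbg_declare must be a pointer or int
#dbg_declare(float %v_input_red_temp.v1, !206, !DIExpression(), !174)
float %v_input_red_temp.v1
label %if_end
ptr @layer_norm_compute_
"""

summary = summarize_verifier_errors(sample)
for (value_type, name), count in summary["operands"].items():
    print(f"{value_type} {name}: {count}")
print(dict(summary["functions"]))
```

Running the same helper on the tracebacks from the other two reports would show whether the same kind of operand (here, `float` values passed to `#dbg_declare`) is rejected in each case, even though the affected functions differ.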
