`builder_ = nvinfer1::createInferBuilder(*logger)`

This call is very time-consuming, and I can see it being invoked more than once. Is it called once for every partitioned subgraph? It is an NVIDIA TensorRT API, not part of TVM itself.

Here is my code:
import logging

import numpy as np
import onnx
import tvm
from tvm import relay
from tvm.contrib import graph_executor
from tvm.relay.op.contrib import tensorrt

logging.basicConfig(level=logging.DEBUG)

model_path = "models/yolov5s.v5.onnx"
onnx_model = onnx.load(model_path)

BATCH_SIZE = 1
input_shape = (BATCH_SIZE, 3, 640, 640)
input_name = "images"
dtype = "float16"
shape_dict = {input_name: input_shape}

mod, params = relay.frontend.from_onnx(onnx_model, shape_dict, dtype=dtype)
mod = relay.transform.InferType()(mod)
mod = tensorrt.partition_for_tensorrt(mod)

with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target="cuda", params=params)

dev = tvm.cuda(0)
module_exec = graph_executor.GraphModule(lib["default"](dev))

x_data = np.random.uniform(-1, 1, input_shape).astype(dtype)
module_exec.set_input(input_name, x_data)
print(module_exec.benchmark(dev, number=1, repeat=1))
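On the repeated `createInferBuilder` question above: a common way to avoid paying an expensive construction cost once per subgraph is to create the object on the first call and reuse it afterwards. Below is a minimal Python sketch of that caching pattern; the names are hypothetical stand-ins (the real change would be in TVM's TensorRT runtime, which is C++), and the sketch only illustrates the idea, not the actual TVM internals.

```python
import functools

# Hypothetical stand-in for an expensive builder-creation call such as
# nvinfer1::createInferBuilder. With lru_cache, the costly body runs only
# on the first call for a given logger; later calls return the cached object.
@functools.lru_cache(maxsize=None)
def get_builder(logger_name):
    # Imagine expensive one-time setup happening here.
    return {"logger": logger_name}

b1 = get_builder("trt_logger")  # expensive: builder actually created
b2 = get_builder("trt_logger")  # cheap: cached builder returned
assert b1 is b2  # the same builder object is reused, not recreated
```

Whether this is safe in the TVM TensorRT runtime depends on thread-safety and lifetime requirements of the TensorRT builder, so treat it as a direction to investigate rather than a drop-in fix.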
---
[Visit
Topic](https://discuss.tvm.apache.org/t/tvm-use-tensorrt-is-so-slowly/18472/1)
to respond.