I was trying to define argmax using **comm_reducer**, but I couldn't find any
way to define a comm_reducer that **computes only the index of the maximum
value**, without computing the value as well.
For example, taking the comm_reducer for argmax as defined in
[`tests/python/integration/t
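The reason the value has to come along for the ride can be seen in a plain-Python sketch of the pairwise combiner that such a reducer implements (this is a conceptual illustration, not the TVM `te.comm_reducer` API itself): the combiner must compare values to decide which index wins, so the reduction state has to carry the `(index, value)` pair even if only the index is wanted at the end.

```python
# Conceptual sketch of an argmax reduction: the state must carry BOTH
# the running index and the running value, because the combiner needs
# the value in order to compare candidates.
def argmax_combine(lhs, rhs):
    # lhs and rhs are (index, value) pairs; keep the pair with the
    # larger value (ties go to the earlier index).
    return lhs if lhs[1] >= rhs[1] else rhs

def argmax(values):
    # Identity element: index -1, value negative infinity.
    state = (-1, float("-inf"))
    for i, v in enumerate(values):
        state = argmax_combine(state, (i, v))
    return state[0]  # only the index is kept at the very end

print(argmax([3, 1, 4, 1, 5, 9, 2, 6]))  # prints 5
```

Because the combiner is a function of the values, dropping the value from the state would make the reduction ill-defined; the best you can do is discard the value after the reduction finishes, as the last line of `argmax` does.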
I am using the prequantized tutorial at
https://tvm.apache.org/docs/tutorials/frontend/deploy_prequantized.html#sphx-glr-tutorials-frontend-deploy-prequantized-py
and tuning it with kernel tuning; it seems graph tuning is not available for
QNN ops? The model performs worse after tuning.
I'm not sure about the executor class. However, this method might work for
your requirement too (I've never run it, but my guess is it should work).
```
# Assumes `onnx_model` and `shape_dict` are already defined.
import tvm
from tvm import relay

mod, params = relay.frontend.from_onnx(onnx_model, shape_dict)
func = mod["main"]
target = "llvm"
with tvm.transform.PassContext(opt_level=3):
    lib = relay.build(mod, target=target, params=params)
```