Thanks for the great work! I have some quick questions:

1. Have you tested various ARM CPU models (like A53, A72, A55, A75, and so on)? According to the FB QNNPACK blog (https://engineering.fb.com/ml-applications/qnnpack/), `umull` does not always get the best performance compared with the `smlal` instruction (which is used now). So simply changing the legalization and giving up the `smlal` instruction on AArch64 doesn't make sense to me. One piece of evidence: our upcoming feature Ansor (the auto scheduler) doesn't support tensorize (at least not yet), yet it can get nice performance using the `smlal` instruction and beats TFLite by 1.2X on the MobileNet V2 quantized model on a Cortex-A53 (https://discuss.tvm.ai/t/tflite-and-tvm-comparison-for-quantized-models/6577/4). I mean here:

```python
@qnn_conv2d_legalize.register('arm_cpu')
def _qnn_conv2d_legalize_arm_cpu(attrs, inputs, types):
    # ARM prefers the dtypes to be same.
    if is_aarch64_arm():
        return helper_change_dtypes_to_be_same(attrs, inputs, types, relay.qnn.op.conv2d)
    return helper_no_fast_int8_hw_legalization(attrs, inputs, types, relay.qnn.op.conv2d)
```

This disables us from using the `SMLAL` instruction.
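To make the point concrete, here is a minimal sketch (my assumption, not code from this PR) of a legalization that keeps both paths on AArch64 instead of unconditionally dropping the `smlal` one. It reuses the helpers from the snippet above; `prefer_tensorized_schedule` is a hypothetical predicate standing in for whatever condition we decide on:

```python
@qnn_conv2d_legalize.register('arm_cpu')
def _qnn_conv2d_legalize_arm_cpu(attrs, inputs, types):
    # Only force both operands to the same dtype when the umull-based
    # tensorized schedule is actually wanted for this workload.
    if is_aarch64_arm() and prefer_tensorized_schedule(attrs, types):  # hypothetical predicate
        return helper_change_dtypes_to_be_same(attrs, inputs, types, relay.qnn.op.conv2d)
    # Otherwise keep the upcast-to-int16 path so the default schedules
    # can still emit smlal.
    return helper_no_fast_int8_hw_legalization(attrs, inputs, types, relay.qnn.op.conv2d)
```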
2. I suggest we keep two schedules (tensorize and the default spatial pack) rather than only checking for `aarch64` and using the tensorize template alone. I mean here:

```python
is_aarch64 = "aarch64" in str(isa.target)

if is_aarch64 and data.dtype in ["int8", "uint8"]:
    strategy.add_implementation(
        wrap_compute_conv2d(topi.arm_cpu.compute_conv2d_NHWC_quantized),
        wrap_topi_schedule(topi.arm_cpu.schedule_conv2d_NHWC_quantized),
        name="compute_conv2d_NHWC_quantized.arm_cpu")
else:
    strategy.add_implementation(
        wrap_compute_conv2d(topi.arm_cpu.conv2d_nhwc_spatial_pack),
        wrap_topi_schedule(topi.arm_cpu.schedule_conv2d_nhwc_spatial_pack),
        name="conv2d_nhwc_spatial_pack.arm_cpu")
```

This is the design purpose of strategy: register multiple implementations and choose the better one. I suspect there are some workloads where our spatial pack could perform better. The situation is the same as with Winograd, where we register both the Winograd and the default template and pick the faster one.
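For instance, something along these lines (a sketch under my assumptions, not this PR's code; the `plevel` value is illustrative), so the tuner can compare both templates and keep the faster one per workload:

```python
# Always register the spatial-pack template so the smlal-based schedule stays available.
strategy.add_implementation(
    wrap_compute_conv2d(topi.arm_cpu.conv2d_nhwc_spatial_pack),
    wrap_topi_schedule(topi.arm_cpu.schedule_conv2d_nhwc_spatial_pack),
    name="conv2d_nhwc_spatial_pack.arm_cpu")

# Additionally register the tensorized template on AArch64 with a higher
# priority level; tuning can still select the spatial pack when it wins.
is_aarch64 = "aarch64" in str(isa.target)
if is_aarch64 and data.dtype in ["int8", "uint8"]:
    strategy.add_implementation(
        wrap_compute_conv2d(topi.arm_cpu.compute_conv2d_NHWC_quantized),
        wrap_topi_schedule(topi.arm_cpu.schedule_conv2d_NHWC_quantized),
        name="compute_conv2d_NHWC_quantized.arm_cpu",
        plevel=15)
```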