Thanks for the great work! I have some quick questions:
1. Have you tested on various ARM CPU models (like A53, A72, A55, A75, and so on)?
According to the Facebook QNNPACK blog, `umul` does not always get the best
performance compared with the `smlal` instruction (which is what we use now)
(https://engineering.fb.com/ml-applications/qnnpack/). So simply changing the
legalization and giving up the `smlal` instruction on AArch64 may not make
sense to me. One piece of evidence: our upcoming feature `Ansor` (auto
scheduler) doesn't support tensorize (at least not yet), yet it gets nice
performance using the `smlal` instruction and beats TFLite by 1.2X on the
MobileNet V2 quantized model (Cortex-A53)
(https://discuss.tvm.ai/t/tflite-and-tvm-comparison-for-quantized-models/6577/4).
I mean here:
```python
@qnn_conv2d_legalize.register('arm_cpu')
def _qnn_conv2d_legalize_arm_cpu(attrs, inputs, types):
    # ARM prefers the dtypes to be same.
    if is_aarch64_arm():
        return helper_change_dtypes_to_be_same(attrs, inputs, types,
                                               relay.qnn.op.conv2d)
    return helper_no_fast_int8_hw_legalization(attrs, inputs, types,
                                               relay.qnn.op.conv2d)
```
It prevents us from using the `SMLAL` instruction.
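For illustration, here is a rough sketch of an alternative legalization that
keeps the `SMLAL` path available; `has_dotprod()` is a hypothetical helper
standing in for a real hardware-capability check, not an existing TVM API:
```python
@qnn_conv2d_legalize.register('arm_cpu')
def _qnn_conv2d_legalize_arm_cpu(attrs, inputs, types):
    # Hypothetical: only unify the dtypes (which gives up SMLAL) when the
    # target really has fast int8 instructions; has_dotprod() is an assumed
    # helper, not an existing TVM API.
    if is_aarch64_arm() and has_dotprod():
        return helper_change_dtypes_to_be_same(attrs, inputs, types,
                                               relay.qnn.op.conv2d)
    # Otherwise keep the int16 legalization so the schedule can still
    # emit SMLAL.
    return helper_no_fast_int8_hw_legalization(attrs, inputs, types,
                                               relay.qnn.op.conv2d)
```
That way cores like the Cortex-A53, which have no dot-product instructions,
would keep the current `smlal` code path.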
2. I suggest we keep both schedules (tensorize and the default spatial pack),
instead of just checking for `aarch64` and using only the tensorize template.
I mean here:
```python
is_aarch64 = "aarch64" in str(isa.target)
if is_aarch64 and data.dtype in ["int8", "uint8"]:
    strategy.add_implementation(
        wrap_compute_conv2d(topi.arm_cpu.compute_conv2d_NHWC_quantized),
        wrap_topi_schedule(topi.arm_cpu.schedule_conv2d_NHWC_quantized),
        name="compute_conv2d_NHWC_quantized.arm_cpu")
else:
    strategy.add_implementation(
        wrap_compute_conv2d(topi.arm_cpu.conv2d_nhwc_spatial_pack),
        wrap_topi_schedule(topi.arm_cpu.schedule_conv2d_nhwc_spatial_pack),
        name="conv2d_nhwc_spatial_pack.arm_cpu")
```
This is the design purpose of our strategy mechanism. I suspect there are some
workloads where our spatial pack could perform better. The situation is the
same as with Winograd: we register both the Winograd and the default templates
and choose the better one. A sketch of what I mean is shown below.
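Here is a rough sketch (assuming both implementations cover the same NHWC
int8/uint8 workloads): register both, and let tuning pick the faster template
per workload instead of an either/or branch:
```python
is_aarch64 = "aarch64" in str(isa.target)
if is_aarch64 and data.dtype in ["int8", "uint8"]:
    # Register the tensorized schedule where it applies, but don't make it
    # the only choice.
    strategy.add_implementation(
        wrap_compute_conv2d(topi.arm_cpu.compute_conv2d_NHWC_quantized),
        wrap_topi_schedule(topi.arm_cpu.schedule_conv2d_NHWC_quantized),
        name="compute_conv2d_NHWC_quantized.arm_cpu")
# Keep the spatial pack schedule registered as well, so tuning can choose
# whichever template is faster for a given workload.
strategy.add_implementation(
    wrap_compute_conv2d(topi.arm_cpu.conv2d_nhwc_spatial_pack),
    wrap_topi_schedule(topi.arm_cpu.schedule_conv2d_nhwc_spatial_pack),
    name="conv2d_nhwc_spatial_pack.arm_cpu")
```
If we want the tensorize template to still be preferred when no tuning logs are
present, we could also give it a higher `plevel` in `add_implementation`.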