On the discussion of the intermediate formats supported ... The Intel AVX-512 VNNI/DL Boost operations are SIMD operations that support 8-bit quantized inference.
Their intrinsic descriptions show that the int8 multiplies produce intermediate 16-bit results before they are accumulated (in the FMA) into int32 registers. Would it be worth considering how TVM could detect sequences that can be substituted with these DL Boost/VNNI SIMD intrinsics? If TVM supported operations with 16-bit accumulators, then you could conceivably specify sequences of operations that would exactly match the intrinsic descriptions. Perhaps that would make the intrinsic's pattern easier to detect (see the sketches below).

From the publicity, I think these DL Boost/VNNI AVX-512 SIMD operations are considered important for the new Xeon AI inference support. I'm providing the expanded intrinsic description for one of the ops as an example: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#avx512techs=AVX512_VNNI&expand=2202
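For concreteness, here is a minimal scalar sketch of the per-dword semantics that guide entry describes (assuming the linked op is `_mm512_dpbusd_epi32`/VPDPBUSD; the helper name below is just illustrative):

```c
#include <stdint.h>

/* Per-dword semantics of VPDPBUSD: four u8 x s8 multiplies widen to
 * signed 16-bit intermediates, which are summed into an int32
 * accumulator. A 512-bit register holds 16 such dword lanes. */
static int32_t dpbusd_dword(int32_t src, const uint8_t a[4], const int8_t b[4])
{
    int32_t acc = src;
    for (int i = 0; i < 4; ++i) {
        /* A u8 * s8 product always fits in a signed 16-bit intermediate:
         * the extremes are 255 * -128 = -32640 and 255 * 127 = 32385. */
        int16_t prod = (int16_t)(a[i] * b[i]);
        acc += prod; /* 16-bit intermediates accumulate into int32 */
    }
    return acc;
}
```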
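And by way of contrast, a sketch (untested) of the pre-VNNI AVX-512BW sequence that the fused instruction replaces; this is the kind of 16-bit-accumulator pattern TVM would need to be able to express in order to match the intrinsic description:

```c
#include <immintrin.h>

/* Three-instruction emulation of vpdpbusd on AVX-512BW:
 *   vpmaddubsw: u8 x s8 multiplies, adjacent pairs summed into s16 lanes
 *   vpmaddwd:   s16 x 1 multiplies, adjacent pairs summed into s32 lanes
 *   vpaddd:     add into the s32 accumulator
 * Caveat: vpmaddubsw saturates its 16-bit sums, so this is not
 * bit-exact with VNNI when the intermediate sums overflow. */
static inline __m512i dpbusd_emulated(__m512i src, __m512i a, __m512i b)
{
    const __m512i ones = _mm512_set1_epi16(1);
    __m512i t16 = _mm512_maddubs_epi16(a, b);   /* 16-bit intermediates */
    __m512i t32 = _mm512_madd_epi16(t16, ones); /* widen and sum to 32-bit */
    return _mm512_add_epi32(src, t32);          /* final int32 accumulate */
}
```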