On the discussion of the intermediate formats supported ... The Intel AVX-512 VNNI/DL Boost operations are SIMD operations that support 8-bit quantized inference.
Their intrinsic descriptions show that the int8 multiplies produce intermediate 16-bit results before they are accumulated (in the FMA) into int32 registers. Would it be worth considering how TVM could detect sequences that can be substituted with these DL Boost/VNNI SIMD intrinsics? If TVM supported operations with 16-bit accumulators, then you could conceivably specify sequences of operations that would exactly match the intrinsic descriptions. Perhaps that would make the intrinsic's pattern easier to detect (see the sketches below).

From the publicity, I think these DL Boost/VNNI AVX-512 SIMD operations are considered important for the new Xeon AI inference support. I'm providing the expanded intrinsic description for one of the ops as an example: https://software.intel.com/sites/landingpage/IntrinsicsGuide/#avx512techs=AVX512_VNNI&expand=2202
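For concreteness, here is a minimal scalar sketch of the per-dword semantics that guide entry describes (assuming the linked op is `_mm512_dpbusd_epi32`/VPDPBUSD; the helper name below is just illustrative):

```c
#include <stdint.h>

/* Per-dword semantics of VPDPBUSD: four u8 x s8 multiplies widen to
 * signed 16-bit intermediates, which are summed into an int32
 * accumulator. A 512-bit register holds 16 such dword lanes. */
static int32_t dpbusd_dword(int32_t src, const uint8_t a[4], const int8_t b[4])
{
    int32_t acc = src;
    for (int i = 0; i < 4; ++i) {
        /* A u8 * s8 product always fits in a signed 16-bit intermediate:
         * the extremes are 255 * -128 = -32640 and 255 * 127 = 32385. */
        int16_t prod = (int16_t)(a[i] * b[i]);
        acc += prod; /* 16-bit intermediates accumulate into int32 */
    }
    return acc;
}
```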
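And by way of contrast, a sketch (untested) of the pre-VNNI AVX-512BW sequence that the fused instruction replaces; this is the kind of 16-bit-accumulator pattern TVM would need to be able to express in order to match the intrinsic description:

```c
#include <immintrin.h>

/* Three-instruction emulation of vpdpbusd on AVX-512BW:
 *   vpmaddubsw: u8 x s8 multiplies, adjacent pairs summed into s16 lanes
 *   vpmaddwd:   s16 x 1 multiplies, adjacent pairs summed into s32 lanes
 *   vpaddd:     add into the s32 accumulator
 * Caveat: vpmaddubsw saturates its 16-bit sums, so this is not
 * bit-exact with VNNI when the intermediate sums overflow. */
static inline __m512i dpbusd_emulated(__m512i src, __m512i a, __m512i b)
{
    const __m512i ones = _mm512_set1_epi16(1);
    __m512i t16 = _mm512_maddubs_epi16(a, b);   /* 16-bit intermediates */
    __m512i t32 = _mm512_madd_epi16(t16, ones); /* widen and sum to 32-bit */
    return _mm512_add_epi32(src, t32);          /* final int32 accumulate */
}
```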