https://gcc.gnu.org/bugzilla/show_bug.cgi?id=113079
--- Comment #1 from Hongtao Liu <liuhongt at gcc dot gnu.org> ---
(In reply to Hongtao Liu from comment #0)
> int
> foo (int n, unsigned char* p, char* pi)
> {
>   int sum = 0;
>   for (int i = 0; i != 8; i++)
>     {
>       sum += p[i] * pi[i];
>     }
>   return sum;
> }
>
> We can use 128-bit dot_prod instruction + clean upper 64 bits.

Currently, clearing the upper 64 bits is not needed: these are integer operations, so the operations on the upper 64 bits have no side effects.
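To illustrate the point, here is a rough scalar model in plain C, not the code GCC generates. The helpers dot_prod_128, foo_vec_model and check_junk_is_harmless are hypothetical names, and the lane layout (four 32-bit lanes, each accumulating four adjacent u8 * s8 products) is an assumption about the 128-bit dot_prod instruction. It also assumes char is signed, as on x86. The sketch leaves arbitrary bytes in the upper 64 bits and sums only the two low lanes:

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Scalar reference: the loop from comment #0 (char assumed signed). */
static int foo_scalar(const unsigned char *p, const signed char *pi)
{
    int sum = 0;
    for (int i = 0; i != 8; i++)
        sum += p[i] * pi[i];
    return sum;
}

/* Hypothetical model of a 128-bit u8 x s8 dot product: each of the
   four 32-bit lanes accumulates four adjacent byte products. */
static void dot_prod_128(int32_t acc[4], const unsigned char a[16],
                         const signed char b[16])
{
    for (int lane = 0; lane < 4; lane++)
        for (int k = 0; k < 4; k++)
            acc[lane] += a[4 * lane + k] * b[4 * lane + k];
}

/* Only the low 64 bits carry real data. The high 64 bits are filled
   with an arbitrary byte and are never cleared. */
static int foo_vec_model(const unsigned char *p, const signed char *pi,
                         unsigned char junk)
{
    unsigned char a[16];
    signed char b[16];
    int32_t acc[4] = {0, 0, 0, 0};

    memset(a, junk, sizeof a);
    memset(b, junk, sizeof b);
    memcpy(a, p, 8);
    memcpy(b, pi, 8);

    dot_prod_128(acc, a, b);

    /* Lanes 2 and 3 hold results computed from junk; they are ignored. */
    return acc[0] + acc[1];
}

/* Try every possible junk byte and compare against the scalar loop. */
static void check_junk_is_harmless(void)
{
    const unsigned char p[8] = {0, 1, 2, 3, 100, 200, 254, 255};
    const signed char pi[8] = {-128, -1, 0, 1, 2, 64, 126, 127};

    for (int junk = 0; junk < 256; junk++)
        assert(foo_vec_model(p, pi, (unsigned char)junk)
               == foo_scalar(p, pi));
}
```

In this model the junk only ever reaches lanes 2 and 3, and integer multiply-add cannot trap, so skipping the clear does not change the result. A floating-point version would be different, since operations on junk in the upper lanes could raise exceptions.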