> Logically, the implementation of vmull.s32 and vmull.u32 is just similar to
> the 8 and 16 bit cases. For example:
>
>     case 4: gen_helper_neon_mull_s32(dest, a, b); break;
>     case 5: gen_helper_neon_mull_u32(dest, a, b); break;
>
> I implemented it this way and tested it. It works. So I can't understand why
> vmull.s32 and vmull.u32 were implemented like this in QEMU 0.12.5. Please
> explain this to me!
I think you're asking the wrong question. Instead, ask yourself why we should add a new helper when we already know how to do 32x32->64 multiplies.

Paul
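To illustrate the point in plain C (a sketch only, not QEMU's actual code; the function names below are hypothetical): a 32x32->64 multiply is a single widening multiply on 64-bit values, so it needs nothing lane-specific. The 8 and 16 bit cases differ because, as I understand the helpers, several narrow lanes are packed into one value and each lane has to be extracted, multiplied and repacked.

```c
#include <stdint.h>

/* Signed 32x32->64: sign-extend both operands, then one 64-bit multiply. */
static int64_t mull_s32(int32_t a, int32_t b)
{
    return (int64_t)a * (int64_t)b;
}

/* Unsigned 32x32->64: zero-extend both operands, then one 64-bit multiply. */
static uint64_t mull_u32(uint32_t a, uint32_t b)
{
    return (uint64_t)a * (uint64_t)b;
}

/*
 * Contrast: an unsigned 16-bit widening multiply with two lanes packed
 * into each 32-bit input (assumed layout, for illustration only).
 * Each lane is split out, multiplied, and repacked into a 64-bit result
 * holding two 32-bit products.
 */
static uint64_t mull_u16_packed(uint32_t a, uint32_t b)
{
    uint64_t lo = (uint64_t)((a & 0xffffu) * (b & 0xffffu));
    uint64_t hi = (uint64_t)((a >> 16) * (b >> 16));
    return lo | (hi << 32);
}
```

The first two functions are one multiply each, which is the kind of operation that is already available without a dedicated helper. Only the packed case has per-lane work that justifies one.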