https://gcc.gnu.org/bugzilla/show_bug.cgi?id=93141
--- Comment #7 from Madhur Chauhan <madhur4127 at gmail dot com> ---
As far as I can tell, the optimal asm generated should look like:

  mov-load from one array
  mul, or preferably mulx, with a memory source from the other array
  add + adc into the 128-bit answer register
  adc reg, 0 to accumulate the carry-out

This optimization could help many big-integer libraries, as this pattern lies in their most critical section.
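For illustration, here is a minimal C sketch of the kind of loop the instruction sequence above would serve. It is not the test case from this report: the names acc192 and dot_u64 are hypothetical, and it assumes a compiler that provides unsigned __int128 (GCC and Clang on x86-64). The carry is detected with a plain comparison; whether a given compiler version turns that into the adc reg, 0 form is exactly what is in question here.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical 192-bit accumulator: a 128-bit sum plus a carry word. */
typedef struct {
    unsigned __int128 lo; /* low 128 bits (GCC/Clang extension type) */
    uint64_t hi;          /* counts carries out of the 128-bit sum */
} acc192;

/* Sum of a[i] * b[i] over n elements, kept in 192 bits. */
static acc192 dot_u64(const uint64_t *a, const uint64_t *b, size_t n)
{
    acc192 acc = {0, 0};
    for (size_t i = 0; i < n; i++) {
        /* 64x64 -> 128-bit product: the mul/mulx step. */
        unsigned __int128 prod = (unsigned __int128)a[i] * b[i];
        /* 128-bit accumulate: the add + adc step. */
        acc.lo += prod;
        /* Unsigned wrap-around means a carry-out occurred: the adc reg, 0 step. */
        acc.hi += (acc.lo < prod);
    }
    return acc;
}
```

Compiling this with optimization enabled and inspecting the assembly (for example with -O2 -S) shows how close the generated loop comes to the four instructions listed above.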