https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103109
Bug ID: 103109
Summary: madd not used for multiply add on POWER9
Product: gcc
Version: 12.0
Status: UNCONFIRMED
Severity: enhancement
Priority: P3
Component: target
Assignee: unassigned at gcc dot gnu.org
Reporter: tkoenig at gcc dot gnu.org
Target Milestone: ---

The following code

    #include <stdint.h>

    void Long_multiplication( uint64_t multiplicand[],
                              uint64_t multiplier[],
                              uint64_t sum[],
                              uint64_t ilength, uint64_t jlength )
    {
      uint64_t acarry, mcarry, product;

      for( uint64_t i = 0;
           i < (ilength + jlength);
           i++ )
        sum[i] = 0;

      acarry = 0;
      for( uint64_t j = 0; j < jlength; j++ )
        {
          mcarry = 0;
          for( uint64_t i = 0; i < ilength; i++ )
            {
              __uint128_t mcarry_prod;
              __uint128_t acarry_sum;
              mcarry_prod = ((__uint128_t) multiplicand[i]) * ((__uint128_t) multiplier[j])
                + (__uint128_t) mcarry;
              mcarry = mcarry_prod >> 64;
              product = mcarry_prod;
              acarry_sum = ((__uint128_t) sum[i+j]) + ((__uint128_t) acarry) + product;
              sum[i+j] += acarry_sum;
              acarry = acarry_sum >> 64;
              //            {mcarry, product} = multiplicand[i]*multiplier[j]
              //                                + mcarry;
              //            {acarry,sum[i+j]} = {sum[i+j]+acarry} + product;
            }
        }
    }

is translated by

    $ gcc -mcpu=power9 -mtune=power9 -S -O3 big_int.c

to (assembler output of the loop)

    .L4:
            mtctr 25
            mr 12,23
            add 3,24,4
            li 5,0
            .p2align 4,,15
    .L5:
            ldu 10,8(12)
            ldx 11,29,4
            ldu 9,8(3)
            mulld 8,10,11
            mulhdu 10,10,11
            addc 30,8,5
            addze 31,10
            and 21,30,6
            and 22,31,7
            addc 10,21,9
            mr 5,31
            adde 8,22,28
            addc 10,10,0
            add 9,9,10
            addze 0,8
            std 9,0(3)
            bdnz .L5
            addi 27,27,1
            addi 4,4,8
            cmpld 0,26,27
            bne 0,.L4

For the idiom to calculate mcarry_prod, I would have expected a pair of
maddhdu and maddld instructions.

This is with gcc-Version 12.0.0 20211028 (experimental) (GCC)
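
A reduced test case along the following lines might help isolate the idiom from the surrounding loop. It is only a sketch and not part of the original report: the names `u128_parts` and `muladd64` are made up for illustration, and it assumes a 64-bit target where the compiler provides `__uint128_t`. It computes the same 64x64+64 -> 128-bit unsigned multiply-add as the `mcarry_prod` statement and returns the two halves. In Power ISA 3.0 these halves correspond to what maddhdu (high 64 bits) and maddld (low 64 bits) produce.

```c
#include <stdint.h>

/* Hypothetical helper type: the two 64-bit halves of a 128-bit result. */
typedef struct {
    uint64_t hi;   /* bits 64..127 of a*b + c */
    uint64_t lo;   /* bits 0..63 of a*b + c */
} u128_parts;

/* Hypothetical reduced test case: unsigned 64x64 multiply plus a 64-bit
   addend, evaluated in 128 bits.  The sum cannot overflow 128 bits, since
   (2^64-1)^2 + (2^64-1) = (2^64-1) * 2^64 < 2^128. */
u128_parts muladd64(uint64_t a, uint64_t b, uint64_t c)
{
    __uint128_t t = (__uint128_t) a * (__uint128_t) b + (__uint128_t) c;
    u128_parts r;
    r.hi = (uint64_t) (t >> 64);
    r.lo = (uint64_t) t;
    return r;
}
```

Compiling this with the same options as above (`-mcpu=power9 -mtune=power9 -S -O3`) and inspecting the assembler output should show whether the multiply and the carry-propagating add are still emitted as separate instructions for the isolated idiom.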