Riscv code generation

Jacob Navia via Gcc Mon, 23 Oct 2023 07:15:41 -0700

Hi
In a previous post I pointed to a strange code generation by gcc for the 
riscv-64 targets.
To summarize:
        Suppose a 64 bit operation: c = a OP b;
Gcc does the following:
        Instead of loading 64 bits from memory, gcc loads 8 bytes into 8 
separate registers for each operand. Then it ORs the 8 bytes into a single 64 
bit number. Then it executes the 64 bit operation. Lastly, it splits the 
64 bit result into 8 bytes in 8 different registers, and stores these 8 bytes 
one after the other.
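
The sequence described above can be sketched in C as follows. This is only an illustration of the pattern, not gcc's actual output: the function names are made up for this example, OP is taken to be addition, and the bytes are combined in little-endian order.

```c
#include <stdint.h>

/* Byte-wise load: 8 single-byte loads merged with shifts and ORs
   (little-endian byte order assumed). */
uint64_t load64_bytewise(const unsigned char *p)
{
    uint64_t v = 0;
    for (int i = 0; i < 8; i++)
        v |= (uint64_t)p[i] << (8 * i);
    return v;
}

/* Byte-wise store: split the 64 bit value into 8 bytes and store them
   one after the other. */
void store64_bytewise(unsigned char *p, uint64_t v)
{
    for (int i = 0; i < 8; i++)
        p[i] = (unsigned char)(v >> (8 * i));
}

/* c = a OP b done the byte-wise way (OP is + here, for illustration only). */
void add64_bytewise(unsigned char *c, const unsigned char *a,
                    const unsigned char *b)
{
    store64_bytewise(c, load64_bytewise(a) + load64_bytewise(b));
}

/* The same operation with plain 64 bit loads and a 64 bit store. */
void add64_direct(uint64_t *c, const uint64_t *a, const uint64_t *b)
{
    *c = *a + *b;
}
```

Both functions compute the same result; only the way the operands reach the registers differs.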


When I saw this I was surprised that such utterly bloated code ran faster 
than a hastily written assembly program I put together in 10 minutes. Obviously I hadn’t 
taken any pipeline turbulence into account, so my program was slower. Once I did 
take pipeline turbulence into account, I managed to write a program that runs 
several times faster than the bloated code.

Note that for the example above, the direct approach is:
1) Load each 64 bit operand into a register (2 loads)
2) Do the operation
3) Store the result

That is 2 loads, 1 operation and 1 store: 4 instructions, compared to 46 
operations for the « gcc way » (16 loads of a byte, 14 OR operations (7 for 
each operand), 8 shifts to split the result, and 8 stores of a byte each).
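
The tally can be written out explicitly. A small sketch, assuming the OR count is read as 7 ORs per operand (merging 8 bytes takes 7 ORs), which is the reading that gives a total of 46; the constant names are made up for this example.

```c
/* Operation tally for c = a OP b, following the counts given in the text. */
enum {
    BYTE_LOADS   = 2 * 8,  /* 8 byte loads for each of the 2 operands */
    OR_OPS       = 2 * 7,  /* 7 ORs to merge 8 bytes, per operand (assumed reading) */
    SPLIT_SHIFTS = 8,      /* shifts to split the 64 bit result into bytes */
    BYTE_STORES  = 8,      /* one store per result byte */
    GCC_WAY      = BYTE_LOADS + OR_OPS + SPLIT_SHIFTS + BYTE_STORES,
    DIRECT_WAY   = 2 + 1 + 1  /* 2 loads, 1 operation, 1 store */
};
```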

I think this is a BUG, but I’m still not fully convinced that it is one, and I do 
not have a clue WHY gcc does this.

Is anyone here working on the riscv backend? This happens only with -O3, by the way.

Sample code:

#define ACCUM_LENGTH 9
#define WORDSIZE 64
typedef struct {
   int sign, exponent;
   long long mantissa[ACCUM_LENGTH];
} QfloatAccum,*QfloatAccump;

void shup1(QfloatAccump x)
{
        QELT newbits,bits;
        int i;
        bits = x->mantissa[ACCUM_LENGTH] >> (WORDSIZE-1);
        x->mantissa[ACCUM_LENGTH] <<= 1;
        for( i=ACCUM_LENGTH-1; i>0; i-- ) {
                newbits = x->mantissa[i] >> (WORDSIZE - 1);
                x->mantissa[i] <<= 1;
                x->mantissa[i] |= bits;
                bits = newbits;
        }
        x->mantissa[0] <<= 1;
        x->mantissa[0] |= bits;
}
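
For anyone who wants to compile the sample on its own, here is a self-contained variant. It rests on two assumptions that are not in the sample above: QELT is not defined there, so it is taken to be an unsigned 64 bit type, and the mantissa array gets one extra element so that index ACCUM_LENGTH is inside the array.

```c
#include <stdint.h>

#define ACCUM_LENGTH 9
#define WORDSIZE 64

typedef uint64_t QELT;  /* assumption: unsigned 64 bit element type */

typedef struct {
    int sign, exponent;
    QELT mantissa[ACCUM_LENGTH + 1];  /* assumption: one extra word so that
                                         index ACCUM_LENGTH is valid */
} QfloatAccum, *QfloatAccump;

/* Shift the whole mantissa left by one bit. mantissa[ACCUM_LENGTH] is the
   least significant word; the bit shifted out of each word is carried
   into the next word toward mantissa[0]. */
void shup1(QfloatAccump x)
{
    QELT newbits, bits;
    int i;

    bits = x->mantissa[ACCUM_LENGTH] >> (WORDSIZE - 1);
    x->mantissa[ACCUM_LENGTH] <<= 1;
    for (i = ACCUM_LENGTH - 1; i > 0; i--) {
        newbits = x->mantissa[i] >> (WORDSIZE - 1);
        x->mantissa[i] <<= 1;
        x->mantissa[i] |= bits;
        bits = newbits;
    }
    x->mantissa[0] <<= 1;
    x->mantissa[0] |= bits;
}
```

Compiling this file with -O3 -S gives the assembly to inspect. Whether this variant produces the same byte-wise code as the original program is not something I have checked.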

Please point me to the right person. Thanks
