https://gcc.gnu.org/bugzilla/show_bug.cgi?id=82106

--- Comment #7 from Jim Wilson <wilson at gcc dot gnu.org> ---
I have an initial attempt at fixing this in the patch I just added as an
attachment.  It still needs more work and more testing to be useful, as well
as agreement from other gcc hackers that the approach makes sense.

On the original testcase, without the patch we get
        sw      a7,12(sp)       # store the register half of the double
        fld     fa0,12(sp)      # 8-byte load from a 4-byte-aligned slot
and with the patch we get
        lw      a5,16(sp)       # fetch the stack half of the double
        sw      a7,8(sp)        # copy both halves into an aligned slot
        sw      a5,12(sp)
        fld     fa0,8(sp)       # aligned 8-byte load
which is larger, but avoids the unaligned load, and hence may be faster if
unaligned loads trap.
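
The original testcase isn't quoted in this comment; a sketch of the kind of
function that can produce this pattern (my reconstruction, not the actual
testcase from the PR) has the eight FP and the first seven integer argument
registers already occupied, so the psABI splits the final named double
between a7 and the stack, and the callee must reassemble it in memory
before the fld:

    /* Hypothetical reconstruction for -mabi=ilp32d: fa0-fa7 take d0-d7
       and a0-a6 take i0-i6, so the low word of x is passed in a7 and
       the high word on the stack, per the psABI rule that a named
       2*XLEN scalar with one register left is split register/stack.  */
    double
    f (double d0, double d1, double d2, double d3,
       double d4, double d5, double d6, double d7,
       int i0, int i1, int i2, int i3, int i4, int i5, int i6,
       double x)
    {
      return x;
    }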

On the alternate testcase, without the patch we get
        sw      a7,12(sp)       # round-trip the register half through memory
        lw      a0,12(sp)
        lw      a1,16(sp)       # stack half into the high return register
and with the patch we get
        lw      a1,0(sp)        # stack half loaded directly
        mv      a0,a7           # register half forwarded, no spill needed
which is smaller and faster.
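
The alternate testcase is likewise not quoted; a plausible guess (again an
assumption on my part) is the soft-float analogue, where seven leading
integer arguments cause the same split and the double return value comes
back in the a0/a1 pair rather than in fa0:

    /* Hypothetical variant for -mabi=ilp32: a0-a6 take i0-i6, so the
       low word of x is in a7 and the high word on the stack; with the
       patch, both halves reach the a0/a1 return pair without a memory
       round trip.  */
    double
    g (int i0, int i1, int i2, int i3, int i4, int i5, int i6, double x)
    {
      return x;
    }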
