https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94173
Jim Wilson <wilson at gcc dot gnu.org> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |wilson at gcc dot gnu.org

--- Comment #1 from Jim Wilson <wilson at gcc dot gnu.org> ---
struct Pair has size 8 and align 4, and we have no unaligned load/store support, so we are not able to allocate the temporary local variable to a register. It must be allocated a stack slot. The RTL optimizer is able to figure out that the stack stores and loads don't alias anything and hence are not necessary, and optimizes them away. However, we don't have any support to deallocate a stack slot after it has already been allocated, so we end up with the unnecessary stack pointer increment and decrement.

In a degenerate case like this, where there are no longer any stack loads/stores, we may be able to notice that and get rid of the stack pointer manipulation. But in a more complicated case where there are multiple stack slots, and references to all but one are optimized away, we would still need the stack pointer change, though we would just be wasting stack space in this case with larger decrements/increments than needed.

If you change the type to

    struct Pair { char *s; char *t; } __attribute__ ((aligned(8)));

then you get the result you want. That isn't a practical solution, but it demonstrates that this is a size/alignment/strict-alignment problem.

This is more of a middle end problem than a RISC-V backend problem. It should be possible to reproduce on any target with similar strict alignment constraints, and similar calling conventions that allow returning the structure in registers, though I don't know offhand if there are any.
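For anyone who wants to compare the two layouts side by side, here is a minimal sketch. The names PairAligned, make_pair and make_pair_aligned are hypothetical and are not taken from the bug report's test case. The functions only assume the general shape described above: a struct built in a local temporary and returned by value. The static assertions check properties that hold on any ABI; the "size 8, align 4" figures apply to a 32-bit target such as RV32.

```c
#include <assert.h>   /* static_assert (C11) */
#include <stddef.h>

/* Two pointers with natural alignment. On a 32-bit target such as RV32
   this is size 8, align 4, as described in the comment above. */
struct Pair { char *s; char *t; };

/* Over-aligned variant (hypothetical name), using the attribute
   suggested in the comment. */
struct PairAligned { char *s; char *t; } __attribute__ ((aligned (8)));

/* Layout checks that hold regardless of pointer width. */
static_assert(_Alignof(struct Pair) == _Alignof(char *),
              "Pair takes its alignment from its pointer members");
static_assert(_Alignof(struct PairAligned) == 8,
              "aligned(8) raises the alignment to 8");
static_assert(sizeof(struct PairAligned) % 8 == 0,
              "size is a multiple of the alignment");

/* Hypothetical helper: builds the result in a local temporary and
   returns it by value. */
struct Pair make_pair(char *s, char *t)
{
    struct Pair p;
    p.s = s;
    p.t = t;
    return p;
}

/* Same function, using the over-aligned type. */
struct PairAligned make_pair_aligned(char *s, char *t)
{
    struct PairAligned p;
    p.s = s;
    p.t = t;
    return p;
}
```

Compiling both functions with optimization for a strict-alignment 32-bit target and comparing the generated assembly should show whether the stack pointer adjustment appears only for the first one. That is an expectation based on the explanation above, not a verified result.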