On 5/13/24 13:28, Jeff Law wrote:
On 5/13/24 12:49 PM, Vineet Gupta wrote:
If the constant used for stack offset can be expressed as sum of two S12
values, the constant need not be materialized (in a reg) and instead the
two S12 bits can be added to instructions involved with frame pointer.
This avoids burning a register and more importantly can often get down
to be 2 insn vs. 3.
The prev patches to generally avoid LUI based const materialization
didn't
fix this PR and need this directed fix in funcion prologue/epilogue
expansion.
This fix doesn't move the neddle for SPEC, at all, but it is still a
win considering gcc generates one insn fewer than llvm for the test ;-)
gcc-13.1 release | gcc 230823 | |
| g6619b3d4c15c | This patch |
clang/llvm
---------------------------------------------------------------------------------
li t0,-4096 | li t0,-4096 | addi sp,sp,-2048 | addi
sp,sp,-2048
addi t0,t0,2016 | addi t0,t0,2032 | add sp,sp,-16 | addi
sp,sp,-32
li a4,4096 | add sp,sp,t0 | add a5,sp,a0 | add
a1,sp,16
add sp,sp,t0 | addi a5,sp,-2032 | sb zero,0(a5) | add
a0,a0,a1
li a5,-4096 | add a0,a5,a0 | addi sp,sp,2032 | sb
zero,0(a0)
addi a4,a4,-2032 | li t0, 4096 | addi sp,sp,32 | addi
sp,sp,2032
add a4,a4,a5 | sb zero,2032(a0) | ret | addi
sp,sp,48
addi a5,sp,16 | addi t0,t0,-2032 | | ret
add a5,a4,a5 | add sp,sp,t0 |
add a0,a5,a0 | ret |
li t0,4096 |
sd a5,8(sp) |
sb zero,2032(a0)|
addi t0,t0,-2016 |
add sp,sp,t0 |
ret |
gcc/ChangeLog:
PR target/105733
* config/riscv/riscv.h: New macros for with aligned offsets.
* config/riscv/riscv.cc (riscv_split_sum_of_two_s12): New
function to split a sum of two s12 values into constituents.
(riscv_expand_prologue): Handle offset being sum of two S12.
(riscv_expand_epilogue): Ditto.
* config/riscv/riscv-protos.h (riscv_split_sum_of_two_s12): New.
gcc/testsuite/ChangeLog:
* gcc.target/riscv/pr105733.c: New Test.
* gcc.target/riscv/rvv/autovec/vls/spill-1.c: Adjust to not
expect LUI 4096.
* gcc.target/riscv/rvv/autovec/vls/spill-2.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-3.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-4.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-5.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-6.c: Ditto.
* gcc.target/riscv/rvv/autovec/vls/spill-7.c: Ditto.
@@ -8074,14 +8111,26 @@ riscv_expand_epilogue (int style)
}
else
{
- if (!SMALL_OPERAND (adjust_offset.to_constant ()))
+ HOST_WIDE_INT adj_off_value = adjust_offset.to_constant ();
+ if (SMALL_OPERAND (adj_off_value))
+ {
+ adjust = GEN_INT (adj_off_value);
+ }
+ else if (SUM_OF_TWO_S12_ALGN (adj_off_value))
+ {
+ HOST_WIDE_INT base, off;
+ riscv_split_sum_of_two_s12 (adj_off_value, &base, &off);
+ insn = gen_add3_insn (stack_pointer_rtx,
hard_frame_pointer_rtx,
+ GEN_INT (base));
+ RTX_FRAME_RELATED_P (insn) = 1;
+ adjust = GEN_INT (off);
+ }
So this was the hunk that we identified internally as causing problems
with libgomp's testsuite. We never fully chased it down as this hunk
didn't seem terribly important performance wise -- we just set it
aside. The thing is it looked basically correct to me. So the
failure was certainly unexpected, but it was consistent.
So I think the question is whether or not the CI system runs the
libgomp testsuite, particularly in the rv64 linux configuration. If it
does, and it passes, then we're good. I'm still finding my way around
the configuration, so I don't know if the CI system Edwin & Patrick
have built tests libgomp or not.
I poked around the .sum files in pre/postcommit and we do run tests like:
PASS: c-c++-common/gomp/affinity-2.c (test for errors, line 45)
I'm not familar with libgomp so I don't know if that's the same libgomp
tests you're referring to.
Patrick
If it isn't run, then we'll need to do a run to test that. I'm set up
here to do that if needed. I can just drop this version into our
internal tree, trigger an internal CI run and see if it complains :-)
If it does complain, then we know where to start investigations.
Jeff