This is Jakub's patch and Ian's testcase for the slightly vexing fault
building the D runtime with an s390x-x-riscv cross compiler.
The core issue is we're allocating a vector to hold temporary registers
unconditionally, including cases where the vector isn't needed because
the loop isn't going to iterate.
In the cases where the vector isn't needed the length is computed with
an expression (x / y) - 1 where x / y will be zero. The alloca(-1) on
the s390 platform triggers a fault. We haven't seen the fault with an
x86 cross, but we can certainly see the bogus value being passed to
alloca with a debugger.
Jakub patch just conditionalizes the whole block in a sensible way. So
it looks larger than it really is. I thought it might be better to do a
bit of manual CSE on this code to make it even more obvious, but I think
we're ultimately OK here.
Ian provided the testcase, collapsed down into equivalent C code.
Again, it doesn't fault on an x86-x-riscv, but I can see the incorrect
behavior with a debugger.
And a shout-out to Stefan for providing a docker based reproducer, it
really helped track this down.
Waiting for the pre-commit tester to do its thing before committing.
Jeff
PR target/118248
gcc/
* config/riscv/riscv-string.cc (riscv_block_move_straight): Only
allocate REGS buffer if it will be needed.
gcc/testsuite
* gcc.target/riscv/pr118248.c: New test.
diff --git a/gcc/config/riscv/riscv-string.cc b/gcc/config/riscv/riscv-string.cc
index 408eb07e87f..90801899ec8 100644
--- a/gcc/config/riscv/riscv-string.cc
+++ b/gcc/config/riscv/riscv-string.cc
@@ -804,7 +804,7 @@ static void
riscv_block_move_straight (rtx dest, rtx src, unsigned HOST_WIDE_INT length,
unsigned HOST_WIDE_INT align)
{
- unsigned HOST_WIDE_INT offset, delta;
+ unsigned HOST_WIDE_INT offset = 0, delta;
unsigned HOST_WIDE_INT bits;
int i;
enum machine_mode mode;
@@ -815,20 +815,25 @@ riscv_block_move_straight (rtx dest, rtx src, unsigned
HOST_WIDE_INT length,
mode = mode_for_size (bits, MODE_INT, 0).require ();
delta = bits / BITS_PER_UNIT;
- /* Allocate a buffer for the temporary registers. */
- regs = XALLOCAVEC (rtx, length / delta - 1);
-
- /* Load as many BITS-sized chunks as possible. Use a normal load if
- the source has enough alignment, otherwise use left/right pairs. */
- for (offset = 0, i = 0; offset + 2 * delta <= length; offset += delta, i++)
+ if (2 * delta <= length)
{
- regs[i] = gen_reg_rtx (mode);
- riscv_emit_move (regs[i], adjust_address (src, mode, offset));
- }
+ /* Allocate a buffer for the temporary registers. */
+ regs = XALLOCAVEC (rtx, length / delta - 1);
+
+ /* Load as many BITS-sized chunks as possible. Use a normal load if
+ the source has enough alignment, otherwise use left/right pairs. */
+ for (offset = 0, i = 0; offset + 2 * delta <= length;
+ offset += delta, i++)
+ {
+ regs[i] = gen_reg_rtx (mode);
+ riscv_emit_move (regs[i], adjust_address (src, mode, offset));
+ }
- /* Copy the chunks to the destination. */
- for (offset = 0, i = 0; offset + 2 * delta <= length; offset += delta, i++)
- riscv_emit_move (adjust_address (dest, mode, offset), regs[i]);
+ /* Copy the chunks to the destination. */
+ for (offset = 0, i = 0; offset + 2 * delta <= length;
+ offset += delta, i++)
+ riscv_emit_move (adjust_address (dest, mode, offset), regs[i]);
+ }
/* Mop up any left-over bytes. */
if (offset < length)
diff --git a/gcc/testsuite/gcc.target/riscv/pr118248.c
b/gcc/testsuite/gcc.target/riscv/pr118248.c
new file mode 100644
index 00000000000..05bf002abac
--- /dev/null
+++ b/gcc/testsuite/gcc.target/riscv/pr118248.c
@@ -0,0 +1,26 @@
+/* This does not fault with an x86 cross to risc-v, but does
+ with s390 cross to risc-v, probably due to rounding of the
+ argument to the alloca call within the risc-v backend. */
+/* { dg-do compile } */
+struct char100
+{
+ char data[100];
+};
+
+struct s118248
+{
+ void **vtbl;
+ struct char100 data;
+};
+
+void sink(struct char100 *buf);
+
+struct s118248 *pr118248(struct s118248 *pthis)
+{
+ struct char100 buf;
+ sink(&buf);
+ pthis->data = buf;
+ return pthis;
+}
+
+