On 06/23/2016 03:57 AM, Dominik Vogt wrote:
0002-v3-ChangeLog
gcc/ChangeLog
* explow.c (round_push): Use know adjustment.
(allocate_dynamic_stack_space): Pass known adjustment to round_push.
gcc/testsuite/ChangeLog
* gcc.dg/pr50938.c: New test.
0002-v3-Drop-excess-size-used-for-run-time-allocated-stack-v.patch
>From 4296d353e1d153b5b5ee435a44cae6117bf2fff0 Mon Sep 17 00:00:00 2001
From: Dominik Vogt <v...@linux.vnet.ibm.com>
Date: Fri, 29 Apr 2016 08:36:59 +0100
Subject: [PATCH 2/2] Drop excess size used for run time allocated stack
variables.
The present calculation sometimes led to more stack memory being used than
necessary with alloca. First, (STACK_BOUNDARY -1) would be added to the
allocated size:
size = plus_constant (Pmode, size, extra);
size = force_operand (size, NULL_RTX);
Then round_push was called and added another (STACK_BOUNDARY - 1) before
rounding down to a multiple of STACK_BOUNDARY. On s390x this resulted in
adding 14 before rounding down for "x" in the test case pr36728-1.c.
round_push() now takes an argument to inform it about what has already been
added to size.
---
gcc/explow.c | 45 +++++++++++++++++++++---------------
gcc/testsuite/gcc.dg/pr50938.c | 52 ++++++++++++++++++++++++++++++++++++++++++
2 files changed, 79 insertions(+), 18 deletions(-)
create mode 100644 gcc/testsuite/gcc.dg/pr50938.c
diff --git a/gcc/explow.c b/gcc/explow.c
index 09a0330..85596e2 100644
--- a/gcc/explow.c
+++ b/gcc/explow.c
@@ -949,24 +949,30 @@ anti_adjust_stack (rtx adjust)
}
/* Round the size of a block to be pushed up to the boundary required
- by this machine. SIZE is the desired size, which need not be constant. */
+ by this machine. SIZE is the desired size, which need not be constant.
+ ALREADY_ADDED is the number of units that have already been added to SIZE
+ for other alignment reasons.
+*/
static rtx
-round_push (rtx size)
+round_push (rtx size, int already_added)
{
- rtx align_rtx, alignm1_rtx;
+ rtx align_rtx, add_rtx;
if (!SUPPORTS_STACK_ALIGNMENT
|| crtl->preferred_stack_boundary == MAX_SUPPORTED_STACK_ALIGNMENT)
{
int align = crtl->preferred_stack_boundary / BITS_PER_UNIT;
+ int add;
if (align == 1)
return size;
+ add = (align > already_added) ? align - already_added - 1 : 0;
+
if (CONST_INT_P (size))
{
- HOST_WIDE_INT new_size = (INTVAL (size) + align - 1) / align * align;
+ HOST_WIDE_INT new_size = (INTVAL (size) + add) / align * align;
if (INTVAL (size) != new_size)
size = GEN_INT (new_size);
So presumably the idea here is when the requested SIZE would require
allocating additional space to first see if the necessary space is
already available inside ALREADY_ADDED
Yes.
and use that rather than rounding size up to an alignment boundary.
Not exactly. Consider the unpatched code. At the beginning we
have some amount of space to be allocated on the stack at runtime
("SSIZE"), some requested alignment for it ("SALIGN").
get_dynamic_stack_size() first calculates the space needed for run
time alignment:
SIZE = SSIZE + SALIGN - 1
Then it calls round_push() to add *another* chunk of memory to the
allocation size to be able to align it to the required stack slot
alignment ("SLOTALIGN") at run time.
SIZE = SIZE + SLOTALIGN - 1
= SSIZE + (SALIGN - 1) + (SLOTALIGN - 1)
Now it has added two chunks of memory but alignment is only done
once. With the patch it just adds the maximum of (SALIGN - 1) and
(SLOTALIGN - 1), not both. Thinking about it, the "round_push"
stuff is a very complicated way of saying "add max(A, B)".
Now I see it. Thanks, that helped a ton.
I'd volunteer to clean this up more, but preferrably when the two
pending patches are in. The current code is a real brain-twister.
I'd be all for such cleanups after we wrap up the pending patches. It's
certainly a rats nest of code right now.
This patch is fine for the trunk. Thanks for your patience.
jeff