gcc-13-20230805 is now available

2023-08-05 Thread GCC Administrator via Gcc
Snapshot gcc-13-20230805 is now available on
  https://gcc.gnu.org/pub/gcc/snapshots/13-20230805/
and on various mirrors, see http://gcc.gnu.org/mirrors.html for details.

This snapshot has been generated from the GCC 13 git branch
with the following options: git://gcc.gnu.org/git/gcc.git branch 
releases/gcc-13 revision e15cd6f0d8143e509d2d45f5bfa4d0445b7fff6b

You'll find:

 gcc-13-20230805.tar.xz   Complete GCC

  SHA256=d27809feab0b826fea5f9a0632e9428be0d6542b07760847d1d2a0881d985902
  SHA1=d9738e496cfcf86a8e9d73d7cff7349828ddb341

Diffs from 13-20230729 are available in the diffs/ subdirectory.

When a particular snapshot is ready for public consumption the LATEST-13
link is updated and a message is sent to the gcc list.  Please do not use
a snapshot before it has been announced that way.


Stack alignment on modern 32 bit bare metal ARMs?

2023-08-05 Thread Barrie Slaymaker via Gcc
Hi,

I'm cross-compiling for 32-bit bare-metal ARMs (modern ones: Cortex-M4 and
Cortex-M33) with gcc 12.3.0, which is the latest available from ARM (see
gcc -v output below), and have found that va_arg(..., double) (i.e.
__builtin_va_arg()) assumes that doubles are 64-bit aligned, but the stack
is not always so.

I searched the bug database but didn't see this, so I'm guessing this isn't
a GCC bug--the ARM world would be on fire if it were. And I've searched the
gcc command line options docs, and the ARM architecture docs to no avail.
I'm hoping I didn't miss something obvious...

So, does gcc assume or require that doubles on the stack be 64-bit aligned,
or is there an option we should be passing to either allow 32-bit alignment
or force 64-bit alignment, or is the MCU vendor's startup code a wee buggy
(this is what I suspect, but wanted to be damn sure before continuing)?

Here's the test code:

#include <stdarg.h>

void va_args_test(int i, ...) {
    va_list args;
    va_start(args, i);
    double d = (int)va_arg(args, double);
    va_end(args);
    // display code elided
}

Here's the generated assembly, with commentary mine:

void va_args_test(int i, ...) {
3f60:→  b40f  → push→   {r0, r1, r2, r3}
3f62:→  b580  → push→   {r7, lr}
3f64:→  b082  → sub→sp, #8
3f66:→  af00  → add→r7, sp, #0

va_list args;
3f68:→  2300  → movs→   r3, #0
3f6a:→  607b  → str→r3, [r7, #4]

va_start(args, i);
3f6c:→  f107 0314 → add.w→  r3, r7, #20
3f70:→  607b  → str→r3, [r7, #4]

double d = (int)va_arg(args, double);
3f72:→  f107 031b → add.w→  r3, r7, #27   ; Loads the address of the
                ; last byte of the low order word into r3.
3f76:→  f023 0307 → bic.w→  r3, r3, #7; Clears the low 3 bits,
                ; which works when the double is 64-bit aligned. Not so much otherwise.
3f7a:→  f103 0208 → add.w→  r2, r3, #8; Increments args' internal
                ; pointer
3f7e:→  607a  → str→r2, [r7, #4]  ; Saves that pointer
3f80:→  e9d3 0100 → ldrd→   r0, r1, [r3]  ; Reads the double, right or
                ; wrong...

Here's the call site assembly:

va_args_test(0, (double)1.0);
3fc2:→  2200  → movs→   r2, #0
3fc4:→  4b09  → ldr→r3, [pc, #36]→  ; (3fec )
3fc6:→  2000  → movs→   r0, #0
3fc8:→  4909  → ldr→r1, [pc, #36]→  ; (3ff0 )
3fca:→  4788  → blx→r1

This is using GCC 12.3.0, cross-compiling for ARM on x86_64 (gcc -v output
below sig), with a command line like

arm-none-eabi-gcc -o ../build/main/PAC5524/tmp/base/src/main.o
base/src/main.c <<<-I options elided>>> -mcpu=cortex-m4 -march=armv7e-m
-mfpu=fpv4-sp-d16 -std=gnu99 -ffunction-sections -fno-omit-frame-pointer
-fno-strict-overflow -fsingle-precision-constant
-ftrivial-auto-var-init=zero -mthumb -mlittle-endian -mlong-calls
-mfloat-abi=hard -Og -c -MD -MP

Removing any one of the -f options happens to align the stack correctly in
most cases (I've elided the -f options that don't affect this issue as far
as I can tell).

Many thanks,

Barrie

gcc -v output:

Using built-in specs.
COLLECT_GCC=arm-none-eabi-gcc
COLLECT_LTO_WRAPPER=/usr/share/arm-gnu-toolchain-12.3.rel1-x86_64-arm-none-eabi/bin/../libexec/gcc/arm-none-eabi/12.3.1/lto-wrapper
Target: arm-none-eabi
Configured with:
/data/jenkins/workspace/GNU-toolchain/arm-12/src/gcc/configure
--target=arm-none-eabi
--prefix=/data/jenkins/workspace/GNU-toolchain/arm-12/build-arm-none-eabi/install
--with-gmp=/data/jenkins/workspace/GNU-toolchain/arm-12/build-arm-none-eabi/host-tools
--with-mpfr=/data/jenkins/workspace/GNU-toolchain/arm-12/build-arm-none-eabi/host-tools
--with-mpc=/data/jenkins/workspace/GNU-toolchain/arm-12/build-arm-none-eabi/host-tools
--with-isl=/data/jenkins/workspace/GNU-toolchain/arm-12/build-arm-none-eabi/host-tools
--disable-shared --disable-nls --disable-threads --disable-tls
--enable-checking=release --enable-languages=c,c++,fortran
--with-newlib --with-gnu-as --with-gnu-ld
--with-sysroot=/data/jenkins/workspace/GNU-toolchain/arm-12/build-arm-none-eabi/install/arm-none-eabi
--with-multilib-list=aprofile,rmprofile
--with-pkgversion='Arm GNU Toolchain 12.3.Rel1 (Build arm-12.35)'
--with-bugurl=https://bugs.linaro.org/
Thread model: single
Supported LTO compression algorithms: zlib
gcc version 12.3.1 20230626 (Arm GNU Toolchain 12.3.Rel1 (Build arm-12.35))

Test code (the LED lights very prettily when va_arg() returns the correct
value):

void va_args_test(int i, ...) {
    va_list args;
    va_start(args, i);
    i = (int)va_arg(args, double);
    va_end(args);
    bal_init();
    bal_set_AUX_LED1(i == 1);
}

int main(void) {
    /* ...CPU initialization elided... */
    va_args_test(0, (double)1.0);
    while (true) {
    }
}