Keith Packard <kei...@keithp.com> writes: >> insn_propagation would previously only replace (reg:M H) with X >> for some hard register H if the uses of H were also in mode M. >> This patch extends it to handle simple mode punning too. > > (this is from GCC commit 9d20529d94b23275885f380d155fe8671ab5353a) > > Looks like this breaks m68k floating point conversion handling. Recall > that m68k has only 80-bit float hardware with regular numbered registers > (unlike 8087). Operations are done in 80-bit format, the FPU provides > fmove.d and fmove.s operations to convert between 80-bit and > 32-bit/64-bit values in memory. > > float naf(float, float); > int prf(char *, ...); > float sqf(float); > int func(void); > > int func(void) > { > float x = naf(4.0f, 5.0f); > prf("testing sqrtf for value %a\n", (double) x); > float y = sqf(x); > prf("sqrtf is %a\n", (double) y); > return 0; > } > > In this example, the return value from naf is passed as a double to prf > and also passed as a float to sqf. The value is saved on the stack > across the prf call. > > In the old code, it was restored to %fp0 and then pushed as a float on > the stack to sqf: > > lea prf,%a2 > fmove.x %fp0,24(%sp) > jsr (%a2) > fmove.x 24(%sp),%fp0 > fmove.s %fp0,-(%sp) > jsr sqf > > With this patch, the low 32 bits of the value (which is garbage) gets > pushed on the stack instead: > > lea prf,%a2 > fmove.x %fp0,24(%sp) > jsr (%a2) > move.l 32(%sp),-(%sp) > jsr sqf > > Here's the broken code: > > rtx newval = to; > if (GET_MODE (x) != GET_MODE (from)) > { > gcc_assert (REG_P (x) && HARD_REGISTER_P (x)); > if (REG_NREGS (x) != REG_NREGS (from) > || !REG_CAN_CHANGE_MODE_P (REGNO (x), GET_MODE (from), > GET_MODE (x))) > return false; > newval = simplify_subreg (GET_MODE (x), to, GET_MODE (from), > subreg_lowpart_offset (GET_MODE (x), > GET_MODE (from))); > if (!newval) > return false; > } > > The trouble is that REG_CAN_CHANGE_MODE_P(%fp0, E_XFmode, E_SFmode) > returns true, and then simplify_subreg goes and extracts the low 32-bits > of the 80-bit value directly, skipping the conversion. This is the same > as i386, so let's adapt the i386 version of can_change_mode_class. See > attached patch, which seems to fix the bug for me. > > From 862467c578545a4a7ed83767df2a181e7d845bc6 Mon Sep 17 00:00:00 2001 > From: Keith Packard <kei...@keithp.com> > Date: Sat, 28 Dec 2024 23:38:31 -0800 > Subject: [PATCH] m68k: TARGET_CAN_CHANGE_MODE_CLASS is false for FP regs > > Just like i386, you cannot extract float or double values by using > the low bits of m68k float registers. Override this hook and check > that case. > > Signed-off-by: Keith Packard <kei...@keithp.com>
Thanks, this looks like the right fix to me. I'm not sure off-hand whether it's worth continuing to allow subregs between (say) SF and CSF. If so... > --- > gcc/config/m68k/m68k.cc | 22 ++++++++++++++++++++++ > 1 file changed, 22 insertions(+) > > diff --git a/gcc/config/m68k/m68k.cc b/gcc/config/m68k/m68k.cc > index 050ac096e55..54bcfaffa72 100644 > --- a/gcc/config/m68k/m68k.cc > +++ b/gcc/config/m68k/m68k.cc > @@ -200,6 +200,7 @@ static void m68k_asm_final_postscan_insn (FILE *, > rtx_insn *insn, rtx [], int); > static HARD_REG_SET m68k_zero_call_used_regs (HARD_REG_SET); > static machine_mode m68k_c_mode_for_floating_type (enum tree_index); > static bool m68k_use_lra_p (void); > +static bool m68k_can_change_mode_class (machine_mode from, machine_mode to, > reg_class_t rclass); > > /* Initialize the GCC target structure. */ > > @@ -370,6 +371,9 @@ static bool m68k_use_lra_p (void); > #undef TARGET_C_MODE_FOR_FLOATING_TYPE > #define TARGET_C_MODE_FOR_FLOATING_TYPE m68k_c_mode_for_floating_type > > +#undef TARGET_CAN_CHANGE_MODE_CLASS > +#define TARGET_CAN_CHANGE_MODE_CLASS m68k_can_change_mode_class > + > TARGET_GNU_ATTRIBUTES (m68k_attribute_table, > { > /* { name, min_len, max_len, decl_req, type_req, fn_type_req, > @@ -7269,4 +7273,22 @@ m68k_use_lra_p () > return m68k_lra_p; > } > > +/* Implement TARGET_CAN_CHANGE_MODE_CLASS. */ > + > +static bool > +m68k_can_change_mode_class (machine_mode from, > + machine_mode to, > + reg_class_t rclass) > +{ > + if (from == to) > + return true; > + > + /* 68881 registers can't do subreg at all, as all values are reformatted > + to extended precision. */ > + if (reg_classes_intersect_p(rclass, FP_REGS)) > + return false; ...that could be handled by adding: && GET_MODE_INNER (from) != GET_MODE_INNER (to) to the return false condition. Andreas (cc:ed) would know better tham me though. Given the restriction you describe, I was surprised to see that m68k defines MODES_TIEABLE_P as: static bool m68k_modes_tieable_p (machine_mode mode1, machine_mode mode2) { return (!TARGET_HARD_FLOAT || ((GET_MODE_CLASS (mode1) == MODE_FLOAT || GET_MODE_CLASS (mode1) == MODE_COMPLEX_FLOAT) == (GET_MODE_CLASS (mode2) == MODE_FLOAT || GET_MODE_CLASS (mode2) == MODE_COMPLEX_FLOAT))); } Wouldn't it be better (for optimisation) to similarly require the GET_MODE_INNERs to match if either mode1 or mode2 is a scalar or complex float? I suppose that would rarely make a difference though, and in any case it shouldn't hold up the patch. Richard > + > + return true; > +} > + > #include "gt-m68k.h" > -- > 2.45.2 > > > Here's how to reproduce the bug: > > GCC configure flags: > > ../configure \ > --mandir=/usr/share/man \ > --enable-languages=c,c++,lto \ > --enable-multilib \ > --enable-multilib-space \ > --disable-gcov \ > --disable-decimal-float \ > --disable-libffi \ > --disable-libgomp \ > --disable-libmudflap \ > --disable-libquadmath \ > --disable-libssp \ > --disable-libstdcxx-pch \ > --disable-nls \ > --disable-shared \ > --disable-threads \ > --enable-tls \ > --target=m68k-unknown-elf \ > --with-system-zlib \ > --with-gnu-as \ > --with-gnu-ld \ > --without-included-gettext \ > --prefix=/opt/m68k \ > --disable-libstdcxx \ > --with-headers=no \ > --without-newlib \ > AR_FOR_TARGET=m68k-unknown-elf-ar \ > AS_FOR_TARGET=m68k-unknown-elf-as \ > LD_FOR_TARGET=m68k-unknown-elf-ld \ > NM_FOR_TARGET=m68k-unknown-elf-nm \ > OBJDUMP_FOR_TARGET=m68k-unknown-elf-objdump \ > RANLIB_FOR_TARGET=m68k-unknown-elf-ranlib \ > READELF_FOR_TARGET=m68k-unknown-elf-readelf \ > STRIP_FOR_TARGET=m68k-unknown-elf-strip \ > SED=/bin/sed \ > SHELL=/bin/sh \ > BASH=/bin/bash \ > CONFIG_SHELL=/bin/bash > > Compile flags used to build the test: > > m68k-unknown-elf-gcc -O3 -S test.c