Keith Packard <kei...@keithp.com> writes:
>> insn_propagation would previously only replace (reg:M H) with X
>> for some hard register H if the uses of H were also in mode M.
>> This patch extends it to handle simple mode punning too.
>
> (this is from GCC commit 9d20529d94b23275885f380d155fe8671ab5353a)
>
> Looks like this breaks m68k floating point conversion handling. Recall
> that m68k has only 80-bit float hardware with regular numbered registers
> (unlike 8087). Operations are done in 80-bit format, the FPU provides
> fmove.d and fmove.s operations to convert between 80-bit and
> 32-bit/64-bit values in memory.
>
>         float naf(float, float);
>         int prf(char *, ...);
>         float sqf(float);
>         int func(void);
>
>         int func(void)
>         {
>             float x = naf(4.0f, 5.0f);
>             prf("testing sqrtf for value %a\n", (double) x);
>             float y = sqf(x);
>             prf("sqrtf is %a\n", (double) y);
>             return 0;
>         }
>
> In this example, the return value from naf is passed as a double to prf
> and also passed as a float to sqf. The value is saved on the stack
> across the prf call.
>
> In the old code, it was restored to %fp0 and then pushed as a float on
> the stack to sqf:
>
>       lea prf,%a2
>       fmove.x %fp0,24(%sp)
>       jsr (%a2)
>       fmove.x 24(%sp),%fp0
>       fmove.s %fp0,-(%sp)
>       jsr sqf
>
> With this patch, the low 32 bits of the value (which is garbage) gets
> pushed on the stack instead:
>
>       lea prf,%a2
>       fmove.x %fp0,24(%sp)
>       jsr (%a2)
>       move.l 32(%sp),-(%sp)
>       jsr sqf
>
> Here's the broken code:
>
>       rtx newval = to;
>       if (GET_MODE (x) != GET_MODE (from))
>       {
>         gcc_assert (REG_P (x) && HARD_REGISTER_P (x));
>         if (REG_NREGS (x) != REG_NREGS (from)
>             || !REG_CAN_CHANGE_MODE_P (REGNO (x), GET_MODE (from),
>                                        GET_MODE (x)))
>           return false;
>         newval = simplify_subreg (GET_MODE (x), to, GET_MODE (from),
>                                   subreg_lowpart_offset (GET_MODE (x),
>                                                          GET_MODE (from)));
>         if (!newval)
>           return false;
>       }
>
> The trouble is that REG_CAN_CHANGE_MODE_P(%fp0, E_XFmode, E_SFmode)
> returns true, and then simplify_subreg goes and extracts the low 32-bits
> of the 80-bit value directly, skipping the conversion. This is the same
> as i386, so let's adapt the i386 version of can_change_mode_class. See
> attached patch, which seems to fix the bug for me.
>
> From 862467c578545a4a7ed83767df2a181e7d845bc6 Mon Sep 17 00:00:00 2001
> From: Keith Packard <kei...@keithp.com>
> Date: Sat, 28 Dec 2024 23:38:31 -0800
> Subject: [PATCH] m68k: TARGET_CAN_CHANGE_MODE_CLASS is false for FP regs
>
> Just like i386, you cannot extract float or double values by using
> the low bits of m68k float registers. Override this hook and check
> that case.
>
> Signed-off-by: Keith Packard <kei...@keithp.com>

Thanks, this looks like the right fix to me.  I'm not sure off-hand
whether it's worth continuing to allow subregs between (say) SF and CSF.
If so...

> ---
>  gcc/config/m68k/m68k.cc | 22 ++++++++++++++++++++++
>  1 file changed, 22 insertions(+)
>
> diff --git a/gcc/config/m68k/m68k.cc b/gcc/config/m68k/m68k.cc
> index 050ac096e55..54bcfaffa72 100644
> --- a/gcc/config/m68k/m68k.cc
> +++ b/gcc/config/m68k/m68k.cc
> @@ -200,6 +200,7 @@ static void m68k_asm_final_postscan_insn (FILE *, 
> rtx_insn *insn, rtx [], int);
>  static HARD_REG_SET m68k_zero_call_used_regs (HARD_REG_SET);
>  static machine_mode m68k_c_mode_for_floating_type (enum tree_index);
>  static bool m68k_use_lra_p (void);
> +static bool m68k_can_change_mode_class (machine_mode from, machine_mode to, 
> reg_class_t rclass);
>  
>  /* Initialize the GCC target structure.  */
>  
> @@ -370,6 +371,9 @@ static bool m68k_use_lra_p (void);
>  #undef TARGET_C_MODE_FOR_FLOATING_TYPE
>  #define TARGET_C_MODE_FOR_FLOATING_TYPE m68k_c_mode_for_floating_type
>  
> +#undef  TARGET_CAN_CHANGE_MODE_CLASS
> +#define TARGET_CAN_CHANGE_MODE_CLASS m68k_can_change_mode_class
> +
>  TARGET_GNU_ATTRIBUTES (m68k_attribute_table,
>  {
>    /* { name, min_len, max_len, decl_req, type_req, fn_type_req,
> @@ -7269,4 +7273,22 @@ m68k_use_lra_p ()
>    return m68k_lra_p;
>  }
>  
> +/* Implement TARGET_CAN_CHANGE_MODE_CLASS.  */
> +
> +static bool
> +m68k_can_change_mode_class (machine_mode from,
> +                         machine_mode to,
> +                         reg_class_t rclass)
> +{
> +  if (from == to)
> +    return true;
> +
> +  /* 68881 registers can't do subreg at all, as all values are reformatted
> +     to extended precision.  */
> +  if (reg_classes_intersect_p(rclass, FP_REGS))
> +    return false;

...that could be handled by adding:

  && GET_MODE_INNER (from) != GET_MODE_INNER (to)

to the return false condition.

Andreas (cc:ed) would know better tham me though.

Given the restriction you describe, I was surprised to see that m68k
defines MODES_TIEABLE_P as:

static bool
m68k_modes_tieable_p (machine_mode mode1, machine_mode mode2)
{
  return (!TARGET_HARD_FLOAT
          || ((GET_MODE_CLASS (mode1) == MODE_FLOAT
               || GET_MODE_CLASS (mode1) == MODE_COMPLEX_FLOAT)
              == (GET_MODE_CLASS (mode2) == MODE_FLOAT
                  || GET_MODE_CLASS (mode2) == MODE_COMPLEX_FLOAT)));
}

Wouldn't it be better (for optimisation) to similarly require the
GET_MODE_INNERs to match if either mode1 or mode2 is a scalar or complex
float?  I suppose that would rarely make a difference though, and in any
case it shouldn't hold up the patch.

Richard


> +
> +  return true;
> +}
> +
>  #include "gt-m68k.h"
> -- 
> 2.45.2
>
>
> Here's how to reproduce the bug:
>
> GCC configure flags:
>
>         ../configure \
>           --mandir=/usr/share/man \
>           --enable-languages=c,c++,lto \
>           --enable-multilib \
>           --enable-multilib-space \
>           --disable-gcov \
>           --disable-decimal-float \
>           --disable-libffi \
>           --disable-libgomp \
>           --disable-libmudflap \
>           --disable-libquadmath \
>           --disable-libssp \
>           --disable-libstdcxx-pch \
>           --disable-nls \
>           --disable-shared \
>           --disable-threads \
>           --enable-tls \
>           --target=m68k-unknown-elf \
>           --with-system-zlib \
>           --with-gnu-as \
>           --with-gnu-ld \
>           --without-included-gettext \
>           --prefix=/opt/m68k \
>           --disable-libstdcxx \
>           --with-headers=no \
>           --without-newlib \
>           AR_FOR_TARGET=m68k-unknown-elf-ar \
>           AS_FOR_TARGET=m68k-unknown-elf-as \
>           LD_FOR_TARGET=m68k-unknown-elf-ld \
>           NM_FOR_TARGET=m68k-unknown-elf-nm \
>           OBJDUMP_FOR_TARGET=m68k-unknown-elf-objdump \
>           RANLIB_FOR_TARGET=m68k-unknown-elf-ranlib \
>           READELF_FOR_TARGET=m68k-unknown-elf-readelf \
>           STRIP_FOR_TARGET=m68k-unknown-elf-strip \
>           SED=/bin/sed \
>           SHELL=/bin/sh \
>           BASH=/bin/bash \
>           CONFIG_SHELL=/bin/bash
>
> Compile flags used to build the test:
>
>         m68k-unknown-elf-gcc -O3 -S test.c

Reply via email to