RE: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor

Li, Pan2 Mon, 01 Jan 2024 00:57:05 -0800

Thanks Jeff for the confirmation and suggestions. It looks like this is not a 
corner case for the option -fno-signed-zeros.
Consider the 2 sample functions below, built with the options "-march=rv64gcv 
-mabi=lp64d -O2 -fno-signed-zeros".


void
__attribute__ ((noinline))
test_float_zero_assign_0 (float *a)
{
  *a = +0.0;
}

void
__attribute__ ((noinline))
test_float_zero_assign_1 (float *a)
{
  *a = -0.0;
}

For the first one (aka float 0.0) we have rtl as below:
(insn 6 3 0 2 (set (mem:SF (reg/v/f:DI 134 [ a ]) [1 *a_2(D)+0 S4 A32])
        (const_double:SF 0.0 [0x0.0p+0])) "test.c":14:6 -1
     (nil))

But for the second one (aka float -0.0 with -fno-signed-zeros) we get the rtl 
below, which loads the constant from memory (.LC0), but we expect 
const_double -0.0 here.
(insn 6 3 7 2 (set (reg:DI 135
        (high:DI (symbol_ref/u:DI ("*.LC0") [flags 0x82]))) "test.c":21:6 -1
     (nil))
(insn 7 6 8 2 (set (reg:SF 136)
        (mem/u/c:SF (lo_sum:DI (reg:DI 135)
                (symbol_ref/u:DI ("*.LC0") [flags 0x82])) [0  S4 A32])) "test.c":21:6 -1
     (nil))
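
As background for the expectation above, here is a minimal C sketch of how the two zeros differ. It assumes float is IEEE 754 binary32 and that the file is built without -fno-signed-zeros; the helper names float_bits, is_any_zero and is_minus_zero are made up for illustration and are not GCC functions.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>
#include <stdint.h>
#include <string.h>

/* Return the raw bit pattern of F (assumes IEEE 754 binary32 float).  */
uint32_t
float_bits (float f)
{
  uint32_t bits;
  memcpy (&bits, &f, sizeof bits);
  return bits;
}

/* True for both +0.0 and -0.0: the two zeros compare equal.  */
bool
is_any_zero (float f)
{
  return f == 0.0f;
}

/* True only for -0.0: a zero whose sign bit is set.  */
bool
is_minus_zero (float f)
{
  return f == 0.0f && signbit (f);
}
```

The two zeros compare equal and differ only in the sign bit, which is the distinction -fno-signed-zeros allows the compiler to ignore.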

I will try to fix it in v3.

Pan

-----Original Message-----
From: Jeff Law <jeffreya...@gmail.com> 
Sent: Saturday, December 30, 2023 11:14 AM
To: Li, Pan2 <pan2...@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang <yanzhang.w...@intel.com>; 
kito.ch...@gmail.com; richard.guent...@gmail.com
Subject: Re: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with 
variable factor



On 12/28/23 22:56, Li, Pan2 wrote:
> Thanks Jeff.
> 
> I think I have located where aarch64 performs the trick here.
> 
> 1. In the .final we have rtl like
> 
> (insn:TI 6 8 29 (set (reg:SF 32 v0)
>          (const_double:SF -0.0 [-0x0.0p+0])) "/home/box/panli/gnu-toolchain/gcc/gcc/testsuite/gcc.dg/pr30957-1.c":31:7 79 {*movsf_aarch64}
>       (nil))
> 
> 2. The movsf_aarch64 pattern comes from the aarch64.md file and is similar 
> to the rtl below. That is, it will generate movi\t%0.2s, #0 if 
> aarch64_reg_or_fp_zero is true.
> 
> (define_insn "*mov<mode>_aarch64"
>   [(set (match_operand:SFD 0 "nonimmediate_operand")
>       (match_operand:SFD 1 "general_operand"))]
>   "TARGET_FLOAT && (register_operand (operands[0], <MODE>mode)
>     || aarch64_reg_or_fp_zero (operands[1], <MODE>mode))"
>   {@ [ cons: =0 , 1   ; attrs: type , arch  ]
>      [ w        , Y   ; neon_move   , simd  ] movi\t%0.2s, #0
> 
> 3. Then we reach aarch64_float_const_zero_rtx_p, where the -0.0 input rtl 
> returns true at line 10873 because -fno-signed-zeros is given.
> 
> 10863 bool
> 10864 aarch64_float_const_zero_rtx_p (rtx x)
> 10865 {
> 10866   /* 0.0 in Decimal Floating Point cannot be represented by #0 or
> 10867      zr as our callers expect, so no need to check the actual
> 10868      value if X is of Decimal Floating Point type.  */
> 10869   if (GET_MODE_CLASS (GET_MODE (x)) == MODE_DECIMAL_FLOAT)
> 10870     return false;
> 10871
> 10872   if (REAL_VALUE_MINUS_ZERO (*CONST_DOUBLE_REAL_VALUE (x)))
> 10873     return !HONOR_SIGNED_ZEROS (GET_MODE (x));
> 10874   return real_equal (CONST_DOUBLE_REAL_VALUE (x), &dconst0);
> 10875 }
> 
> I think that explains why we have +0.0 in aarch64 here.
Yup.  Thanks a ton for diving into this.  So I think that points us to 
the right fix, specifically we should be turning -0.0 into 0.0 when 
!HONOR_SIGNED_ZEROS rather than xfailing the test.

I think we'd need to adjust reg_or_0_operand and riscv_output_move, 
probably the G constraint as well.   We might also need to adjust 
move_operand and perhaps riscv_legitimize_move.
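
The rule being proposed can be modelled in plain C, outside of GCC. This is only a rough sketch of the logic, not the actual RISC-V change: fp_const_zero_p is a hypothetical name, and its honor_signed_zeros parameter is a stand-in for what HONOR_SIGNED_ZEROS (mode) reports inside the compiler.

```c
#include <assert.h>
#include <math.h>
#include <stdbool.h>

/* Hypothetical model of a "floating-point constant zero" check.
   HONOR_SIGNED_ZEROS stands in for GCC's HONOR_SIGNED_ZEROS (mode),
   which is false when -fno-signed-zeros is in effect.  */
bool
fp_const_zero_p (double x, bool honor_signed_zeros)
{
  /* Anything that does not compare equal to zero (including NaN)
     is not a zero constant.  */
  if (x != 0.0)
    return false;

  /* -0.0 may be treated as zero only if its sign need not be kept.  */
  if (signbit (x))
    return !honor_signed_zeros;

  /* +0.0 is always a zero constant.  */
  return true;
}
```

With honor_signed_zeros false, -0.0 is accepted as zero, mirroring the aarch64_float_const_zero_rtx_p logic quoted above; with it true, only +0.0 qualifies.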

jeff
