Thanks Jeff for the confirmation and suggestions. It looks like this is not a corner case for the option -fno-signed-zeros. Consider the 2 sample functions below, built with the options "-march=rv64gcv -mabi=lp64d -O2 -fno-signed-zeros".

void __attribute__ ((noinline))
test_float_zero_assign_0 (float *a)
{
  *a = +0.0;
}

void __attribute__ ((noinline))
test_float_zero_assign_1 (float *a)
{
  *a = -0.0;
}

For the first one (aka float 0.0) we have the rtl below:

(insn 6 3 0 2 (set (mem:SF (reg/v/f:DI 134 [ a ]) [1 *a_2(D)+0 S4 A32])
        (const_double:SF 0.0 [0x0.0p+0])) "test.c":14:6 -1
     (nil))

But for the second one (aka float -0.0 with -fno-signed-zeros) we have the rtl below, whereas we expect const_double -0.0 here.

(insn 6 3 7 2 (set (reg:DI 135)
        (high:DI (symbol_ref/u:DI ("*.LC0") [flags 0x82]))) "test.c":21:6 -1
     (nil))
(insn 7 6 8 2 (set (reg:SF 136)
        (mem/u/c:SF (lo_sum:DI (reg:DI 135)
                (symbol_ref/u:DI ("*.LC0") [flags 0x82])) [0 S4 A32])) "test.c":21:6 -1
     (nil))

I will try to fix it in V3.

Pan

-----Original Message-----
From: Jeff Law <jeffreya...@gmail.com>
Sent: Saturday, December 30, 2023 11:14 AM
To: Li, Pan2 <pan2...@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; Wang, Yanzhang <yanzhang.w...@intel.com>; kito.ch...@gmail.com; richard.guent...@gmail.com
Subject: Re: [PATCH v2] RISC-V: XFAIL pr30957-1.c when loop vectorized with variable factor

On 12/28/23 22:56, Li, Pan2 wrote:
> Thanks Jeff.
>
> I think I have located where aarch64 performs the trick here.
>
> 1. In the .final dump we have rtl like
>
> (insn:TI 6 8 29 (set (reg:SF 32 v0)
>         (const_double:SF -0.0 [-0x0.0p+0]))
>      "/home/box/panli/gnu-toolchain/gcc/gcc/testsuite/gcc.dg/pr30957-1.c":31:7 79 {*movsf_aarch64}
>      (nil))
>
> 2. The movsf_aarch64 pattern comes from the aarch64.md file, similar to the rtl below.
>    Aka, it will generate movi\t%0.2s, #0 if aarch64_reg_or_fp_zero is true.
>
> (define_insn "*mov<mode>_aarch64"
>   [(set (match_operand:SFD 0 "nonimmediate_operand")
>         (match_operand:SFD 1 "general_operand"))]
>   "TARGET_FLOAT && (register_operand (operands[0], <MODE>mode)
>     || aarch64_reg_or_fp_zero (operands[1], <MODE>mode))"
>   {@ [ cons: =0 , 1 ; attrs: type , arch ]
>      [ w , Y ; neon_move , simd ] movi\t%0.2s, #0
>
> 3. Then we reach aarch64_float_const_zero_rtx_p, and for the -0.0 input
>    rtl it returns true at line 10873 because -fno-signed-zeros is given.
>
> 10863 bool
> 10864 aarch64_float_const_zero_rtx_p (rtx x)
> 10865 {
> 10866   /* 0.0 in Decimal Floating Point cannot be represented by #0 or
> 10867      zr as our callers expect, so no need to check the actual
> 10868      value if X is of Decimal Floating Point type.  */
> 10869   if (GET_MODE_CLASS (GET_MODE (x)) == MODE_DECIMAL_FLOAT)
> 10870     return false;
> 10871
> 10872   if (REAL_VALUE_MINUS_ZERO (*CONST_DOUBLE_REAL_VALUE (x)))
> 10873     return !HONOR_SIGNED_ZEROS (GET_MODE (x));
> 10874   return real_equal (CONST_DOUBLE_REAL_VALUE (x), &dconst0);
> 10875 }
>
> I think that explains why we have +0.0 in aarch64 here.

Yup. Thanks a ton for diving into this. So I think that points us to the right fix: specifically, we should be turning -0.0 into 0.0 when !HONOR_SIGNED_ZEROS rather than xfailing the test.

I think we'd need to adjust reg_or_0_operand and riscv_output_move, and probably the G constraint as well. We might also need to adjust move_operand and perhaps riscv_legitimize_move.

jeff
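
For illustration only, the decision made by the quoted aarch64 predicate can be sketched as standalone C, outside of GCC. This is not the RISC-V patch and not GCC code: the function name fp_const_is_zero_for_move and the honor_signed_zeros parameter are hypothetical stand-ins for a target zero predicate and for HONOR_SIGNED_ZEROS, and a plain double stands in for the CONST_DOUBLE rtx.

```c
#include <math.h>
#include <stdbool.h>

/* Hypothetical stand-in for a target "is this FP constant zero?" check.
   honor_signed_zeros plays the role of HONOR_SIGNED_ZEROS (mode); it is
   false when something like -fno-signed-zeros is in effect.  */
static bool
fp_const_is_zero_for_move (double x, bool honor_signed_zeros)
{
  /* Minus zero may only be treated as plain zero when the sign of
     zero is allowed to be dropped.  */
  if (x == 0.0 && signbit (x))
    return !honor_signed_zeros;

  /* Otherwise only +0.0 qualifies.  */
  return x == 0.0;
}
```

Under these assumptions, -0.0 is accepted as zero only when honor_signed_zeros is false, which mirrors the behaviour described for line 10873 in the quoted aarch64 code and is the behaviour suggested above for the RISC-V operand checks.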