> OK.  And I think this shows the basic approach we want to use if there 
> are other builtins that accept sub-word modes.  ie, get the operands 
> into X mode (by extending them as appropriate), then do as much work in 
> X mode as possible, then truncate the result if needed.

> Thanks for your patience on this.

Thanks Jeff for comments and suggestions, I will have a try if we can do some 
combine-like optimization for
the SImode asm in RV64.

Pan

-----Original Message-----
From: Jeff Law <jeffreya...@gmail.com> 
Sent: Sunday, August 18, 2024 2:17 AM
To: Li, Pan2 <pan2...@intel.com>; gcc-patches@gcc.gnu.org
Cc: juzhe.zh...@rivai.ai; kito.ch...@gmail.com; rdapp....@gmail.com
Subject: Re: [PATCH v4] RISC-V: Make sure high bits of usadd operands is clean 
for non-Xmode [PR116278]



On 8/16/24 9:43 PM, pan2...@intel.com wrote:
> From: Pan Li <pan2...@intel.com>
> 
> For QI/HImode of .SAT_ADD,  the operands may be sign-extended and the
> high bits of Xmode may be all 1 which is not expected.  For example as
> below code.
> 
> signed char b[1];
> unsigned short c;
> signed char *d = b;
> int main() {
>    b[0] = -40;
>    c = ({ (unsigned short)d[0] < 0xFFF6 ? (unsigned short)d[0] : 0xFFF6; }) + 
> 9;
>    __builtin_printf("%d\n", c);
> }
> 
> After expanding we have:
> 
> ;; _6 = .SAT_ADD (_3, 9);
> (insn 8 7 9 (set (reg:DI 143)
>          (high:DI (symbol_ref:DI ("d") [flags 0x86]  <var_decl d>)))
>       (nil))
> (insn 9 8 10 (set (reg/f:DI 142)
>          (mem/f/c:DI (lo_sum:DI (reg:DI 143)
>                  (symbol_ref:DI ("d") [flags 0x86]  <var_decl d>)) [1 d+0 S8 
> A64]))
>       (nil))
> (insn 10 9 11 (set (reg:HI 144 [ _3 ])
>          (sign_extend:HI (mem:QI (reg/f:DI 142) [0 *d.0_1+0 S1 A8]))) 
> "test.c":7:10 -1
>       (nil))
> 
> The convert from signed char to unsigned short will have sign_extend rtl
> as above.  And finally become the lb insn as below:
> 
> lb      a1,0(a5)   // a1 is -40, aka 0xffffffffffffffd8
> lui     a0,0x1a
> addi    a5,a1,9
> slli    a5,a5,0x30
> srli    a5,a5,0x30 // a5 is 65505
> sltu    a1,a5,a1   // compare 65505 and 0xffffffffffffffd8 => TRUE
> 
> The sltu try to compare 65505 and 0xffffffffffffffd8 here,  but we
> actually want to compare 65505 and 65496 (0xffd8).  Thus we need to
> clean up the high bits to ensure this.
> 
> The below test suites are passed for this patch:
> * The rv64gcv fully regression test.
> 
>       PR target/116278
> 
> gcc/ChangeLog:
> 
>       * config/riscv/riscv.cc (riscv_gen_zero_extend_rtx): Add new
>       func impl to zero extend rtx.
>       (riscv_expand_usadd): Leverage above func to cleanup operands
>       and sum.
> 
> gcc/testsuite/ChangeLog:
> 
>       * gcc.target/riscv/pr116278-run-1.c: New test.
>       * gcc.target/riscv/pr116278-run-2.c: New test.
> 
>       PR 116278
> 
> gcc/ChangeLog:
> 
>       * config/riscv/riscv.cc (riscv_gen_zero_extend_rtx): Add new
>       func impl to zero extend rtx.
>       (riscv_expand_usadd): Leverage above func to cleanup operands 0
>       and remove the special handing for SImode in RV64.
> 
> gcc/testsuite/ChangeLog:
> 
>       * gcc.target/riscv/sat_u_add-11.c: Adjust asm check body.
>       * gcc.target/riscv/sat_u_add-15.c: Ditto.
>       * gcc.target/riscv/sat_u_add-19.c: Ditto.
>       * gcc.target/riscv/sat_u_add-23.c: Ditto.
>       * gcc.target/riscv/sat_u_add-3.c: Ditto.
>       * gcc.target/riscv/sat_u_add-7.c: Ditto.
>       * gcc.target/riscv/sat_u_add_imm-11.c: Ditto.
>       * gcc.target/riscv/sat_u_add_imm-15.c: Ditto.
>       * gcc.target/riscv/sat_u_add_imm-3.c: Ditto.
>       * gcc.target/riscv/sat_u_add_imm-7.c: Ditto.
>       * gcc.target/riscv/pr116278-run-1.c: New test.
>       * gcc.target/riscv/pr116278-run-2.c: New test.
OK.  And I think this shows the basic approach we want to use if there 
are other builtins that accept sub-word modes.  ie, get the operands 
into X mode (by extending them as appropriate), then do as much work in 
X mode as possible, then truncate the result if needed.

Thanks for your patience on this.

Jeff

Reply via email to