So another minor improvement for bitmanip code generation.
Essentially we have a pattern which matches a bset idiom for x = zero_extend (1 << n). That pattern only handles SI->DI extension. For the QI/HI case the 1<<n is first in a narrowing subreg to QI/HI. ie (zero_extend:DI (subreg:QI (ashift (...))))The same principles apply to this case as it can be implemented with bset target,x0,bitpos and by using x0 we'll get the desired zero extension.
I think this testcase is ultimately derived from 500.perlbench. Our code for this testcase still isn't great, but this is an easy improvement and makes one of the remaining inefficiencies more obvious:
bset a5,x0,a5 # 24 [c=8 l=4] *bsetdi_3 andn a3,a0,a5 # 52 [c=4 l=4] and_notdi3 beq a4,zero,.L3 # 41 [c=12 l=4] *branchdi or a3,a0,a5 # 44 [c=4 l=4] *iordi3/0
The bset is what this patch generates instead of a li+sll sequence. In the form above its easier see that the andn can be replaced with a bclr and the or can be replaced with a bset which in turn would allow the bset above to go away completely.
This has been tested in my tester for rv32 and rv64. I'll wait for pre-commit testing to complete before moving forward.
jeff
gcc/ * config/riscv/bitmanip.md (bset<X:mode>_3): New pattern. testsuite/ * gcc.target/riscv/zbs-bset-2.c: New test. diff --git a/gcc/config/riscv/bitmanip.md b/gcc/config/riscv/bitmanip.md index f403ba8dbba..b5e052fe4f2 100644 --- a/gcc/config/riscv/bitmanip.md +++ b/gcc/config/riscv/bitmanip.md @@ -611,6 +611,18 @@ (define_insn "*bsetdi_2" "bset\t%0,x0,%1" [(set_attr "type" "bitmanip")]) +;; Similar, but we have a narrowing SUBREG. We're still using x0 as +;; a source, so the result is still zero extended. +(define_insn "*bset<X:mode>_3" + [(set (match_operand:X 0 "register_operand" "=r") + (zero_extend:X + (subreg:SHORT + (ashift:X (const_int 1) + (match_operand:QI 1 "register_operand" "r")) 0)))] + "TARGET_ZBS" + "bset\t%0,x0,%1" + [(set_attr "type" "bitmanip")]) + ;; These two splitters take advantage of the limited range of the ;; shift constant. With the limited range we know the SImode sign ;; bit is never set, thus we can treat this as zero extending and diff --git a/gcc/testsuite/gcc.target/riscv/zbs-bset-2.c b/gcc/testsuite/gcc.target/riscv/zbs-bset-2.c new file mode 100644 index 00000000000..8555e3784ff --- /dev/null +++ b/gcc/testsuite/gcc.target/riscv/zbs-bset-2.c @@ -0,0 +1,31 @@ +/* { dg-do compile } */ +/* { dg-options "-O2 -march=rv64gc_zba_zbb_zbs -mabi=lp64d" { target { riscv64*-*-* } } } */ +/* { dg-options "-O2 -march=rv32gc_zba_zbb_zbs -mabi=ilp32" { target { riscv32*-*-* } } } */ + +typedef struct SHA { + unsigned char block[128]; + unsigned int blockcnt; + +} SHA; + +#define BITSET(s, pos) s[(pos) >> 3] & (char) (0x01 << (7 - (pos) % 8)) +#define SETBIT(s, pos) s[(pos) >> 3] |= (char) (0x01 << (7 - (pos) % 8)) +#define CLRBIT(s, pos) s[(pos) >> 3] &= (char) ~(0x01 << (7 - (pos) % 8)) + +#define ULNG unsigned long + +void shabits(char *bitstr, long bitcnt, SHA *s, ULNG i) +{ + if (BITSET(bitstr, i)) + SETBIT(s->block, s->blockcnt); + else + CLRBIT(s->block, s->blockcnt); +} + +/* { dg-final { scan-assembler-times "bset\t\[a-x\]\[0-9\]+.x0" 2 { target { riscv64*-*-* } } } } */ + +/* rv32 doesn't have the first bset for some reason, probably a missed + generalization of one of the other bitmanip combiner patterns. The + bset we do generate corresponds to the one the related patch generates. */ +/* { dg-final { scan-assembler-times "bset\t\[a-x\]\[0-9\]+.x0" 1 { target { riscv32*-*-* } } } } */ +