On 9/6/23 10:22, Palmer Dabbelt wrote:
On Wed, 06 Sep 2023 09:07:33 PDT (-0700), christoph.muell...@vrull.eu wrote:
From: Christoph Müllner <christoph.muell...@vrull.eu>

This patch implements the expansion of the strlen builtin for RV32/RV64
for xlen-aligned strings if Zbb or XTheadBb instructions are available.
The inserted sequences are:

rv32gc_zbb (RV64 is similar):
      add     a3,a0,4
      li      a4,-1
.L1:  lw      a5,0(a0)
      add     a0,a0,4
      orc.b   a5,a5
      beq     a5,a4,.L1
      not     a5,a5
      ctz     a5,a5
      srl     a5,a5,0x3
      add     a0,a0,a5
      sub     a0,a0,a3
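
For readers less fluent in RISC-V assembly, the Zbb sequence above can be modelled in portable C roughly as follows. This is only a sketch, not code from the patch: `load_le32`, `orc_b32`, `ctz32` and `strlen_zbb_model` are hypothetical helper names, `orc_b32` and `ctz32` are software stand-ins for the `orc.b` and `ctz` instructions, and the buffer is assumed to be readable up to the next 4-byte boundary (which xlen-alignment gives you on real hardware).

```c
#include <stddef.h>
#include <stdint.h>

/* Little-endian 32-bit load, standing in for "lw" (hypothetical helper). */
static uint32_t load_le32(const unsigned char *p)
{
    return (uint32_t)p[0] | (uint32_t)p[1] << 8 |
           (uint32_t)p[2] << 16 | (uint32_t)p[3] << 24;
}

/* Model of orc.b: every nonzero byte becomes 0xff, every zero byte 0x00. */
static uint32_t orc_b32(uint32_t x)
{
    uint32_t r = 0;
    for (int i = 0; i < 4; i++)
        if ((x >> (8 * i)) & 0xffu)
            r |= (uint32_t)0xffu << (8 * i);
    return r;
}

/* Model of ctz: count trailing zero bits; x must be nonzero. */
static unsigned ctz32(uint32_t x)
{
    unsigned n = 0;
    while ((x & 1u) == 0) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Mirrors the rv32gc_zbb sequence: loop while the word has no NUL byte,
   then locate the first NUL byte inside the last word that was loaded. */
size_t strlen_zbb_model(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    uint32_t w;
    do {
        w = orc_b32(load_le32(p));            /* lw + orc.b */
        p += 4;                               /* add a0,a0,4 */
    } while (w == UINT32_MAX);                /* beq a5,a4,.L1 */
    unsigned idx = ctz32((uint32_t)~w) >> 3;  /* not, ctz, srl 3 */
    return (size_t)(p - (const unsigned char *)s) - 4 + idx;
}
```

Calling `strlen_zbb_model` on a zero-padded buffer such as `char buf[16] = "hello";` walks two words and returns the same length as `strlen`.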

rv64gc_xtheadbb (RV32 is similar):
      add       a4,a0,8
.L2:  ld        a5,0(a0)
      add       a0,a0,8
      th.tstnbz a5,a5
      beqz      a5,.L2
      th.rev    a5,a5
      th.ff1    a5,a5
      srl       a5,a5,0x3
      add       a0,a0,a5
      sub       a0,a0,a4
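
The XTheadBb variant can be modelled the same way. The sketch below rests on my reading of the instruction semantics, stated here as assumptions: `th.tstnbz` sets a byte to 0xff where the source byte is zero (and to 0x00 otherwise), `th.rev` reverses the byte order, and `th.ff1` counts leading zero bits up to the first set bit. All function names are hypothetical, and the buffer is assumed to be readable up to the next 8-byte boundary.

```c
#include <stddef.h>
#include <stdint.h>

/* Little-endian 64-bit load, standing in for "ld" (hypothetical helper). */
static uint64_t load_le64(const unsigned char *p)
{
    uint64_t w = 0;
    for (int i = 0; i < 8; i++)
        w |= (uint64_t)p[i] << (8 * i);
    return w;
}

/* Assumed th.tstnbz semantics: 0xff for each zero byte, 0x00 otherwise. */
static uint64_t tstnbz64(uint64_t x)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++)
        if (((x >> (8 * i)) & 0xffu) == 0)
            r |= (uint64_t)0xffu << (8 * i);
    return r;
}

/* Assumed th.rev semantics: reverse the byte order of the word. */
static uint64_t rev64(uint64_t x)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++)
        r = (r << 8) | ((x >> (8 * i)) & 0xffu);
    return r;
}

/* Assumed th.ff1 semantics: count leading zero bits; x must be nonzero. */
static unsigned ff1_64(uint64_t x)
{
    unsigned n = 0;
    while ((x >> 63) == 0) {
        x <<= 1;
        n++;
    }
    return n;
}

/* Mirrors the rv64gc_xtheadbb sequence. */
size_t strlen_xtheadbb_model(const char *s)
{
    const unsigned char *p = (const unsigned char *)s;
    uint64_t m;
    do {
        m = tstnbz64(load_le64(p));           /* ld + th.tstnbz */
        p += 8;                               /* add a0,a0,8 */
    } while (m == 0);                         /* beqz a5,.L2 */
    unsigned idx = ff1_64(rev64(m)) >> 3;     /* th.rev, th.ff1, srl 3 */
    return (size_t)(p - (const unsigned char *)s) - 8 + idx;
}
```

The byte reversal followed by a leading-zero count plays the role that `ctz` plays in the Zbb sequence: both find the lowest-addressed NUL byte in a little-endian word.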

This allows inlining calls to strlen(), with optimized code for
xlen-aligned strings, resulting in the following benefits over
a call to libc:
* no call/ret instructions
* no stack frame allocation
* no register saving/restoring
* no alignment test

The inlining mechanism is gated by a new switch ('-minline-strlen')
and by the variable 'optimize_size'.

Maybe this is more of a Jeff question, but this looks to me like something that should be target-agnostic -- maybe we need some backend work to actually emit the special instruction, but IIRC this is a somewhat common flavor of instruction and is in other ISAs as well.  It looks like there's already a strlen insn, so I guess the core issue is why we need that unspec?

Sorry if I'm just missing something, though...

The generic strlen expansion in GCC doesn't actually emit a strlen loop. It just calls into the target code and leaves the target to handle everything.


We could have generic strlen expansion code that kicks in if the target expander fails. And we could probably create the necessary opcodes to express the optimized end-of-string comparison instructions that exist on various architectures. I'm not sure it's worth that much effort given targets are already doing their own strlen expansions.
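
To make concrete what a target-agnostic fallback might compute, here is a C sketch of the classic word-at-a-time zero-byte test that needs no special instruction. It is purely illustrative and not taken from GCC: `strlen_wordwise` is a hypothetical name, and the buffer is assumed to be readable up to the next 8-byte boundary past the terminator.

```c
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Word-at-a-time strlen using the well-known "has zero byte" bit trick:
   (w - 0x01..01) & ~w & 0x80..80 is nonzero iff some byte of w is zero. */
size_t strlen_wordwise(const char *s)
{
    const uint64_t ones  = 0x0101010101010101ull;
    const uint64_t highs = 0x8080808080808080ull;
    const char *p = s;

    for (;;) {
        uint64_t w;
        memcpy(&w, p, sizeof w);        /* load one 8-byte word */
        if ((w - ones) & ~w & highs)    /* a NUL byte is somewhere in w */
            break;
        p += sizeof w;
    }

    /* Finish byte by byte, which keeps the sketch endian-independent. */
    while (*p)
        p++;
    return (size_t)(p - s);
}
```

A dedicated instruction such as `orc.b` or `th.tstnbz` replaces the three-operation test in the loop and also makes the final position computation cheaper, which is the part a generic expander would need new opcodes to express.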

jeff
