On 9/6/23 10:22, Palmer Dabbelt wrote:
> On Wed, 06 Sep 2023 09:07:33 PDT (-0700), christoph.muell...@vrull.eu wrote:
>> From: Christoph Müllner <christoph.muell...@vrull.eu>
>>
>> This patch implements the expansion of the strlen builtin for RV32/RV64
>> for xlen-aligned strings if Zbb or XTheadBb instructions are available.
>> The inserted sequences are:
>>
>> rv32gc_zbb (RV64 is similar):
>>       add     a3,a0,4
>>       li      a4,-1
>> .L1:  lw      a5,0(a0)
>>       add     a0,a0,4
>>       orc.b   a5,a5
>>       beq     a5,a4,.L1
>>       not     a5,a5
>>       ctz     a5,a5
>>       srl     a5,a5,0x3
>>       add     a0,a0,a5
>>       sub     a0,a0,a3
>>
>> rv64gc_xtheadbb (RV32 is similar):
>>       add     a4,a0,8
>> .L2:  ld      a5,0(a0)
>>       add     a0,a0,8
>>       th.tstnbz a5,a5
>>       beqz    a5,.L2
>>       th.rev  a5,a5
>>       th.ff1  a5,a5
>>       srl     a5,a5,0x3
>>       add     a0,a0,a5
>>       sub     a0,a0,a4
>>
>> This allows inlining calls to strlen(), with optimized code for
>> xlen-aligned strings, resulting in the following benefits over
>> a call to libc:
>> * no call/ret instructions
>> * no stack frame allocation
>> * no register saving/restoring
>> * no alignment test
>>
>> The inlining mechanism is gated by a new switch ('-minline-strlen')
>> and by the variable 'optimize_size'.
>
> Maybe this is more of a Jeff question, but this looks to me like
> something that should be target-agnostic -- maybe we need some backend
> work to actually emit the special instruction, but IIRC this is a
> somewhat common flavor of instruction and is in other ISAs as well.
> It looks like there's already a strlen insn, so I guess the core issue
> is why we need that unspec?
>
> Sorry if I'm just missing something, though...

The generic strlen expansion in GCC doesn't really expand a strlen loop. It really just calls into the target code and forces the target to handle everything.
We could have generic strlen expansion code that kicks in if the target expander fails. And we could probably create the necessary opcodes to express the optimized end-of-string comparison instructions that exist on various architectures. I'm not sure it's worth that much effort given targets are already doing their own strlen expansions.
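
For illustration only, here is a rough C model of what the quoted Zbb sequence computes, which is roughly what a generic fallback would have to express. This is a sketch, not GCC code: the names orc_b, ctz64 and strlen_aligned_model are made up for this example, orc_b is a portable stand-in for the Zbb instruction, the word size is fixed at 64 bits, and little-endian byte order is assumed.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Portable stand-in for Zbb's orc.b (hypothetical helper name):
   every nonzero byte becomes 0xff, every zero byte stays 0x00. */
static uint64_t orc_b(uint64_t x)
{
    uint64_t r = 0;
    for (int i = 0; i < 8; i++)
        if ((x >> (8 * i)) & 0xff)
            r |= (uint64_t)0xff << (8 * i);
    return r;
}

/* Count trailing zero bits; the argument must be nonzero. */
static unsigned ctz64(uint64_t x)
{
    unsigned n = 0;
    assert(x != 0);
    while (!(x & 1)) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Model of the inlined loop. Assumptions: little-endian byte order,
   and the buffer is readable in whole 8-byte words up to and including
   the word that holds the terminator (true for xlen-aligned strings on
   real hardware; in portable C the caller must pad the buffer). */
static size_t strlen_aligned_model(const char *s)
{
    const char *p = s;
    uint64_t w;

    for (;;) {
        memcpy(&w, p, sizeof w);   /* the lw/ld in the sequence */
        w = orc_b(w);              /* orc.b */
        if (w != UINT64_MAX)       /* beq a5,a4,.L1 with a4 == -1 */
            break;
        p += sizeof w;
    }

    /* not + ctz + srl 3: byte index of the first zero byte in the word. */
    return (size_t)(p - s) + ctz64(~w) / 8;
}
```

The orc.b step is the part that would need a new generic opcode or an unspec; the rest is ordinary loads, compares and bit counting.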
jeff