https://gcc.gnu.org/bugzilla/show_bug.cgi?id=103798

            Bug ID: 103798
           Summary: Missed optimization: char_traits<char>::find (and thus
                    string_view::find_first_of) is slow when invoked with
                    short strings
           Product: gcc
           Version: 11.2.1
            Status: UNCONFIRMED
          Keywords: missed-optimization
          Severity: normal
          Priority: P3
         Component: libstdc++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: zhaoweiliew at gmail dot com
  Target Milestone: ---

Given the following code:

#include <string_view>

size_t findFirstE_slow(std::string_view sv) {
  return sv.find_first_of("eE");
}

size_t findFirstE_fast(std::string_view sv) {
  auto it{sv.begin()};
  for (; it != sv.end() && *it != 'e' && *it != 'E'; ++it)
    ;
  return it == sv.end() ? std::string_view::npos : size_t(it - sv.begin());
}


x86-64 gcc (trunk) -std=c++20 -O3 generates the following assembly:

.LC0:
        .string "eE"
findFirstE_slow(std::basic_string_view<char, std::char_traits<char> >):
        push    r12
        push    rbp
        push    rbx
        test    rdi, rdi
        je      .L4
        mov     rbx, rdi
        mov     rbp, rsi
        xor     r12d, r12d
        jmp     .L3
.L8:
        add     r12, 1
        cmp     rbx, r12
        je      .L4
.L3:
        movsx   esi, BYTE PTR [rbp+0+r12]
        mov     edx, 2
        mov     edi, OFFSET FLAT:.LC0
        call    memchr
        test    rax, rax
        je      .L8
        mov     rax, r12
        pop     rbx
        pop     rbp
        pop     r12
        ret
.L4:
        mov     r12, -1
        pop     rbx
        pop     rbp
        mov     rax, r12
        pop     r12
        ret

findFirstE_fast(std::basic_string_view<char, std::char_traits<char> >):
        add     rdi, rsi
        cmp     rdi, rsi
        je      .L13
        mov     rax, rsi
        jmp     .L12
.L15:
        add     rax, 1
        cmp     rdi, rax
        je      .L13
.L12:
        movzx   edx, BYTE PTR [rax]
        and     edx, -33
        cmp     dl, 69
        jne     .L15
        sub     rax, rsi
        ret
.L13:
        mov     rax, -1
        ret


Even though both findFirstE_slow() and findFirstE_fast() are logically
equivalent, findFirstE_slow() calls memchr() for every character in the input
string.

findFirstE_fast() does what we would reasonably expect, comparing every
character in the input string with 'e' and 'E'.

In practice, when the set of characters to be found is small, findFirstE_slow()
runs noticeably slower than findFirstE_fast(). This seems to be a missed
optimization in char_traits<char>::find(), which affects several find-related
methods in string_view.

Relevant StackOverflow question with quick-bench output, Compiler Explorer
output, and more discussion:
https://stackoverflow.com/questions/70433152/missed-optimization-with-string-viewfind-first-of
  • [Bug libstdc++/103798] New: Miss... zhaoweiliew at gmail dot com via Gcc-bugs

Reply via email to