https://gcc.gnu.org/bugzilla/show_bug.cgi?id=91771

            Bug ID: 91771
           Summary: Optimization fails to inline final override.
           Product: gcc
           Version: 10.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c++
          Assignee: unassigned at gcc dot gnu.org
          Reporter: carlo at gcc dot gnu.org
  Target Milestone: ---

Compiling the following code snippet:

struct Base
{
  int foo(int n) { return do_foo(n); }
  virtual int do_foo(int n) = 0;
};

struct Derived : public Base
{
  int do_foo(int n) override final { return n + 2; }
};

int f(Derived& d)
{
  return d.do_foo(40);
}

with g++ -S -O2 f.cxx

correctly results in the assembly code:

_Z1fR7Derived:
.LFB2:
        .cfi_startproc
        movl    $42, %eax
        ret

Obviously, this does not happen when the 'final' keyword is removed.

However, when we change f() to call foo instead of do_foo:

  return d.foo(40);

the assembly code of f() changes to:

_Z1fR7Derived:
.LFB2:
        .cfi_startproc
        movq    (%rdi), %rax
        leaq    _ZN7Derived6do_fooEi(%rip), %rdx
        movq    (%rax), %rax
        cmpq    %rdx, %rax
        jne     .L5
        movl    $42, %eax
        ret

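In other words, it failed to inline the call unconditionally: the vtable
entry is still loaded and compared against Derived::do_foo before the
constant is returned.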
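
To compare the two cases side by side, both variants can be put in one
translation unit and compiled with the same g++ -S -O2 command. This is only
a convenience sketch of the test case above; the function names call_direct
and call_via_base are made up here and do not appear in the original snippet.

```cpp
struct Base
{
  int foo(int n) { return do_foo(n); }
  virtual int do_foo(int n) = 0;
};

struct Derived : public Base
{
  int do_foo(int n) override final { return n + 2; }
};

// Hypothetical name: calls the final override directly.
int call_direct(Derived& d)
{
  return d.do_foo(40);
}

// Hypothetical name: calls the non-virtual wrapper in Base, which in turn
// makes the virtual call to do_foo.
int call_via_base(Derived& d)
{
  return d.foo(40);
}
```

Both functions return the same value at run time; the difference is only
in the generated code.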

The reason I find this bad is that std::pmr::memory_resource
follows this exact pattern:

  class memory_resource
  {
...
    void*
    allocate(size_t __bytes, size_t __alignment = _S_max_align)
    { return do_allocate(__bytes, __alignment); }

    void
    deallocate(void* __p, size_t __bytes, size_t __alignment = _S_max_align)
    { return do_deallocate(__p, __bytes, __alignment); }
...
    virtual void*
    do_allocate(size_t __bytes, size_t __alignment) = 0;

    virtual void
    do_deallocate(void* __p, size_t __bytes, size_t __alignment) = 0;
...
  };

I'd really like to use std::pmr::memory_resource, but only once the
compiler performs the above optimization. Then I can specify 'final' for
the do_allocate and do_deallocate of my ultra-fast pool memory allocators
and get rid of the indirection of the virtual functions by making sure the
caller has the right type. That is normally the case for the lowest-level
memory resource classes; only 'upstream' classes are called through the
memory_resource::allocate() member function of the base class, and by then
we are already one 'level' higher in the memory resource hierarchy, so
speed is no longer as much of a requirement.
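
As a rough illustration of the use case, a resource with 'final' overrides
might look like the sketch below. The class bump_resource, its fixed
1024-byte buffer, and the helper grab() are all hypothetical and are not my
actual allocators; the sketch also assumes the requested alignment is a
power of two no larger than alignof(std::max_align_t). It needs C++17 for
<memory_resource>.

```cpp
#include <cstddef>
#include <memory_resource>
#include <new>

// Hypothetical bump allocator over a fixed buffer; for illustration only.
class bump_resource : public std::pmr::memory_resource
{
  alignas(std::max_align_t) unsigned char buf_[1024];
  std::size_t used_ = 0;

  void* do_allocate(std::size_t bytes, std::size_t alignment) override final
  {
    // Round the current offset up to the requested alignment
    // (assumed to be a power of two).
    std::size_t start = (used_ + alignment - 1) & ~(alignment - 1);
    if (start > sizeof buf_ || bytes > sizeof buf_ - start)
      throw std::bad_alloc();
    used_ = start + bytes;
    return buf_ + start;
  }

  void do_deallocate(void*, std::size_t, std::size_t) override final
  {
    // A bump allocator does not release individual blocks.
  }

  bool do_is_equal(const std::pmr::memory_resource& other)
    const noexcept override final
  {
    return this == &other;
  }
};

// The caller knows the concrete type, yet goes through the non-virtual
// memory_resource::allocate() wrapper, as f() does with foo() above.
void* grab(bump_resource& r, std::size_t n)
{
  return r.allocate(n);
}
```

Here grab() is the analogue of f() calling foo(): the static type is
bump_resource and do_allocate is final, so ideally the call would be
inlined without any vtable check.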
