[Bug c++/63164] unnecessary calls to __dynamic_cast

tdebock at DRWUK dot com via Gcc-bugs Tue, 08 Jul 2025 06:21:26 -0700

https://gcc.gnu.org/bugzilla/show_bug.cgi?id=63164


Thomas de Bock <tdebock at DRWUK dot com> changed:

           What    |Removed                     |Added
----------------------------------------------------------------------------
                 CC|                            |tdebock at DRWUK dot com

--- Comment #9 from Thomas de Bock <tdebock at DRWUK dot com> ---
I managed to implement this optimization at rtti.cc:763 by simply adding a
check for whether the target type is final (though ideally it should also
recognize non-final types with a final destructor), then building a GENERIC
tree that compares, at runtime, the source object's vtable pointer with the
address of the target type's vtable. The issue is exactly as Florian says:
this is only a valid solution if the program is statically linked or
__GXX_MERGED_TYPEINFO_NAMES is enabled (which it is not by default). Only in
those cases can the optimization be applied without ever needing to call
__dynamic_cast or compare the type names.
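
For illustration, a minimal source-level sketch of the equivalence the
optimization relies on. Base, Leaf, Other, cast_generic and cast_exact are
hypothetical names, and typeid comparison stands in for the vtable address
comparison, which cannot be written in portable C++. It assumes Base is a
public, non-virtual base of the final target type:

```cpp
#include <typeinfo>

struct Base {
  virtual ~Base() = default;
};

// Final: no class can derive from Leaf.
struct Leaf final : Base {};

struct Other : Base {};

// What is emitted today: a call into the runtime (__dynamic_cast),
// which inspects the class hierarchy.
Leaf* cast_generic(Base* b) {
  return dynamic_cast<Leaf*>(b);
}

// Hand-written equivalent of the proposed fast path. Because Leaf is
// final, the cast can only succeed when the dynamic type of *b is
// exactly Leaf, so a single exact-type check is enough.
// Assumption: typeid is used here in place of the vtable address check.
Leaf* cast_exact(Base* b) {
  if (b == nullptr)
    return nullptr;
  return typeid(*b) == typeid(Leaf) ? static_cast<Leaf*>(b) : nullptr;
}
```

Both functions should return the same pointer for every Base*; the compiler
version would replace the typeid comparison with a load of the vtable pointer
and one address comparison.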

Looking into the way clang implements this optimization, we can see at
CGExprCXX.cpp:2311 that it decides whether or not the optimization should be
applied:
  // If the destination is effectively final, the cast succeeds if and only
  // if the dynamic type of the pointer is exactly the destination type.
  bool IsExact = !IsDynamicCastToVoid &&
                 CGM.getCodeGenOpts().OptimizationLevel > 0 &&
                 DestRecordTy->getAsCXXRecordDecl()->isEffectivelyFinal() &&
                 CGM.getCXXABI().shouldEmitExactDynamicCast(DestRecordTy);
with ItaniumCXXABI.cpp:225:
  bool shouldEmitExactDynamicCast(QualType DestRecordTy) override {
    return hasUniqueVTablePointer(DestRecordTy);
  }
and ItaniumCXXABI.cpp:194:
  bool hasUniqueVTablePointer(QualType RecordTy) {
    const CXXRecordDecl *RD = RecordTy->getAsCXXRecordDecl();
    // Under -fapple-kext, multiple definitions of the same vtable may be
    // emitted.
    if (!CGM.getCodeGenOpts().AssumeUniqueVTables ||
        getContext().getLangOpts().AppleKext)
      return false;
    // If the type_info* would be null, the vtable might be merged with that of
    // another type.
    if (!CGM.shouldEmitRTTI())
      return false;
    // If there's only one definition of the vtable in the program, it has a
    // unique address.
    if (!llvm::GlobalValue::isWeakForLinker(CGM.getVTableLinkage(RD)))
      return true;
    // Even if there are multiple definitions of the vtable, they are required
    // by the ABI to use the same symbol name, so should be merged at load
    // time. However, if the class has hidden visibility, there can be
    // different versions of the class in different modules, and the ABI
    // library might treat them as being the same.
    if (CGM.GetLLVMVisibility(RD->getVisibility()) !=
        llvm::GlobalValue::DefaultVisibility)
      return false;
    return true;
  }

As Florian said, would there not still be a case with this code in which the
optimization is applied but gives the wrong result? When a library is loaded
with dlopen and RTLD_LOCAL, its symbols are not resolved against those of the
executable, so two equivalent types (one from the executable, one from the
library loaded with dlopen) will not share a vtable, causing the optimized
dynamic_cast code to incorrectly decide the types are not the same.

If this is not the case, could we not replicate clang's behaviour by checking
the value of __GXX_MERGED_TYPEINFO_NAMES around rtti.cc:763? The typeinfo
header can simply check the value of __GXX_MERGED_TYPEINFO_NAMES and adjust
its implementation accordingly, but:

- At the point where the GENERIC tree for dynamic_cast (or, hopefully, the
optimization) is constructed, is it still possible to check the value of the
__GXX_MERGED_TYPEINFO_NAMES preprocessor macro?
- If not, would it be realistic to implement the optimization by adding an
additional function (maybe in the typeinfo header) that handles these
dynamic_casts to final types, checks the value of
__GXX_MERGED_TYPEINFO_NAMES, and decides based on that whether to simply
compare vtable pointers or to compare the type names too?
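
A rough sketch of what such a helper could look like, written as ordinary
library code. cast_to_final and the Shape/Circle/Square types are hypothetical
and not part of libstdc++; type_info addresses stand in for vtable pointers,
and the name comparison is simplified compared to what a real implementation
would need (for example, it ignores types with internal linkage):

```cpp
#include <cstring>
#include <type_traits>
#include <typeinfo>

// Hypothetical helper: exact-type cast to a final class.
template <class Target, class Source>
Target* cast_to_final(Source* p) {
  static_assert(std::is_final<Target>::value, "Target must be final");
  static_assert(std::is_polymorphic<Source>::value,
                "Source must be polymorphic");
  static_assert(std::is_base_of<Source, Target>::value,
                "Source must be a base of Target");
  if (p == nullptr)
    return nullptr;

  const std::type_info& dynamic_type = typeid(*p);
  const std::type_info& target_type = typeid(Target);

  // Fast path: one address comparison (stand-in for the vtable check).
  bool same = &dynamic_type == &target_type;

#if !__GXX_MERGED_TYPEINFO_NAMES
  // Type info objects may be duplicated across modules, so an address
  // mismatch is not conclusive: fall back to comparing the names.
  if (!same)
    same = std::strcmp(dynamic_type.name(), target_type.name()) == 0;
#endif

  return same ? static_cast<Target*>(p) : nullptr;
}

// Hypothetical example types.
struct Shape {
  virtual ~Shape() = default;
};
struct Circle final : Shape {};
struct Square final : Shape {};
```

With these types, cast_to_final<Circle>(p) is meant to agree with
dynamic_cast<Circle*>(p) for any Shape* p. When the macro is undefined, the
#if treats it as 0 and the name comparison is kept, which is the conservative
choice.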
