https://sourceware.org/bugzilla/show_bug.cgi?id=15646

--- Comment #6 from Sourceware Commits <cvs-commit at gcc dot gnu.org> ---
The master branch has been updated by Andrew Burgess <[email protected]>:

https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;h=ccd56e22703c094e5b6d0025e28059277ed8d8c5

commit ccd56e22703c094e5b6d0025e28059277ed8d8c5
Author: Andrew Burgess <[email protected]>
Date:   Wed Oct 1 19:55:42 2025 +0100

    gdb: remove attempted type de-duplication when building gdb-index

    This commit removes the attempted de-duplication of types when
    building the gdb-index.  This commit is the natural extension of this
    earlier commit:

      commit aef36dee93bf194cb0b976a4ae49a79a736a188d
      Date:   Sun Aug 13 14:08:06 2023 +0200

          [gdb/symtab] Don't deduplicate variables in gdb-index

    Which removed the de-duplication of variables.  It is worth reading
    the earlier commit as all the justifications for that patch also
    apply to this one.

    Currently, when building the gdb-index we sort the type entries,
    moving declarations to the end of the entry list, and non-declarations
    to the front.  Then within each group, declarations, and
    non-declarations, the index entries are sorted by CU offset.

    We then emit the first entry for any given type name.

    There are two problems with this.

    First, a non-declaration entry could be a definition, but it could
    also be a typedef.  Now sure, a typedef is a type definition, but not
    necessarily a useful one.

    If we have a header file that contains:

      typedef struct foo_t foo_t;

    And a CU which makes use of 'foo_t', then the CU will include both a
    typedef and a type declaration.  The target of the typedef will be the
    declaration.  But notice, the CU will not include a type definition.

    If we have two CUs, one which only sees the above typedef and
    declaration, and another which sees the typedef and an actual type
    definition, then the final list of entries for this type's name will
    be:

      1. A typedef entry that points at the declaration.
      2. A typedef entry that points at the definition.
      3. A definition.
      4. A declaration.

    Now (4) will get sorted to the end of the entry list.  But the order
    of (1), (2), and (3) will depend on the CU offset.  If the CU which
    containing the typedef and declaration has the smallest offset,
    then (1) will be sorted to the front of the list of entries for this
    type name.  Due to the de-duplication code this means that only (1)
    will be added to the gdb-index.

    After GDB starts and parses the index, if a user references 'foo_t'
    GDB will look in the index and find just (1).  GDB loads the CU
    containing (1) and finds both the typedef and the declaration.  But
    GDB does not find the full type definition.  As a result GDB will
    display 'foo_t' as an incomplete type.

    This differs from the behaviour when no index is used.  With no index
    GDB expands the first CU containing 'foo_t', finds the typedef and
    type declaration, decides that this is not good enough and carries on.
    GDB will then expand the second CU and find the type's definition, GDB
    now has a full understanding of the type, and can print the type
    correctly.

    We could solve this problem by marking typedefs as a distinct
    sub-category of types, just as we do with declarations.  Then we could
    sort definitions to the front of the list, then typedefs, and finally,
    declarations after that.  This would, I think, mean that we always
    prefer emitting a definition for a type, which would resolve this
    first problem, or at least, it would resolve it well enough, but it
    wouldn't fix the second problem.

    The second problem is that the Python API and the 'info types' command
    can be used to query all type symbols.  As such, GDB needs to be able
    to find all the CUs which contain a given type.  Especially as it is
    possible that a type might be defined differently within different
    CUs.

    NOTE: Obviously a program doing this (defining a type differently in
      different CUs) would need to be mindful of the One Definition Rule,
      but so long as the type doesn't escape outside of a single CU then
      reusing a type name isn't, as I understand it, wrong.  And even if
      it is, the fact that it compiles, and could be a source of bugs,
      means (in my opinion) that GDB should handle this case to enable
      debugging of it.

    Even something as simple as 'info types ....' relies on GDB being able
    to find multiple entries for a given type in different CUs.  If the
    index only contains a single type entry, then this means GDB will see
    different things depending on which CUs happen to have been expanded.

    Given all of the above, I think that any attempt to remove type
    entries from the gdb-index is unsafe and can result in GDB behaving
    differently when using the gdb-index compared to using no index.

    The solution is to remove the de-duplication code, which is what this
    patch does.

    Now that we no longer need to sort declarations to the end of the
    entry list, I've removed all the code related to the special use of
    GDB_INDEX_SYMBOL_KIND_UNUSED5 (which is how we marked declarations),
    this cleans things up a little bit.

    I've also renamed some of the functions away from minimize, now that
    there's no minimization being done.

    A problem was revealed by this change.  When running the test
    gdb.cp/stub-array-size.exp with the --target_board=cc-with-gdb-index,
    I was seeing a failure using gcc 15.1.0.

    This test has two CUs, and a type 'A'.  The test description says:

      Test size of arrays of stubbed types (structures where the full
      definition is not immediately available).

    Which I don't really understand given the test's source code.  The
    type 'A' is defined in a header, which is included in both CUs.
    However, the test description does seem to be accurate; in one CU the
    type looks like this:

     <1><4a>: Abbrev Number: 8 (DW_TAG_structure_type)
        <4b>   DW_AT_name        : A
        <4d>   DW_AT_declaration : 1
        <4d>   DW_AT_sibling     : <0x6d>
     <2><51>: Abbrev Number: 9 (DW_TAG_subprogram)
        <52>   DW_AT_external    : 1
        <52>   DW_AT_name        : ~A
        <55>   DW_AT_decl_file   : 2
        <56>   DW_AT_decl_line   : 20
        <57>   DW_AT_decl_column : 11
        <58>   DW_AT_linkage_name: (indirect string, offset: 0x103): _ZN1AD4Ev
        <5c>   DW_AT_virtuality  : 1        (virtual)
        <5d>   DW_AT_containing_type: <0x4a>
        <61>   DW_AT_declaration : 1
        <61>   DW_AT_object_pointer: <0x66>
        <65>   DW_AT_inline      : 0        (not inlined)
     <3><66>: Abbrev Number: 10 (DW_TAG_formal_parameter)
        <67>   DW_AT_type        : <0x8c>
        <6b>   DW_AT_artificial  : 1
     <3><6b>: Abbrev Number: 0
     <2><6c>: Abbrev Number: 0

    while in the second CU, the type looks like this:

     <1><178>: Abbrev Number: 4 (DW_TAG_structure_type)
        <179>   DW_AT_name        : A
        <17b>   DW_AT_byte_size   : 8
        <17c>   DW_AT_decl_file   : 2
        <17d>   DW_AT_decl_line   : 18
        <17e>   DW_AT_decl_column : 8
        <17f>   DW_AT_containing_type: <0x178>
        <183>   DW_AT_sibling     : <0x1ac>
     <2><187>: Abbrev Number: 5 (DW_TAG_member)
        <188>   DW_AT_name        : (indirect string, offset: 0x19e): _vptr.A
        <18c>   DW_AT_type        : <0x1be>
        <190>   DW_AT_data_member_location: 0
        <191>   DW_AT_artificial  : 1
     <2><191>: Abbrev Number: 6 (DW_TAG_subprogram)
        <192>   DW_AT_external    : 1
        <192>   DW_AT_name        : ~A
        <195>   DW_AT_decl_file   : 1
        <196>   DW_AT_decl_line   : 20
        <197>   DW_AT_decl_column : 1
        <198>   DW_AT_linkage_name: (indirect string, offset: 0x103): _ZN1AD4Ev
        <19c>   DW_AT_virtuality  : 1       (virtual)
        <19d>   DW_AT_containing_type: <0x178>
        <1a1>   DW_AT_declaration : 1
        <1a1>   DW_AT_object_pointer: <0x1a5>
     <3><1a5>: Abbrev Number: 7 (DW_TAG_formal_parameter)
        <1a6>   DW_AT_type        : <0x1cd>
        <1aa>   DW_AT_artificial  : 1
     <3><1aa>: Abbrev Number: 0
     <2><1ab>: Abbrev Number: 0

    So, for reasons that I don't understand, the type, despite (as far as
    I can see) having its full definition available, is recorded only as
    declared in one CU.

    The test then performs some actions that rely on 'sizeof(A)' and
    expects GDB to correctly figure out the size.  This requires GDB to
    find, and expand the CU containing the real definition of 'A'.

    Prior to this patch GDB would sort the two type entries for 'A',
    placing the declaration second, and then record only one entry, the
    definition.  When it came to expansion there was only one thing to
    expand, and this is the declaration we needed.  It happens that in
    this test the definition is in the second CU, that is, the CU with the
    biggest offset.  This means that, if all index entries were considered
    equal, the definition entry would be second.  However, currently, due
    to the way GDB forces definitions to the front, the entry for the
    second CU, the definition, is placed first in the index, and with
    de-duplication, this is the only entry added to the index.

    After this patch, both the declaration and the definition are placed
    in the index, and as the declaration is in the CU at offset 0, the
    declaration is added first to the index.

    This should be fine.  When looking for 'A' GDB should expand the CU
    containing the declaration, see that all we have is a declaration, and
    so continue, next expanding the definition, at which point we're done.

    However, in read-gdb-index.c, in the function
    mapped_gdb_index::build_name_components, there is a work around for
    gold bug PR gold/15646.  Ironically, the bug here is that gold was not
    removing duplicate index entries, and it is noted that this has a
    performance impact on GDB.  A work around for this was added to GDB in
    commit:

      commit 8943b874760d9cf35b71890a70af9866e4fab2a6
      Date:   Tue Nov 12 09:43:17 2013 -0800

          Work around gold/15646.

    A test for this was added in:

      commit 40d22035a7fc239ac1e944b75a2e3ee9029d1b76
      Date:   Tue May 26 11:35:32 2020 +0200

          [gdb/testsuite] Add test-case gold-gdb-index.exp

    And the fix was tweaked in commit:

      commit f030440daa989ae3dadc1fa4342cfa16d690db3c
      Date:   Thu May 28 17:26:22 2020 +0200

          [gdb/symtab] Make gold index workaround more precise

    The problem specifically called out in the bug report is that
    namespaces can appear in multiple CUs, and that trying to complete
    'ns::misspelled' would expand every CU containing namespace 'ns' due
    to the duplicate 'ns' type symbols.

    The work around that was added in 8943b874760d9cf3 was to ignore
    duplicate global symbols when expanding entries from the index.  In
    commit f030440daa989ae3 this work around was restricted to only ignore
    duplicate type entries.  This restriction was required to allow the
    earlier de-duplication patch aef36dee93bf194c to function correctly.

    Now that I'm taking the work started in aef36dee93bf194c to its
    logical conclusion, and allowing duplicate type entries, the work
    around of ignoring duplicate global type symbols is no longer needed,
    and can be removed.

    The associated test for this, added in 40d22035a7fc239a, is also
    removed in this commit.

    To be clear; the performance issue mentioned in PR gold/15646 is now
    back again.  But my claim is that gold was right all along to include
    the duplicate index entries, and any performance hit we see as a
    result, though unfortunate, is just a consequence of doing it right.

    That doesn't mean there's not room for optimisation and improvement in
    the future, though I don't have any immediate ideas, or plans in this
    area.  It's just we can't throw out a bunch of index entries that are
    critical, and claim this as a performance optimisation.

    I am seeing some failure with this patch when using the board file
    dwarf5-fission-debug-types.  These failures all have the error:

      DWARF Error: wrong unit_type in unit header (is DW_UT_skeleton, should be
DW_UT_type) [in module ....]

    However, I ran the whole testsuite with this board, and this error
    crops up often, so I don't think this is something specific to my
    patch, so I'm choosing to ignore this.

    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=15646
    Bug: https://sourceware.org/bugzilla/show_bug.cgi?id=15035

    Approved-By: Tom Tromey <[email protected]>

-- 
You are receiving this mail because:
You are on the CC list for the bug.

Reply via email to