https://gcc.gnu.org/bugzilla/show_bug.cgi?id=121569

            Bug ID: 121569
           Summary: Duplicate static symbols in .symtab causing incorrect
                    USDT argument resolution in GCC with -O1
           Product: gcc
           Version: 13.3.0
            Status: UNCONFIRMED
          Severity: normal
          Priority: P3
         Component: c
          Assignee: unassigned at gcc dot gnu.org
          Reporter: phoenix500526 at 163 dot com
  Target Milestone: ---

Created attachment 62128
  --> https://gcc.gnu.org/bugzilla/attachment.cgi?id=62128&action=edit
The preprocessed file of usdt_rip.c, generated by GCC 13 with the
-save-temps -g -O1 options.

Hi GCC maintainers,

I am Zhao Jiawei (GitHub: Phoenix500526), a member of the bpftrace community.
While adding support for SIB-addressing USDT argument specs in libbpf, I
encountered a potential bug in GCC’s handling of -O1 optimization.

# Issue Summary
When compiling with GCC 13 at -O1, the generated USDT argument spec may contain
a RIP-relative symbol reference, e.g. "-1@ti(%rip)".

To resolve this argument, libbpf looks up the ti symbol in .symtab to get its
address/offset.
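
To make the shape of such a spec concrete, here is a minimal sketch of how a consumer might split "-1@ti(%rip)" into its parts. It is not libbpf's actual parser: the struct usdt_sym_arg and the function parse_sym_arg are hypothetical names introduced only for this illustration, and the sketch assumes the spec has exactly the form `<size>@<symbol>(%<reg>)`.

```C
#include <assert.h>
#include <ctype.h>
#include <stdio.h>

/* Hypothetical, simplified view of one USDT argument spec. */
struct usdt_sym_arg {
    int  size;        /* |size| = width in bytes; negative = signed */
    char symbol[64];  /* symbol used as displacement, e.g. "ti" */
    char reg[16];     /* base register without '%', e.g. "rip" */
};

/*
 * Parse "<size>@<symbol>(%<reg>)", e.g. "-1@ti(%rip)".
 * Returns 0 on success, -1 if the spec has another shape
 * (for instance a numeric displacement such as "-1@-1(%rsp)").
 */
static int parse_sym_arg(const char *spec, struct usdt_sym_arg *out)
{
    int consumed = 0;
    int fields;

    assert(spec != NULL && out != NULL);

    fields = sscanf(spec, "%d@%63[^(](%%%15[^)])%n",
                    &out->size, out->symbol, out->reg, &consumed);
    if (fields != 3 || consumed == 0 || spec[consumed] != '\0')
        return -1;

    /* A leading digit or sign means a number, not a symbol name. */
    if (isdigit((unsigned char)out->symbol[0]) ||
        out->symbol[0] == '-' || out->symbol[0] == '+')
        return -1;

    return 0;
}
```

For "-1@ti(%rip)" this yields size -1 (a signed 1-byte value), symbol "ti" and register "rip". The symbol name is the only handle the consumer has for finding the variable's address.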

However, I found that GCC may emit multiple identical symbol names in .symtab
when different source files each define a static volatile variable with the
same name.
For example:

* usdt_rip.c defines static volatile char ti = 0;
* test/usdt_rip.c also defines static volatile char ti = 0;

The test/usdt_rip.c file:
```C
static volatile char ti = 0;

```

The usdt_rip.c file:
```C
#include "sdt.h"

static volatile char ti = 0;

static inline void __attribute__((always_inline)) trigger_func() {
  STAP_PROBE1(usdt_rip, rip_global_var, ti);
}

int main() {
  trigger_func();
  return 0;
}
```

In this case, .symtab contains two ti entries, making it impossible for a USDT
consumer such as libbpf to determine from the name alone which one the argument
spec refers to.

```bash
$ gcc -g -O1 usdt_rip.c test/usdt_rip.c -o usdt_rip
$ readelf -n usdt_rip
Displaying notes found in: .note.gnu.property
  Owner                Data size        Description
  GNU                  0x00000020       NT_GNU_PROPERTY_TYPE_0
      Properties: x86 feature: IBT, SHSTK
        x86 ISA needed: x86-64-baseline

Displaying notes found in: .note.gnu.build-id
  Owner                Data size        Description
  GNU                  0x00000014       NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: 1797160a0c58212aba3c03accafea51c673c8f5b

Displaying notes found in: .note.ABI-tag
  Owner                Data size        Description
  GNU                  0x00000010       NT_GNU_ABI_TAG (ABI version tag)
    OS: Linux, ABI: 3.2.0

Displaying notes found in: .note.stapsdt
  Owner                Data size        Description
  stapsdt              0x0000003c       NT_STAPSDT (SystemTap probe descriptors)
    Provider: usdt_rip
    Name: rip_global_var
    Location: 0x000000000000112d, Base: 0x0000000000002004, Semaphore: 0x0000000000000000
    Arguments: -1@ti(%rip)

$ readelf -s usdt_rip | grep ti
    12: 0000000000004011     1 OBJECT  LOCAL  DEFAULT   25 ti
    14: 0000000000004012     1 OBJECT  LOCAL  DEFAULT   25 ti
```
Both entries are LOCAL symbols named ti, placed at adjacent addresses (0x4011
and 0x4012) in the same section, so nothing in the name-based lookup tells them
apart.
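
The ambiguity can be sketched with a toy model of the symbol table. This is only an illustration under stated assumptions: struct sym_entry, example_symtab and count_by_name are hypothetical names, the table holds just the two ti entries from the readelf output above, and real consumers read Elf64_Sym records instead.

```C
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Hypothetical stand-in for the fields of a .symtab entry used here. */
struct sym_entry {
    const char *name;
    uint64_t    value;
};

/* The two local "ti" entries from the readelf output above. */
static const struct sym_entry example_symtab[] = {
    { "ti", 0x4011 },
    { "ti", 0x4012 },
};

/*
 * Count the entries called `name`. When at least one entry matches,
 * the value of the first match is stored in *first_value.
 */
static size_t count_by_name(const struct sym_entry *tab, size_t n,
                            const char *name, uint64_t *first_value)
{
    size_t matches = 0;

    assert(tab != NULL && name != NULL && first_value != NULL);

    for (size_t i = 0; i < n; i++) {
        if (strcmp(tab[i].name, name) != 0)
            continue;
        if (matches == 0)
            *first_value = tab[i].value;
        matches++;
    }
    return matches;
}
```

Calling count_by_name(example_symtab, 2, "ti", &v) returns 2 and leaves 0x4011 in v. A resolver that simply takes the first match would therefore use 0x4011 whether or not that is the ti defined in usdt_rip.c.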


# Observed Behavior
* GCC 13 (-O1): Generates multiple ti symbols in .symtab, and the argument spec
refers to ti by name relative to %rip, making resolution ambiguous.
* Clang 18.1.3 (-O1): Either optimizes away unused ti or produces unambiguous
stack-relative specs like -1@-1(%rsp), and .symtab contains only one ti.


```bash
$ clang -g -O1 usdt_rip.c test/usdt_rip.c -o usdt_rip.clang
$ readelf -n usdt_rip.clang 

Displaying notes found in: .note.gnu.property
  Owner                Data size        Description
  GNU                  0x00000010       NT_GNU_PROPERTY_TYPE_0
      Properties: x86 ISA needed: x86-64-baseline

Displaying notes found in: .note.gnu.build-id
  Owner                Data size        Description
  GNU                  0x00000014       NT_GNU_BUILD_ID (unique build ID bitstring)
    Build ID: c2e4f397f10c22ed47839678f7e157e47902cef9

Displaying notes found in: .note.ABI-tag
  Owner                Data size        Description
  GNU                  0x00000010       NT_GNU_ABI_TAG (ABI version tag)
    OS: Linux, ABI: 3.2.0

Displaying notes found in: .note.stapsdt
  Owner                Data size        Description
  stapsdt              0x0000003c       NT_STAPSDT (SystemTap probe descriptors)
    Provider: usdt_rip
    Name: rip_global_var
    Location: 0x000000000000113b, Base: 0x0000000000002004, Semaphore: 0x0000000000000000
    Arguments: -1@-1(%rsp)

$ readelf -s usdt_rip.clang | grep ti
    12: 0000000000004011     1 OBJECT  LOCAL  DEFAULT   26 ti
```

# Impact
This behavior prevents correct USDT argument resolution in libbpf and bpftrace,
making some probes unusable when compiled with GCC.

Since I lack in-depth knowledge of compilers, I’m not sure whether this is a
GCC issue or an ld issue. If it’s related to ld, I can bring it to the ld
community. I’d also like to know whether GCC could, like Clang, replace the
symbol `ti` with its corresponding address or offset when generating the USDT
argument spec, or otherwise provide a way to distinguish between the two
different symbols, so that libbpf can correctly resolve the intended USDT
argument. 

Best regards,
Zhao Jiawei
