https://github.com/open-telemetry/opentelemetry-go-contrib/issues/6625
On Wednesday, January 15, 2025 at 11:02:37 PM UTC-8 John wrote:
> Thanks Kurtis for the advice. I was heading in that direction.
>
> This is definitely an OTEL problem. The minimal version required to
> reproduce the issue:
>
> metrics.go
> ```go
> package metrics
>
> import (
> 	// Blank import: the package is loaded only for its side effects.
> 	_ "go.opentelemetry.io/contrib/instrumentation/host"
> )
> ```
>
> metrics_test.go
> ```go
> package metrics
> ```
>
> `go test -race`
>
> That will immediately cause the issue. You don't even need any tests; it
> fails before it even gets to them.
>
> I'll make my way over to the OTEL bugs tomorrow.
>
> For those that are interested in some random debugger output, here is a
> little from lldb and delve (which lets me see they are calling C from
> purego):
>
> Process 58447 launched: '/Users/jdoak/base/concurrency/sync/sync.test' (arm64)
> warning: (arm64) /Users/jdoak/base/concurrency/sync/sync.test(0x0000000100000000) address 0x0000000100000000 maps to more than one section: sync.test.__TEXT and sync.test.__TEXT
> warning: (arm64) /Users/jdoak/base/concurrency/sync/sync.test(0x0000000100000000) address 0x0000000101bbc000 maps to more than one section: sync.test.__DATA_CONST and sync.test.__DATA_CONST
> warning: (arm64) /Users/jdoak/base/concurrency/sync/sync.test(0x0000000100000000) address 0x0000000102b18000 maps to more than one section: sync.test.__DATA and sync.test.__DATA
> Process 58447 stopped
> * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
> frame #0: 0x000000010000423c sync.test`__tsan_func_enter + 16
> sync.test`__tsan_func_enter:
> -> 0x10000423c <+16>: ldr x8, [x0, #0x10]
> 0x100004240 <+20>: add w9, w8, #0x8
> 0x100004244 <+24>: tst x9, #0xff0
> 0x100004248 <+28>: b.eq 0x1000042a0 ; <+116>
> Target 0: (sync.test) stopped.
> (lldb) thread backtrace all
> * thread #1, queue = 'com.apple.main-thread', stop reason = EXC_BAD_ACCESS (code=1, address=0x10)
> * frame #0: 0x000000010000423c sync.test`__tsan_func_enter + 16
> frame #1: 0x0000000101706e34 sync.test`github.com/ebitengine/purego/internal/fakecgo.x_cgo_notify_runtime_init_done + 20
> frame #2: 0x00000001017073f0 sync.test`x_cgo_notify_runtime_init_done_trampoline + 16
> thread #2
> frame #0: 0x00000001945e64e8 libsystem_kernel.dylib`__semwait_signal + 8
> frame #1: 0x00000001944c56f0 libsystem_c.dylib`nanosleep + 220
> frame #2: 0x00000001944c5608 libsystem_c.dylib`usleep + 68
> frame #3: 0x00000001000c6304 sync.test`runtime.usleep_trampoline.abi0 + 20
> thread #3
> frame #0: 0x00000001945e66ec libsystem_kernel.dylib`__psynch_cvwait + 8
> frame #1: 0x0000000194624894 libsystem_pthread.dylib`_pthread_cond_wait + 1204
> frame #2: 0x00000001000c6688 sync.test`runtime.pthread_cond_wait_trampoline.abi0 + 24
> frame #3: 0x00000001000c4838 sync.test`runtime.asmcgocall.abi0 + 200
> thread #4
> frame #0: 0x00000001945e66ec libsystem_kernel.dylib`__psynch_cvwait + 8
> frame #1: 0x0000000194624894 libsystem_pthread.dylib`_pthread_cond_wait + 1204
> frame #2: 0x00000001000c6688 sync.test`runtime.pthread_cond_wait_trampoline.abi0 + 24
> frame #3: 0x00000001000c4838 sync.test`runtime.asmcgocall.abi0 + 200
> thread #5
> frame #0: 0x00000001945e66ec libsystem_kernel.dylib`__psynch_cvwait + 8
> frame #1: 0x0000000194624894 libsystem_pthread.dylib`_pthread_cond_wait + 1204
> frame #2: 0x00000001000c6688 sync.test`runtime.pthread_cond_wait_trampoline.abi0 + 24
> frame #3: 0x00000001000c4838 sync.test`runtime.asmcgocall.abi0 + 200
> thread #6
> frame #0: 0x00000001945e66ec libsystem_kernel.dylib`__psynch_cvwait + 8
> frame #1: 0x0000000194624894 libsystem_pthread.dylib`_pthread_cond_wait + 1204
> frame #2: 0x00000001000c6688 sync.test`runtime.pthread_cond_wait_trampoline.abi0 + 24
> frame #3: 0x00000001000c4838 sync.test`runtime.asmcgocall.abi0 + 200
>
>
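One way to read the lldb output above: the faulting instruction is `ldr x8, [x0, #0x10]` and the reported bad address is 0x10, so the base register x0 was presumably zero. On arm64, x0 carries the first argument, so `__tsan_func_enter` appears to have been handed a nil pointer. What that pointer is supposed to be is not something this trace tells us. A trivial sketch of the arithmetic (the helper name `baseRegister` is made up for illustration):

```go
package main

import "fmt"

// baseRegister recovers the base register value of a load of the form
// `ldr xN, [base, #offset]` from the address that faulted.
// Hypothetical helper, used only to spell out the subtraction.
func baseRegister(faultAddr, offset uint64) uint64 {
	return faultAddr - offset
}

func main() {
	// Values taken from the lldb output: address=0x10, offset #0x10.
	x0 := baseRegister(0x10, 0x10)
	fmt.Printf("x0 = %#x (nil pointer: %v)\n", x0, x0 == 0)
}
```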
>
> (dlv) continue
> > [runtime-fatal-throw] runtime.fatalsignal() /usr/local/go/src/runtime/signal_unix.go:831 (hits goroutine(1):1 total:1) (PC: 0x104f027bc)
> Warning: debugging optimized function
> 826: printDebugLog()
> 827:
> 828: exit(2)
> 829: }
> 830:
> => 831: func fatalsignal(sig uint32, c *sigctxt, gp *g, mp *m) *g {
> 832: if sig < uint32(len(sigtable)) {
> 833: print(sigtable[sig].name, "\n")
> 834: } else {
> 835: print("Signal ", sig, "\n")
> 836: }
> (dlv) stack
> 0 0x0000000104f027bc in runtime.fatalsignal
> at /usr/local/go/src/runtime/signal_unix.go:831
> 1 0x0000000104f02390 in runtime.sighandler
> at /usr/local/go/src/runtime/signal_unix.go:754
> 2 0x0000000104f01cac in runtime.sigtrampgo
> at /usr/local/go/src/runtime/signal_unix.go:490
> 3 0x0000000104e6c23c in ???
> at ?:-1
> 4 0x0000000106569974 in github.com/ebitengine/purego/internal/fakecgo.x_cgo_notify_runtime_init_done
> at /Users/jdoak/go/pkg/mod/github.com/ebitengine/[email protected]/internal/fakecgo/go_libinit.go:22
> 5 0x000000016af95d88 in ???
> at ?:-1
> 6 0x0000000104f2cadc in runtime.asmcgocall
> at /usr/local/go/src/runtime/asm_arm64.s:1000
> 7 0x0000000104f2daa8 in racecall
> at /usr/local/go/src/runtime/race_arm64.s:476
> 8 0x0000000000000000 in ???
> at :0
> error: NULL address
> (truncated)
>
> On Wednesday, January 15, 2025 at 9:41:47 PM UTC-8 Kurtis Rader wrote:
>
>> On Wed, Jan 15, 2025 at 8:31 PM John <[email protected]> wrote:
>>
>>> Hey Kurtis,
>>>
>>> Thanks for responding.
>>>
>>> Unfortunately, this does look like some type of OTEL problem. I was
>>> able to make a copy and strip out all the OTEL code. As soon as I did,
>>> the failure stopped happening, which means it is some type of OTEL issue
>>> that I should probably track down with the OTEL people.
>>>
>>> As a note for someone who stumbles on this with a similar problem, the
>>> OTEL packages included:
>>>
>>> "go.opentelemetry.io/otel/attribute"
>>> "go.opentelemetry.io/otel/trace"
>>> "go.opentelemetry.io/otel/metric"
>>>
>>> These packages are at v1.33.0
>>>
>>
>> Note that simply removing the references to the above-mentioned OTEL
>> packages does not guarantee the problem is with those packages. The failure
>> could still be due to how you are using them. Having said that, any
>> public package should validate its inputs and provide a more meaningful
>> failure than a SIGSEGV fault. So even if the proximate cause of the failure
>> is a mistake in your code, there is clearly room for improvement in the
>> packages you are using.
>>
>> As a retired software support engineer who has spent thousands of hours
>> debugging these types of problems, I can't stress enough how important it is
>> to create a minimal reproducible example as the quickest way to get to the
>> root cause of the problem. A minimal reproducible example will allow
>> others, such as the OTEL package maintainers, to employ tools, such as gdb
>> or lldb, which you may not be comfortable using.
>>
>> --
>> Kurtis Rader
>> Caretaker of the exceptional canines Junior and Hank
>>
>
--
You received this message because you are subscribed to the Google Groups
"golang-nuts" group.
To view this discussion visit
https://groups.google.com/d/msgid/golang-nuts/d899fb29-2c7c-4983-9947-7e7fbfa65cb6n%40googlegroups.com.