https://gcc.gnu.org/bugzilla/show_bug.cgi?id=105373
Bug ID: 105373
Summary: miscompile involving lambda coroutines and an object bitwise copied instead of via the copy constructor
Product: gcc
Version: 11.3.1
Status: UNCONFIRMED
Severity: normal
Priority: P3
Component: c++
Assignee: unassigned at gcc dot gnu.org
Reporter: avi at scylladb dot com
Target Milestone: ---

This is a bug in a complex piece of code, so I'll need guidance on what further information to provide (e.g. intermediate code dumps). It reproduces with various levels of debug information, and the code works with clang.

At the heart there is a shared pointer, which I've enhanced so that all pointers that point to the same object keep track of each other (the pointee also points back to one of the pointers). The pointee keeps an incrementing generation count.

In the end I have two pointer objects that are bitwise equal, even though they should have different generation counts and different doubly-linked-list pointers. This proves that a bitwise copy happened. It doesn't prove a miscompile, since my code could have decided to perform a bitwise copy, but the fact that it works in clang indicates it's a gcc bug.

The bug happens with asan too, and with different values of sizeof(the smart pointer), so it's not some stray write.

This is the snippet that causes the trouble:

    0  tlogger.info("before updating cache {}", fmt::ptr(old3.get()));
    1  co_await with_scheduling_group(_config.memtable_to_cache_scheduling_group, [this, old4 = old3, &newtabs] () -> future<> {
    2      tlogger.info("updating cache {}", fmt::ptr(old4.get()));
    3      return update_cache(old4, newtabs);
    4  });

old3 and old4 are both copies of the same smart pointer (there are also old, old1 and old2 elsewhere, but they are correct). In the call to update_cache(), we attempt to make a copy of old4, and an internal check finds that the linked list is corrupted.

Inspecting old4 in the debugger (from the printout in line 0) and the source of the copy in line 3 shows they are the same, but have different addresses:

    0  INFO 2022-04-25 12:04:14,722 [shard 0] table - before updating cache 0x600000666c40
    1  copying @0x60000520bef8 <- @0x6000052108f0 0x600000666c40 refcnt 6 gen 43
    2  INFO 2022-04-25 12:04:14,722 [shard 0] table - updating cache 0x600000666c40
    3  (gdb) p this $1 = (const seastar::lw_shared_ptr<replica::memtable> * const) 0x60000520bed0
    4  (gdb) p *this
    5  $2 = {_ptr = 0x600000666c40, _next = 0x6000052108f0, _prev = 0x60000597ae48, _generation = 43}
    6  (gdb) x/4gx 0x60000520bef8
    7  0x60000520bef8: 0x0000600000666c40 0x00006000052108f0
    8  0x60000520bf08: 0x000060000597ae48 0x000000000000002b
    9  (gdb) x/4gx this
    10 0x60000520bed0: 0x0000600000666c40 0x00006000052108f0
    11 0x60000520bee0: 0x000060000597ae48 0x000000000000002b

In line 3 I dump old4 (from the printout in line 1, where old3 is copied into old4). But "this" doesn't point to old4; it points to a bitwise copy of old4, as shown in lines 7-8 and 10-11. Note that both old4 and the bad copy "this" are members of the coroutine frame.

I realize this isn't enough to analyze the situation; I'm happy to provide more information if you tell me how.
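
For readers who want to see why two bitwise-equal handles are evidence of a bitwise copy, here is a minimal sketch of an instrumented pointer along the lines described above. It is not the real seastar::lw_shared_ptr: the names tracked_ptr, pointee, bitwise_tag, same_bits and run_demo are hypothetical, and the bitwise copy is simulated by a tagged constructor rather than produced by a compiler. Only the four field names are taken from the gdb output.

```cpp
#include <cassert>
#include <cstdint>

struct tracked_ptr;  // hypothetical stand-in for the instrumented pointer

// Pointee: points back to one handle and hands out increasing generations.
struct pointee {
    tracked_ptr* head = nullptr;
    std::uint64_t next_generation = 0;
};

struct bitwise_tag {};  // selects the deliberately broken "copy" below

struct tracked_ptr {
    pointee* _ptr = nullptr;
    tracked_ptr* _next = nullptr;
    tracked_ptr* _prev = nullptr;
    std::uint64_t _generation = 0;

    explicit tracked_ptr(pointee* p) : _ptr(p) { link(); }
    // Proper copy: takes a fresh generation and joins the ring of handles.
    tracked_ptr(const tracked_ptr& o) : _ptr(o._ptr) { link(); }
    // Simulated bitwise copy: same four words, no bookkeeping at all.
    tracked_ptr(bitwise_tag, const tracked_ptr& o)
        : _ptr(o._ptr), _next(o._next), _prev(o._prev),
          _generation(o._generation) {}
    tracked_ptr& operator=(const tracked_ptr&) = delete;
    // Only unlink handles that are really in the ring.
    ~tracked_ptr() { if (consistent()) unlink(); }

    // A correctly linked handle is pointed to by both of its neighbours.
    bool consistent() const {
        return _next->_prev == this && _prev->_next == this;
    }

private:
    void link() {
        assert(_ptr != nullptr);
        _generation = _ptr->next_generation++;
        if (_ptr->head == nullptr) {
            _next = _prev = this;      // first handle: ring of one
            _ptr->head = this;
        } else {
            _next = _ptr->head;        // insert just before the head
            _prev = _ptr->head->_prev;
            _prev->_next = this;
            _next->_prev = this;
        }
    }
    void unlink() {
        if (_ptr->head == this) _ptr->head = (_next == this) ? nullptr : _next;
        _prev->_next = _next;
        _next->_prev = _prev;
    }
};

// Field-by-field equality, i.e. what the x/4gx dumps compare by eye.
bool same_bits(const tracked_ptr& a, const tracked_ptr& b) {
    return a._ptr == b._ptr && a._next == b._next &&
           a._prev == b._prev && a._generation == b._generation;
}

struct demo_result {
    bool proper_copy_consistent;
    bool proper_copy_same_bits;
    bool bitwise_copy_consistent;
    bool bitwise_copy_same_bits;
};

demo_result run_demo() {
    pointee obj;
    tracked_ptr a(&obj);
    tracked_ptr b(a);                 // goes through the copy constructor
    tracked_ptr c(bitwise_tag{}, b);  // stands in for the suspected bad copy
    return {a.consistent() && b.consistent(), same_bits(a, b),
            c.consistent(), same_bits(b, c)};
}
```

In this sketch a proper copy always differs from its source in _generation and in its list links, so it can never compare equal field by field, while the simulated bitwise copy compares equal to its source and fails the neighbour check. That is the same pair of observations as in the dump above: identical words at two different addresses, and a corrupted list detected on the next copy.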