Alexey Serbin has submitted this change and it was merged. (
http://gerrit.cloudera.org:8080/23464 )
Change subject: [thirdparty] Patch breakpad to fix minidump stacks for 64KB
pages
......................................................................
[thirdparty] Patch breakpad to fix minidump stacks for 64KB pages
Merged breakpad fix from Impala.
Testing:
I started a RHEL 9.6 VM and updated the core to 64KB pages.
Built Kudu and caused it to crash. The minidump was missing
stacks. After applying the patch and crashing a tserver
again, crashdump can be read with gdb (after calling
minidump-2-core on it).
Original Impala message:
IMPALA-13472: Patch breakpad to fix minidump stacks for 64KB pages
When resolving minidumps from Redhat8 ARM machines, we found
that the stack traces were truncated, with only the top frame
available. The minidump_stackwalk output included this error
message:
minidump.cc:1391: INFO: MinidumpMemoryRegion request out of range:
0xfffee0aadf30+8/0xfffee0aa0000+0x8000
stackwalker_arm64.cc:dc: ERROR: Unable to read caller_fp from last_fp:
0xfffee0aadf30
The 0x8000 is 32KB, so the minidump contains 32KB of stack, but
that 32KB doesn't cover the memory pointed to by the stack pointer
register.
This is due to this logic in Breakpad's LinuxDumper::GetStackInfo():
// Move the stack pointer to the bottom of the page that it's in.
const uintptr_t page_size = getpagesize();
uint8_t* const stack_pointer =
reinterpret_cast<uint8_t*>(int_stack_pointer & ~(page_size - 1));
When Breakpad moves the stack pointer to the bottom of the page,
it is including unused stack memory (as the stack is top to bottom).
With a 64KB page size, since it only copies 32KB of the stack memory,
this memory usually doesn't contain any used stack memory. This
lines up with the observations from trying to process the minidump.
It is unclear why Breakpad moves the stack pointer to the bottom
of the page to include unused stack memory. There is a 128 byte
red zone on x86_64 which leaf functions can use without changing
the stack pointer, but that does not require the whole page (and
the rounding logic would not include it when near the boundary).
This changes the logic to explicitly include the 128 byte red
zone (respecting the boundary of the memory region). It then
rounds off the stack pointer to a 1KB boundary rather than a
whole OS page. This may not be necessary, but it is mostly
harmless and can be optimized more later.
Testing:
- Linked Impala with the modified Breakpad, sent SIGUSR1 to produce
a minidump, then resolved the minidump. ARM64 with 64KB pages
produces stack traces. x86_64 also continues to work.
Change-Id: I5a512e65618063ab78c7ec993a24ba11df724d5f
Reviewed-on: http://gerrit.cloudera.org:8080/23463
Reviewed-by: Zoltan Chovan <[email protected]>
Tested-by: Alexey Serbin <[email protected]>
Reviewed-by: Alexey Serbin <[email protected]>
(cherry picked from commit 8d453f03ab08d5fec0a287cd8cd3927158c168fb)
Reviewed-on: http://gerrit.cloudera.org:8080/23464
Reviewed-by: Abhishek Chennaka <[email protected]>
---
M thirdparty/download-thirdparty.sh
A thirdparty/patches/breakpad-64k-pages-stack-collection.patch
2 files changed, 48 insertions(+), 2 deletions(-)
Approvals:
Abhishek Chennaka: Looks good to me, approved
Alexey Serbin: Verified
--
To view, visit http://gerrit.cloudera.org:8080/23464
To unsubscribe, visit http://gerrit.cloudera.org:8080/settings
Gerrit-Project: kudu
Gerrit-Branch: branch-1.18.x
Gerrit-MessageType: merged
Gerrit-Change-Id: I5a512e65618063ab78c7ec993a24ba11df724d5f
Gerrit-Change-Number: 23464
Gerrit-PatchSet: 2
Gerrit-Owner: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Abhishek Chennaka <[email protected]>
Gerrit-Reviewer: Alexey Serbin <[email protected]>
Gerrit-Reviewer: Kudu Jenkins (120)
Gerrit-Reviewer: Zoltan Martonka <[email protected]>