I did another round of debugging, and have some new findings to report.

To start with, I put a breakpoint on OnNoMemoryInternal(). That works
better than trying to catch the SIGILL. However, this failure mode has
been relatively infrequent with my modified 126.0.6478.126 build.

More common lately has been a straight segfault related to Mojo that
invariably brings down the entire browser. Here is a typical example:

    Thread 1 "chromium" received signal SIGSEGV, Segmentation fault.
    [Switching to Thread 0x7f92e4ff8480 (LWP 876682)]
    0x000055c72dfed9c2 in mojo::InterfaceEndpointClient::HandleIncomingMessageThunk::Accept(mojo::Message*) ()
    (gdb) bt
    #0  0x000055c72dfed9c2 in mojo::InterfaceEndpointClient::HandleIncomingMessageThunk::Accept(mojo::Message*) ()
    #1  0x000055c72dff5314 in mojo::MessageDispatcher::Accept(mojo::Message*) ()
    #2  0x000055c72dfefbed in mojo::InterfaceEndpointClient::HandleIncomingMessage(mojo::Message*) ()
    #3  0x000055c72dff9496 in mojo::internal::MultiplexRouter::ProcessIncomingMessage(mojo::internal::MultiplexRouter::MessageWrapper*, mojo::internal::MultiplexRouter::ClientCallBehavior, base::SequencedTaskRunner*) ()
    #4  0x000055c72dff8c73 in mojo::internal::MultiplexRouter::Accept(mojo::Message*) ()
    #5  0x000055c72dff5314 in mojo::MessageDispatcher::Accept(mojo::Message*) ()
    #6  0x000055c72dfeb74e in mojo::Connector::DispatchMessage(mojo::ScopedHandleBase<mojo::MessageHandle>) ()
    #7  0x000055c72dfebeda in mojo::Connector::ReadAllAvailableMessages() ()
    #8  0x000055c72d7e03ff in base::TaskAnnotator::RunTaskImpl(base::PendingTask&) ()
    #9  0x000055c72d801667 in base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::DoWork() ()
    #10 0x000055c72d873e6a in base::(anonymous namespace)::WorkSourceDispatch(_GSource*, int (*)(void*), void*) ()
    #11 0x00007f92e80b97a9 in g_main_context_dispatch ()
       from /lib/x86_64-linux-gnu/libglib-2.0.so.0
    #12 0x00007f92e80b9a38 in ?? () from /lib/x86_64-linux-gnu/libglib-2.0.so.0
    #13 0x00007f92e80b9acc in g_main_context_iteration ()
       from /lib/x86_64-linux-gnu/libglib-2.0.so.0
    #14 0x000055c72d872c00 in base::MessagePumpGlib::Run(base::MessagePump::Delegate*) ()
    #15 0x000055c72d802190 in base::sequence_manager::internal::ThreadControllerWithMessagePumpImpl::Run(bool, base::TimeDelta) ()
    #16 0x000055c72d7c18a9 in base::RunLoop::Run(base::Location const&) ()
    #17 0x000055c72adc4fda in content::BrowserMainLoop::RunMainMessageLoop() ()
    #18 0x000055c72adc114d in content::BrowserMain(content::MainFunctionParams) ()
    #19 0x000055c72ca25b49 in content::ContentMainRunnerImpl::RunBrowser(content::MainFunctionParams, bool) ()
    #20 0x000055c72ca25637 in content::ContentMainRunnerImpl::Run() ()
    #21 0x000055c72ca221a4 in content::ContentMain(content::ContentMainParams) ()
    #22 0x000055c7284eafc5 in ChromeMain ()
    #23 0x00007f92e644624a in __libc_start_call_main (
        main=main@entry=0x55c7284eac60 <main>, argc=argc@entry=8,
        argv=0x7ffdc854dc48, argv@entry=0xec0002740e0)
        at ../sysdeps/nptl/libc_start_call_main.h:58
    #24 0x00007f92e6446305 in __libc_start_main_impl (main=0x55c7284eac60 <main>,
        argc=8, argv=0xec0002740e0, init=<optimized out>, fini=<optimized out>,
        rtld_fini=<optimized out>, stack_end=0x7ffdc854dc38)
        at ../csu/libc-start.c:360
    #25 0x000055c728167021 in _start ()

Right before the crash, I get a message on the terminal like one of these:

    [885352:885352:0709/095917.737560:ERROR:interface_endpoint_client.cc(722)] Message 0 rejected by interface viz.mojom.Gpu

    [890042:890062:0709/222611.345773:ERROR:interface_endpoint_client.cc(722)] Message 0 rejected by interface blink.mojom.Blob
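To see whether the rejections cluster on particular interfaces across runs, the terminal output can be captured to a file and tallied. The following is only a rough Python sketch, not anything from Chromium: the function name `rejected_interfaces` is made up for illustration, and the regular expression assumes the log prefix always has the same shape as the two samples above.

```python
import re
from collections import Counter

# Assumed line shape, taken from the two samples above:
#   [<n>:<n>:<MMDD>/<HHMMSS.usec>:ERROR:interface_endpoint_client.cc(<line>)]
#   Message <n> rejected by interface <name>
# The \s* after "]" also tolerates a line break there, so output that was
# hard-wrapped by a terminal or mail client still matches.
REJECT_RE = re.compile(
    r"\[\d+:\d+:\d{4}/\d{6}\.\d+:ERROR:interface_endpoint_client\.cc\(\d+\)\]\s*"
    r"Message \d+ rejected by interface (?P<iface>[\w.]+)"
)


def rejected_interfaces(log_text):
    """Count 'rejected by interface' errors per interface name."""
    return Counter(m.group("iface") for m in REJECT_RE.finditer(log_text))
```

Feeding it the contents of a captured log, for example `rejected_interfaces(open("chromium.log").read())` with a hypothetical file name, gives a per-interface count that can be compared between crashes.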

I suspect this is a bug that is being tickled by the memory pressure
(because otherwise everyone would be complaining about a crashing
browser). I could use some guidance on what to poke at in GDB/Chromium
to get some useful information out.

One other oddity: I often keep the browser's Task Manager window running
on the side, and I've noticed numerous cases where the "Browser"
process's "Memory footprint" column hovers around 350 MB, then spikes to
~850 MB for several seconds, then drops back down to ~350 MB. This
happens with no visible activity in the browser that could explain it,
like the loading of a new page. Memory allocations fail very easily
while this stat is elevated, as you'd expect.
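To timestamp these spikes so they can be lined up against the crashes, the browser process can be sampled from outside via /proc. The following is a minimal Python sketch under some loud assumptions: it treats `RssAnon` + `VmSwap` from `/proc/<pid>/status` as an approximation of the "Memory footprint" column (the two may not match exactly), the 600 MB threshold is just a value picked between the ~350 MB and ~850 MB levels described above, and the function names are made up for illustration.

```python
import time


def parse_status_kb(status_text):
    """Return {field: value_in_kB} for the kB-valued fields of /proc/<pid>/status."""
    fields = {}
    for line in status_text.splitlines():
        name, _, rest = line.partition(":")
        parts = rest.split()
        if len(parts) == 2 and parts[1] == "kB" and parts[0].isdigit():
            fields[name] = int(parts[0])
    return fields


def footprint_mb(fields):
    # Assumption: anonymous RSS plus swapped-out memory approximates the
    # Task Manager's "Memory footprint" column.
    return (fields.get("RssAnon", 0) + fields.get("VmSwap", 0)) / 1024.0


def watch(pid, threshold_mb=600.0, interval_s=0.5):
    """Print a timestamped line whenever the footprint crosses the threshold.

    Runs until interrupted; raises FileNotFoundError once the process exits.
    """
    above = False
    while True:
        with open(f"/proc/{pid}/status") as f:
            mb = footprint_mb(parse_status_kb(f.read()))
        if (mb >= threshold_mb) != above:
            above = not above
            label = "spike start" if above else "spike end"
            stamp = time.strftime("%H:%M:%S")
            print(f"{stamp} {label}: {mb:.0f} MB", flush=True)
        time.sleep(interval_s)
```

Calling `watch(pid)` with the browser process's PID in a separate terminal prints one line at the start and end of each excursion, which can then be compared with the timestamps in the Chromium log output.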
