2017-12-06 15:52 GMT+01:00 George Joseph <[email protected]>:

> On Tue, Dec 5, 2017 at 9:20 AM, Olivier <[email protected]> wrote:
>
>> Hello,
>>
>> I carefully read [1], which details how backtrace files can be produced.
>>
>> Maybe this seems natural to some, but how can I go one step further and
>> check that the produced XXX-thread1.txt, XXX-brief.txt, ... files are OK?
>>
>> In other words, where can I find an example of how to use one of those
>> files and check for myself that, if a system ever fails, I won't have to
>> wait for another failure to provide the required data to support teams?
>
> It's a great question but I could spend a week answering it and not
> scratch the surface. :)

Thanks very much for trying, anyway ;-)

> It's not a straightforward thing unless you know the code in question.
> The most common is a segmentation fault (segfault or SEGV).

True! I experienced segfaults lately, and I could not configure the platform I was using then (Debian Jessie) to produce core files in a directory Asterisk can write into.
Now, with Debian Stretch, I can produce a core file at will (with kill -s SIGSEGV <processid>).
I checked that ast_coredumper worked OK, as it produced thread1.txt files and so on.

Ideally, I would like to go one step further: check now that a future .txt file would be "workable" (and not be told "you should have compiled with option XXX or configured with option YYY").

> In that case, the thread1.txt file is the place to start. Since most of
> the objects passed around are really pointers to objects, the most obvious
> cause would be a 0x0 for a value. So for instance "chan=0x0". That would
> be a pointer to a channel object that was not set when it probably should
> have been. Unfortunately, it's not only 0x0 that could cause a segv.
> Anytime a program tries to access memory it doesn't own, that signal is
> raised. So let's say there's a 256-byte buffer which the process owns. If
> there's a bug somewhere that causes the program to try and access bytes
> beyond the end of the buffer, you MAY get a segv if that process doesn't
> also own that memory. In this case, the backtrace won't show anything
> obvious because the pointers all look valid. There probably would be an
> index variable (i or ix, etc) that may be set to 257 but you'd have to know
> that the buffer was only 256 bytes to realize that that was the issue.

So, with an artificial kill -s SIGSEGV <processid>, does the output below prove that I have workable .txt files? (Having .txt files that let people find the root cause of the issue is another story, as we can probably only hope for the best there.)

# head core-brief.txt
!@!@!@! brief.txt !@!@!@!

Thread 38 (Thread 0x7f2aa5dd0700 (LWP 992)):
#0 pthread_cond_timedwait@@GLIBC_2.3.2 () at ../sysdeps/unix/sysv/linux/x86_64/pthread_cond_timedwait.S:225
#1 0x000055cdcb69ae84 in __ast_cond_timedwait (filename=0x55cdcb7d4910 "threadpool.c", lineno=1131, func=0x55cdcb7d4ea8 <__PRETTY_FUNCTION__.8978> "worker_idle", cond_name=0x55cdcb7d4b7f "&worker->cond", mutex_name=0x55cdcb7d4b71 "&worker->lock", cond=0x7f2abc000978, t=0x7f2abc0009a8, abstime=0x7f2aa5dcfc30) at lock.c:668
#2 0x000055cdcb75d153 in worker_idle (worker=0x7f2abc000970) at threadpool.c:1131
#3 0x000055cdcb75ce61 in worker_start (arg=0x7f2abc000970) at threadpool.c:1022
#4 0x000055cdcb769a8c in dummy_start (data=0x7f2abc000a80) at utils.c:1238
#5 0x00007f2aeddad494 in start_thread (arg=0x7f2aa5dd0700) at pthread_create.c:333

> Deadlocks are even harder to troubleshoot. For that, you need to look at
> full.txt to see where the threads are stuck and find the 1 thread that's
> holding the lock that the others are stuck on.
>
> Sorry. I wish I had a better answer because it'd help a lot if folks
> could do more investigation themselves.
>
>> Best regards
>>
>> [1] https://wiki.asterisk.org/wiki/display/AST/Getting+a+Backtrace
>>
>> --
>> _____________________________________________________________________
>> -- Bandwidth and Colocation Provided by http://www.api-digital.com --
>>
>> Check out the new Asterisk community forum at:
>> https://community.asterisk.org/
>>
>> New to Asterisk? Start here:
>> https://wiki.asterisk.org/wiki/display/AST/Getting+Started
>>
>> asterisk-users mailing list
>> To UNSUBSCRIBE or update options visit:
>> http://lists.digium.com/mailman/listinfo/asterisk-users
>
> --
> George Joseph
> Digium, Inc. | Software Developer
> 445 Jan Davis Drive NW - Huntsville, AL 35806 - US
> Check us out at: www.digium.com & www.asterisk.org
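
One rough way to check whether a brief.txt like the excerpt in this message is "workable" is to count how many frames carry a source file and line number, and how many gdb could only print as "??". The sketch below is a heuristic only and is not part of Asterisk or ast_coredumper: the name summarize_backtrace is made up for illustration, and it assumes each frame sits on its own line starting with "#<n>", as in gdb's usual output.

```python
import re

# A frame line in gdb backtrace output starts with "#<n>".
FRAME_RE = re.compile(r"^#\d+\s")
# Source location as gdb prints it when line information is available,
# for example "at threadpool.c:1131".
LOCATION_RE = re.compile(r"\bat \S+:\d+\s*$")


def summarize_backtrace(text):
    """Count frames, frames with file:line info, and frames gdb could not name.

    Hypothetical helper: a high share of "with_location" frames suggests the
    binary had usable debug information; many "unresolved" frames suggest
    that symbols were missing.
    """
    total = with_location = unresolved = 0
    for line in text.splitlines():
        line = line.strip()
        if not FRAME_RE.match(line):
            continue  # thread headers, blank lines, banners
        total += 1
        if LOCATION_RE.search(line):
            with_location += 1
        if "?? ()" in line:
            unresolved += 1
    return {
        "frames": total,
        "with_location": with_location,
        "unresolved": unresolved,
    }
```

It could be called as summarize_backtrace(open("core-brief.txt").read()). In the thread 38 excerpt, every frame ends in an "at file:line" location, which is what a build with debug information typically produces. This says nothing about whether the backtrace reveals the root cause of a real crash.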
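
For the deadlock case George describes, a first pass over full.txt can at least list which threads have a lock-wait function near the top of their stack. This is a minimal sketch under stated assumptions, not an Asterisk tool: threads_waiting_on_locks and LOCK_WAIT_MARKERS are hypothetical names, the marker list is an assumption that may need adjusting for a given glibc or Asterisk build, and the input is assumed to have one "Thread N (...)" header per thread followed by one frame per line. It does not find the thread that holds the lock; that still requires reading the code.

```python
import re

# Thread header as gdb prints it, e.g. "Thread 38 (Thread 0x7f2a... (LWP 992)):".
THREAD_RE = re.compile(r"^Thread (\d+) \(")
# Frame line: "#<n>", an optional "0x<addr> in ", then the function name.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-f]+ in )?(\S+) \(")
# Assumed substrings of function names that indicate waiting on a mutex.
LOCK_WAIT_MARKERS = ("__lll_lock_wait", "pthread_mutex_lock")


def threads_waiting_on_locks(text, depth=3):
    """Map thread number -> lock-wait functions seen in its top `depth` frames."""
    waiting = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        header = THREAD_RE.match(line)
        if header:
            current = int(header.group(1))
            continue
        frame = FRAME_RE.match(line)
        if frame is None or current is None:
            continue
        if int(frame.group(1)) >= depth:
            continue  # only look near the top of the stack
        func = frame.group(2)
        if any(marker in func for marker in LOCK_WAIT_MARKERS):
            waiting.setdefault(current, []).append(func)
    return waiting
```

A thread parked in pthread_cond_timedwait, like thread 38 in the excerpt, is not reported, since an idle worker waiting on a condition variable is normal. Several threads reported at once would only be a hint about where to start reading full.txt.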
