Hi guys.

On one of the masters, SSSD has recently started turning up dead, and the log says the killer is the WATCHDOG, but I'm not sure about that.
From sssd.log:
...
********************** BACKTRACE DUMP ENDS HERE *********************************

(2022-07-21  7:11:01): [sssd] [svc_child_info] (0x0020): Child [991] ('pac':'pac') was terminated by own WATCHDOG
   *  ... skipping repetitive backtrace ...
(2022-07-21  7:11:14): [sssd] [svc_child_info] (0x0020): Child [984] ('abba.xx.priv.yy':'%BE_abba.xx.priv.yy') was terminated by own WATCHDOG
   *  ... skipping repetitive backtrace ...
(2022-07-21  7:11:14): [sssd] [svc_child_info] (0x0040): Child [9744] ('nss':'nss') exited with code [3]
********************** PREVIOUS MESSAGE WAS TRIGGERED BY THE FOLLOWING BACKTRACE:
   *  (2022-07-21  7:11:14): [sssd] [sbus_dispatch_reconnect] (0x0400): Connection lost. Terminating active requests.
   *  (2022-07-21  7:11:14): [sssd] [sbus_dispatch_reconnect] (0x4000): Remote client terminated the connection. Releasing data...
   *  (2022-07-21  7:11:14): [sssd] [sbus_connection_free] (0x4000): Connection 0x5576314d9180 will be freed during next loop!
   *  (2022-07-21  7:11:14): [sssd] [mt_svc_restart] (0x0400): Scheduling service abba.xx.priv.yy for restart 1
   *  (2022-07-21  7:11:14): [sssd] [get_provider_config] (0x0100): Formed command '/usr/libexec/sssd/sssd_be --domain abba.xx.priv.yy --uid 0 --gid 0 --logger=files' for provider '%BE_abba.xx.priv.yy'
   *  (2022-07-21  7:11:14): [sssd] [start_service] (0x0100): Queueing service abba.xx.priv.yy for startup
   *  (2022-07-21  7:11:14): [sssd] [mt_svc_exit_handler] (0x1000): SIGCHLD handler of service nss called
   *  (2022-07-21  7:11:14): [sssd] [svc_child_info] (0x0040): Child [9744] ('nss':'nss') exited with code [3]
********************** BACKTRACE DUMP ENDS HERE *********************************

(2022-07-21  7:11:14): [sssd] [svc_child_info] (0x0040): Child [9758] ('pac':'pac') exited with code [3]
   *  ... skipping repetitive backtrace ...
(2022-07-21  7:11:16): [sssd] [svc_child_info] (0x0040): Child [9876] ('nss':'nss') exited with code [3]
   *  ... skipping repetitive backtrace ...
(2022-07-21  7:11:16): [sssd] [svc_child_info] (0x0040): Child [9877] ('pac':'pac') exited with code [3]
   *  ... skipping repetitive backtrace ...
(2022-07-21  7:11:20): [sssd] [svc_child_info] (0x0040): Child [9903] ('nss':'nss') exited with code [3]
   *  ... skipping repetitive backtrace ...
(2022-07-21  7:11:20): [sssd] [monitor_restart_service] (0x0010): Process [nss], definitely stopped!
(2022-07-21  7:11:20): [sssd] [monitor_quit] (0x3f7c0): Returned with: 1
(2022-07-21  7:11:20): [sssd] [monitor_quit] (0x3f7c0): Terminating [pac][9904]
(2022-07-21  7:11:21): [sssd] [monitor_quit] (0x3f7c0): Child [pac] terminated with a signal
(2022-07-21  7:11:21): [sssd] [monitor_quit] (0x3f7c0): Terminating [abba.xx.priv.yy][9875]
(2022-07-21  7:11:21): [sssd] [monitor_quit] (0x3f7c0): Child [abba.xx.priv.yy] exited gracefully
(2022-07-21  7:11:21): [sssd] [monitor_quit] (0x3f7c0): Terminating [sudo][990]
(2022-07-21  7:11:21): [sssd] [monitor_quit] (0x3f7c0): Child [sudo] exited gracefully
(2022-07-21  7:11:21): [sssd] [monitor_quit] (0x3f7c0): Terminating [ssh][989]
(2022-07-21  7:11:21): [sssd] [monitor_quit] (0x3f7c0): Child [ssh] exited gracefully
(2022-07-21  7:11:21): [sssd] [monitor_quit] (0x3f7c0): Terminating [ifp][988]
(2022-07-21  7:11:21): [sssd] [monitor_quit] (0x3f7c0): Child [ifp] exited gracefully
(2022-07-21  7:11:21): [sssd] [monitor_quit] (0x3f7c0): Terminating [pam][987]
(2022-07-21  7:11:21): [sssd] [monitor_quit] (0x3f7c0): Child [pam] exited gracefully
(2022-07-21  7:11:21): [sssd] [monitor_quit] (0x3f7c0): Terminating [implicit_files][983]
(2022-07-21  7:11:21): [sssd] [monitor_quit] (0x3f7c0): Child [implicit_files] exited gracefully

This "death" happens randomly, as far as I can tell. It can happen right after a reboot or after several hours of uptime. There is more in the log files under /var/log/sssd, but before I clutter the list with more log snippets, I was hoping an expert could share some thoughts.
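
In case it helps anyone compare against their own logs, here is a rough sketch of how the svc_child_info lines above could be tallied per child and per cause (watchdog vs. exit code). The regex is only inferred from the lines in this excerpt, and the names summarize and summarize_file are made up for illustration, so treat it as a starting point rather than anything official.

```python
import re
from collections import Counter

# Pattern inferred from the svc_child_info lines in the excerpt above;
# other SSSD versions may format these messages differently (assumption).
CHILD_RE = re.compile(
    r"\((?P<ts>[^)]+)\): \[sssd\] \[svc_child_info\] \(0x[0-9a-f]+\): "
    r"Child \[(?P<pid>\d+)\] \('(?P<name>[^']+)':'[^']+'\) "
    r"(?:was terminated by own WATCHDOG|exited with code \[(?P<code>\d+)\])"
)


def summarize(lines):
    """Count child terminations by (child name, reason)."""
    counts = Counter()
    for line in lines:
        # Backtrace replay lines start with "*" and repeat earlier
        # messages, so skip them to avoid double counting.
        if line.lstrip().startswith("*"):
            continue
        m = CHILD_RE.search(line)
        if m is None:
            continue
        code = m.group("code")
        reason = "watchdog" if code is None else "exit " + code
        counts[(m.group("name"), reason)] += 1
    return counts


def summarize_file(path):
    """Hypothetical helper: run summarize() over a log file on disk."""
    with open(path, encoding="utf-8", errors="replace") as fh:
        return summarize(fh)
```

Calling summarize_file("/var/log/sssd/sssd.log") and printing the resulting Counter would show which child goes first and how often, which may make it easier to tell whether the watchdog kills are the cause or a symptom.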

many thanks, L.