On Mon, 11 Jul 2016 at 11:07:07 +0200, Yves-Alexis Perez wrote:
> > Misc error when trying to watch fd 4: Invalid argument
> > Unable to add reload watch to main loopthis watch should have been
> > invalidatedMisc error when trying to watch fd 5: Invalid argument
> > Unable to add reload watch to main loop: (null)
>
> The message above seems to come from socket_set_epoll_add
> (https://dbus.freedesktop.org/doc/api/html/dbus-socket-set-epoll_8c_source.html#l00137)
> where epoll_ctl sets errno to EINVAL which means self->epfd is not an epoll
> file descriptor.

Something like that, yes. I'm really confused by this: it looks like it
might be a dbus-daemon bug, but so far I can't see why it would happen,
or how upgrading lightdm could trigger it.

The sequence of events here is:

* We create the epoll fd in _dbus_socket_set_epoll_new(). To have got a
  non-NULL result, we must have epfd != -1; and if we'd got NULL, we'd
  have failed already (_dbus_socket_set_new, _dbus_loop_new,
  bus_context_new and main all seem to catch errors correctly).

* dbus-daemon (bus/dir-watch-inotify.c) creates an inotify fd (fd 4)
  but then fails to add it to the main loop: _dbus_loop_add_watch()
  results in socket_set_epoll_add() which fails with EINVAL. Watching
  directories is considered non-critical so this is ignored.

* dbus-daemon (bus/main.c) creates what should probably have been a
  pipe-to-self to catch SIGHUP and SIGTERM from its own signal handler,
  but for historical reasons it's a socketpair() instead. The end we use
  as the read end is fd 5. socket_set_epoll_add() fails again. Catching
  our own SIGHUP and SIGTERM is considered to be critical (and really
  shouldn't fail!) so we exit.

I wonder whether there are other reasons why epoll_ctl can report
EINVAL?

I also wonder whether the new lightdm is starting dbus-launch with a
different value for some arbitrary kernel limit, or whether your
previous session leaked some fds resulting in dbus-launch coming up
with 90% of an arbitrary limit already in use, or something like that?

The error reporting is definitely also bad here - unfortunately we're
constrained by 2005-era code that is partly designed for "autolaunching"
(spawning a dbus-daemon on-demand when you run a single GNOME/KDE app in
a twm session without proper distro D-Bus integration), which is why
stderr gets silenced. The autolaunch code path and the "proper distro
integration" code path both use dbus-launch, unless you're living in the
dbus-user-session future.

    S
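
On the epoll_ctl question above, here is a rough, Linux-only sketch rather
than anything taken from dbus. As far as epoll_ctl(2) documents it, EINVAL
covers at least two cases that are easy to reproduce: epfd is a valid fd but
not an epoll instance, and fd is the same as epfd. Running into the per-user
watch limit is documented as ENOSPC instead, so a kernel limit on its own
would not obviously explain EINVAL. All function names below are
hypothetical helpers invented for this illustration.

```c
#include <assert.h>
#include <errno.h>
#include <string.h>
#include <sys/epoll.h>
#include <sys/socket.h>
#include <unistd.h>

/* Hypothetical helper (not part of dbus): try to watch `fd` for reading
 * on `epfd`. Returns 0 on success, or the errno epoll_ctl() reported. */
static int try_epoll_add(int epfd, int fd)
{
    struct epoll_event ev;

    memset(&ev, 0, sizeof ev);
    ev.events = EPOLLIN;
    ev.data.fd = fd;

    if (epoll_ctl(epfd, EPOLL_CTL_ADD, fd, &ev) == 0)
        return 0;

    return errno;
}

/* Case 1: "epfd" is a valid fd (a socket), but not an epoll instance. */
static int add_to_non_epoll_fd(void)
{
    int sv[2];
    int result;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    result = try_epoll_add(sv[0], sv[1]);
    close(sv[0]);
    close(sv[1]);
    return result;
}

/* Case 2: a real epoll instance is asked to watch itself. */
static int add_epoll_fd_to_itself(void)
{
    int epfd = epoll_create1(EPOLL_CLOEXEC);
    int result;

    if (epfd < 0)
        return -1;

    result = try_epoll_add(epfd, epfd);
    close(epfd);
    return result;
}

/* Control: a real epoll instance watching one end of a socketpair. */
static int add_socket_to_epoll_fd(void)
{
    int sv[2];
    int epfd;
    int result;

    if (socketpair(AF_UNIX, SOCK_STREAM, 0, sv) != 0)
        return -1;

    epfd = epoll_create1(EPOLL_CLOEXEC);
    if (epfd < 0) {
        close(sv[0]);
        close(sv[1]);
        return -1;
    }

    result = try_epoll_add(epfd, sv[0]);
    close(epfd);
    close(sv[0]);
    close(sv[1]);
    return result;
}

/* Run all three cases; assert() aborts if the kernel behaves differently. */
static void check_epoll_add_cases(void)
{
    assert(add_to_non_epoll_fd() == EINVAL);
    assert(add_epoll_fd_to_itself() == EINVAL);
    assert(add_socket_to_epoll_fd() == 0);
}
```

Calling check_epoll_add_cases() from a small main() is enough to try this
out. The control case matters here: a socketpair end is normally a
perfectly good thing to add to an epoll instance, which is why the failure
on fd 5 points at the epoll fd rather than at the fd being watched.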