Package: lprng Version: 3.8.B-2.1 Severity: normal Dear Maintainer,
* What led up to the situation? Upgraded printer server to Stretch. After an hour, all printing failed. Restarted lprng, it worked for a while, then all printing failed. * What exactly did you do (or not do) that was effective (or ineffective)? Printed, then invalidated cache, then printed again. * What was the outcome of this action? No paper. * What outcome did you expect instead? Paper. Necessary components to trigger this bug: 1) lprng 2) possibly ifhp, not entirely sure 3) sssd >=1.13, with fast memory cache support for initgroups (that lprng uses) 4) nss using sss 5) cache flushed (sss_cache -xxx) or sssd restarted This was a tricky one. TL;DR, sssd's newly implemented fast memory cache for initgroups combined with "simulated close_on_exec" in lpd and an sssd restart / cache invalidation causes lpd and sssd to stomp all over each others file descriptors, closing them for each other (in the bad way), causing printing to fail. More verbose (some details/orders are possibly wrong, but the general sense is what seems to happen after comparing source + strace's): sssd has something called fast memory cache, lightly described at https://jhrozek.wordpress.com/2015/03/11/anatomy-of-sssd-user-lookup/ (the orange box, #2) meaning sssd opens files within the client process without client knowing about it. In lprng/src/common/lpd_worker.c, in Make_lpd_call, we have: /* close other ones to simulate close_on_exec() */ n = Max_fd+10; for( i = passfd_count ; i < n; ++i ){ close(i); } In sssd/src/sss_client/nss_mc_common.c, in sss_nss_mc_destroy_ctx, we have: if (ctx->fd != -1) { close(ctx->fd); } What normally happens, roughly: lpd starts and does some nss work, nss_sss opens an extra file to /var/lib/sss/mc/initgroups inside lpd (without lpd knowing), then it's time for lpd to fork and closes that initgroups file (without nss_sss noticing) due to simulated close_on_exec. Then starts calling ifhp, making network connections etc. Sockets are created, file descriptors are dup2()'d to the left and right and then data is transmitted to the printer. All is well. If I then call sss_cache -E/-somethingelse or restart sssd, invalidating the fast memory cache - meaning /var/lib/sss/mc/initgroups is deleted and try again.. Then after the fork and after connections to the printer are made, nss notices that the cache is invalid and munmap()'s it and closes the filedescriptor it used to have. Unfortunately, that fd now belongs to the printer network socket due to number reuse. New unix sockets are created with nss calls, getting into that fd instead. lpd wants to send data to the printer, but instead it goes to /var/lib/sss/pipes/nss - which does not like PJL as much as printers do. That socket does not respond PJL:y and the network socket is closed and dismissed with an error. Solution: Make lpd actually use close on exec and/or only close the file descriptors it has actually created itself. This is probably some work and this bug probably affects very few, but I did want to get it documented. Workaround: # cat /etc/systemd/system/lprng.service.d/sssd_memcache.conf [Service] Environment=SSS_NSS_USE_MEMCACHE=NO -- System Information: Debian Release: 9.2 APT prefers stable-updates APT policy: (500, 'stable-updates'), (500, 'stable') Architecture: amd64 (x86_64) Foreign Architectures: i386 Kernel: Linux 4.9.0-4-amd64 (SMP w/4 CPU cores) Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), LANGUAGE=en_US.UTF-8 (charmap=UTF-8) Shell: /bin/sh linked to /bin/bash Init: systemd (via /run/systemd/system) Versions of packages lprng depends on: ii debconf [debconf-2.0] 1.5.61 ii libc6 2.24-11+deb9u1 ii libcomerr2 1.43.4-2 ii libk5crypto3 1.15-1+deb9u1 ii libkrb5-3 1.15-1+deb9u1 ii libssl1.1 1.1.0f-3+deb9u1 ii lsb-base 9.20161125 lprng recommends no packages. Versions of packages lprng suggests: pn lprng-doc <none> pn magicfilter <none> -- Configuration Files: /etc/lprng/lpd.conf changed: default_printer=NOPRINTER default_remote_host=printserver.cs.umu.se lf=/var/log/lprng.log lpd_printcap_path= /var/conf/lprng/lpd_printcap mail_from=Print master<r...@cs.umu.se> max_accounting_file_size=0 max_log_file_size=0 max_servers_active=128 mc=200 minfree=50000 mx=1000000 printcap_path= /var/conf/lprng/printcap sendmail=/usr/lib/sendmail -oi -t user_lpc=hold,release,move,abort,kill -- debconf information excluded