Package: lprng
Version: 3.8.B-2.1
Severity: normal

Dear Maintainer,

   * What led up to the situation?

Upgraded printer server to Stretch. After an hour, all printing failed.
Restarted lprng, it worked for a while, then all printing failed.

   * What exactly did you do (or not do) that was effective (or
     ineffective)?

Printed, then invalidated cache, then printed again.

   * What was the outcome of this action?

No paper.

   * What outcome did you expect instead?

Paper.


Necessary components to trigger this bug:
1) lprng
2) possibly ifhp, not entirely sure
3) sssd >=1.13, with fast memory cache support for initgroups (that
   lprng uses)
4) nss using sss
5) cache flushed (sss_cache -xxx) or sssd restarted



This was a tricky one.

TL;DR, sssd's newly implemented fast memory cache for initgroups
combined with "simulated close_on_exec" in lpd and an sssd restart /
cache invalidation causes lpd and sssd to stomp all over each others
file descriptors, closing them for each other (in the bad way), causing
printing to fail.

More verbose (some details/orders are possibly wrong, but the general
sense is what seems to happen after comparing source + strace's):

sssd has something called fast memory cache, lightly described at
https://jhrozek.wordpress.com/2015/03/11/anatomy-of-sssd-user-lookup/
(the orange box, #2) meaning sssd opens files within the client process
without client knowing about it.


In lprng/src/common/lpd_worker.c, in Make_lpd_call, we have:
        /* close other ones to simulate close_on_exec() */
        n = Max_fd+10;
        for( i = passfd_count ; i < n; ++i ){
                close(i);
        }

In sssd/src/sss_client/nss_mc_common.c, in sss_nss_mc_destroy_ctx, we have:
    if (ctx->fd != -1) {
        close(ctx->fd);
    }

What normally happens, roughly:
lpd starts and does some nss work, nss_sss opens an extra file to
/var/lib/sss/mc/initgroups inside lpd (without lpd knowing), then it's
time for lpd to fork and closes that initgroups file (without nss_sss
noticing) due to simulated close_on_exec. Then starts calling ifhp,
making network connections etc. Sockets are created, file descriptors
are dup2()'d to the left and right and then data is transmitted to the
printer. All is well.

If I then call sss_cache -E/-somethingelse or restart sssd, invalidating
the fast memory cache - meaning /var/lib/sss/mc/initgroups is deleted
and try again..

Then after the fork and after connections to the printer are made, nss
notices that the cache is invalid and munmap()'s it and closes the
filedescriptor it used to have. Unfortunately, that fd now belongs to
the printer network socket due to number reuse. New unix sockets are
created with nss calls, getting into that fd instead. lpd wants to send
data to the printer, but instead it goes to /var/lib/sss/pipes/nss -
which does not like PJL as much as printers do. That socket does not
respond PJL:y and the network socket is closed and dismissed with an
error.


Solution:
Make lpd actually use close on exec and/or only close the file
descriptors it has actually created itself. This is probably some work
and this bug probably affects very few, but I did want to get it
documented.


Workaround:

# cat /etc/systemd/system/lprng.service.d/sssd_memcache.conf
[Service]
Environment=SSS_NSS_USE_MEMCACHE=NO



-- System Information:
Debian Release: 9.2
  APT prefers stable-updates
  APT policy: (500, 'stable-updates'), (500, 'stable')
Architecture: amd64 (x86_64)
Foreign Architectures: i386

Kernel: Linux 4.9.0-4-amd64 (SMP w/4 CPU cores)
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8), 
LANGUAGE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash
Init: systemd (via /run/systemd/system)

Versions of packages lprng depends on:
ii  debconf [debconf-2.0]  1.5.61
ii  libc6                  2.24-11+deb9u1
ii  libcomerr2             1.43.4-2
ii  libk5crypto3           1.15-1+deb9u1
ii  libkrb5-3              1.15-1+deb9u1
ii  libssl1.1              1.1.0f-3+deb9u1
ii  lsb-base               9.20161125

lprng recommends no packages.

Versions of packages lprng suggests:
pn  lprng-doc    <none>
pn  magicfilter  <none>

-- Configuration Files:
/etc/lprng/lpd.conf changed:
default_printer=NOPRINTER
default_remote_host=printserver.cs.umu.se
lf=/var/log/lprng.log
lpd_printcap_path= /var/conf/lprng/lpd_printcap
mail_from=Print master<r...@cs.umu.se>
max_accounting_file_size=0
max_log_file_size=0
max_servers_active=128
mc=200
minfree=50000
mx=1000000
printcap_path= /var/conf/lprng/printcap
sendmail=/usr/lib/sendmail -oi -t
user_lpc=hold,release,move,abort,kill


-- debconf information excluded

Reply via email to