Package: collectd
Version: 4.6.3-1
Severity: normal

In the default configuration, collectd's df plugin will collect
statistics for all devices, including NFS mounts and other network
filesystems.

If a laptop has collectd on it and is suspended via pm-suspend with some
NFS mounts, the following often happens:

- The network is taken down.
- collectd wakes up and calls statfs(2) on each mounted filesystem.
- The call hangs on the NFS mount, since the network is down.
- The kernel fails to suspend collectd, and the sleep fails.
- The accidentally still powered-on laptop melts in its bag or causes
  a plane to crash into the Hudson river. ;-)

When this happens, I see this in dmesg:

[12127.392595] runnable tasks:
[12127.392597]             task   PID         tree-key  switches  prio     exec-runtime         sum-exec        sum-sleep
[12127.392600] ----------------------------------------------------------------------------------------------------------
[12127.392669] R     pm-suspend 17121   1380454.770503        94   120               0               0               0.000000               0.000000               0.000000 /
[12127.392684] 
[12127.392720]  collectd
[12127.392814] 
[12127.392817] Restarting tasks ... done.

Unmounting the network filesystems before suspend is probably wise,
but it's easy to forget to do that and sometimes you just want to suspend
with a network filesystem mounted, since remounting it on resume would
be hard (i.e., it might need a password for sshfs).

So this is, arguably, a bug in collectd. It would be much
nicer if its default config could avoid accessing network filesystems.
IMHO.

Less IMHO, if I configure collectd like this:

# Only list local devices, no network stuff.
<Plugin df>
        Device "/dev/hda1"
        IgnoreSelected false
</Plugin>

It still calls statfs on every mounted filesystem, as this strace output
shows:

[pid 18790] statfs64("/", 84, {f_type="EXT2_SUPER_MAGIC", f_bsize=4096, f_blocks=7404175, f_bfree=1494139, f_bavail=1494139, f_files=3768320, f_ffree=3064836, f_fsid={-866525212, 419953648}, f_namelen=255, f_frsize=4096}) = 0
[pid 18794] read(7, "grep\0statfs\0", 4096) = 12
[pid 18790] statfs64("/lib/init/rw", 84, {f_type=0x1021994, f_bsize=4096, f_blocks=128263, f_bfree=128263, f_bavail=128263, f_files=128263, f_ffree=128258, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
[pid 18790] statfs64("/proc", 84, {f_type="PROC_SUPER_MAGIC", f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
[pid 18790] statfs64("/sys", 84, {f_type="SYSFS_MAGIC", f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
[pid 18790] statfs64("/dev", 84, {f_type=0x1021994, f_bsize=4096, f_blocks=2560, f_bfree=2524, f_bavail=2524, f_files=128263, f_ffree=127038, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
[pid 18790] statfs64("/dev/shm", 84, {f_type=0x1021994, f_bsize=4096, f_blocks=128263, f_bfree=128263, f_bavail=128263, f_files=128263, f_ffree=128262, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
[pid 18790] statfs64("/dev/pts", 84, {f_type="DEVPTS_SUPER_MAGIC", f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
[pid 18790] statfs64("/sys/fs/fuse/connections", 84, {f_type=0x65735543, f_bsize=4096, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={0, 0}, f_namelen=255, f_frsize=4096}) = 0
[pid 18790] statfs64("/home/joey/mnt", 84, {f_type=0x65735546, f_bsize=0, f_blocks=0, f_bfree=0, f_bavail=0, f_files=0, f_ffree=0, f_fsid={0, 0}, f_namelen=0, f_frsize=0}) = 0

And that *is* a bug, I'm sure. df.c runs this code much later than
seems to be necessary:

                if (ignorelist_match (il_fstype, mnt_ptr->type))
                        continue;
                if (ignorelist_match (il_mountpoint, mnt_ptr->dir))
                        continue;

I think that should definitely be moved above the statfs call. And
then I *suggest* the default config be changed to something like:

<Plugin df>
        FSType "nfs"
        IgnoreSelected true
</Plugin>
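
To illustrate the reordering proposed above (filter first, then statfs),
here is a minimal standalone sketch. It is not collectd's actual code:
is_ignored_fstype() is a hypothetical stand-in for ignorelist_match(),
query_local_filesystems() is a made-up name, and the list of ignored
types is only an example. It assumes a Linux system with getmntent(3).

```c
#include <mntent.h>
#include <stdio.h>
#include <string.h>
#include <sys/vfs.h>

/* Hypothetical stand-in for collectd's ignorelist_match(): returns 1 if
 * the filesystem type should never be touched, 0 otherwise. The list of
 * types here is an illustrative assumption. */
static int is_ignored_fstype(const char *fstype)
{
    static const char *const ignored[] = { "nfs", "nfs4", "cifs", "fuse.sshfs" };
    size_t i;

    for (i = 0; i < sizeof(ignored) / sizeof(ignored[0]); i++)
        if (strcmp(fstype, ignored[i]) == 0)
            return 1;
    return 0;
}

/* Walk a mount table (for example "/proc/mounts") and call statfs() only
 * on entries that pass the filter. Returns the number of filesystems
 * queried, or -1 if the mount table cannot be opened. */
static int query_local_filesystems(const char *mtab)
{
    FILE *fp = setmntent(mtab, "r");
    struct mntent *mnt;
    struct statfs buf;
    int queried = 0;

    if (fp == NULL)
        return -1;

    while ((mnt = getmntent(fp)) != NULL) {
        /* Filter before the syscall, so an unreachable network mount
         * is skipped without ever being accessed. */
        if (is_ignored_fstype(mnt->mnt_type))
            continue;
        if (statfs(mnt->mnt_dir, &buf) != 0)
            continue;
        printf("%s: %llu of %llu blocks free\n", mnt->mnt_dir,
               (unsigned long long)buf.f_bfree,
               (unsigned long long)buf.f_blocks);
        queried++;
    }

    endmntent(fp);
    return queried;
}
```

Calling query_local_filesystems("/proc/mounts") from a small main()
should show under strace that no statfs call is made for the ignored
mount types, which is the behaviour I'd expect from the df plugin once
the ignore-list checks are moved up.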

-- System Information:
Debian Release: squeeze/sid
  APT prefers unstable
  APT policy: (500, 'unstable'), (500, 'testing')
Architecture: i386 (i686)

Kernel: Linux 2.6.31-rc3-486
Locale: LANG=en_US.UTF-8, LC_CTYPE=en_US.UTF-8 (charmap=UTF-8)
Shell: /bin/sh linked to /bin/bash

Versions of packages collectd depends on:
ii  cdebconf [debconf-2.0]        0.146      Debian Configuration Management Sy
ii  debconf [debconf-2.0]         1.5.27     Debian configuration management sy
ii  libc6                         2.9-25     GNU C Library: Shared libraries
ii  librrd4                       1.3.8-1    Time-series data storage and displ

Versions of packages collectd recommends:
ii  iptables                  1.4.4-2        administration tools for packet fi
ii  libatk1.0-0               1.26.0-1       The ATK accessibility toolkit
ii  libcairo2                 1.8.8-2        The Cairo 2D vector graphics libra
ii  libcurl3-gnutls           7.19.5-1       Multi-protocol file transfer libra
ii  libdbi0                   0.8.2-3        Database Independent Abstraction L
ii  libdbus-1-3               1.2.16-2       simple interprocess messaging syst
ii  libdbus-glib-1-2          0.82-1         simple interprocess messaging syst
ii  libesmtp5                 1.0.4-2        LibESMTP SMTP client library
ii  libfontconfig1            2.6.0-4        generic font configuration library
ii  libfreetype6              2.3.9-5        FreeType 2 font engine, shared lib
ii  libglib2.0-0              2.20.4-1       The GLib library of C routines
ii  libgtk2.0-0               2.16.5-1       The GTK+ graphical user interface 
ii  libhal1                   0.5.13-3       Hardware Abstraction Layer - share
ii  libmysqlclient15off       5.0.84-1       MySQL database client library
ii  libnotify1 [libnotify1-gt 0.4.5-1        sends desktop notifications to a n
ii  libopenipmi0              2.0.16-1       Intelligent Platform Management In
ii  liboping0                 1.3.2-1        C/C++ library to generate ICMP ECH
ii  libpango1.0-0             1.24.5-1       Layout and rendering of internatio
ii  libpcap0.8                1.0.0-2        system interface for user-level pa
ii  libperl5.10               5.10.0-25      Shared Perl library
ii  libpq5                    8.4.0-2+b1     PostgreSQL C client library
ii  libsensors3               1:2.10.8-1     library to read temperature/voltag
ii  libsnmp15                 5.4.1~dfsg-12  SNMP (Simple Network Management Pr
ii  libssl0.9.8               0.9.8k-4       SSL shared libraries
ii  libupsclient1             2.4.1-3        network UPS tools - client library
ii  libvirt0                  0.6.5-3        library for interfacing with diffe
ii  libxml2                   2.7.3.dfsg-2.1 GNOME XML library
ii  lm-sensors                1:3.1.1-3      utilities to read temperature/volt
ii  perl                      5.10.0-25      Larry Wall's Practical Extraction 
ii  rrdtool                   1.3.8-1        Time-series data storage and displ

Versions of packages collectd suggests:
ii  apache2-mpm-worker [httpd-cg 2.2.12-1    Apache HTTP Server - high speed th
pn  collectd-dev                 <none>      (no description available)
pn  hddtemp                      <none>      (no description available)
ii  libconfig-general-perl       2.42-1      Generic Configuration Module
ii  libhtml-parser-perl          3.62-1      collection of modules that parse H
ii  libregexp-common-perl        2.122-1     Provide commonly requested regular
ii  librrds-perl                 1.3.8-1     Time-series data storage and displ
ii  liburi-perl                  1.37+dfsg-1 Manipulates and accesses URI strin
pn  mbmon                        <none>      (no description available)

-- debconf information excluded

-- 
see shy jo
