Package: smstools
Severity: important
Version: 3.1.14-1
Tags: patch

On a monitoring system with otherwise heavy I/O load (due to a large
number of RRDs being updated on a regular basis), it was noticed that
sending of a generated SMS was delayed for hours.

Attaching gdb and strace to the stalled daemon revealed that it was
stuck in a sync(2) call.

Investigation of the source code showed that smsd makes frequent use of
lock files as part of operations that involve reading from spool files
and moving files around in its spool directories. After creating lock
files, sync(2) is called, which causes the kernel to write out buffered
data and metadata modifications *for all filesystems*. This can have
significant negative effects on overall filesystem performance.

Lock files have no use after a system reboot and there seems to be no
other part of smstools that is interested in the lock files' contents,
therefore it is unclear why a "sync" operation is needed at all. Even if
it were important to preserve the lock file contents across system
crashes, fsync(2) or fdatasync(2) would be the right calls to use.
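
To illustrate the difference, here is a minimal sketch of a lock file
helper that flushes only the lock file itself. It is not the lockfile()
code from src/locking.c; the function name create_lock_durable and the
choice to store the PID in the file are assumptions made for the example.

```c
#include <fcntl.h>
#include <stdio.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, not the smstools implementation.
 * Creates `path` exclusively, writes the caller's PID into it and
 * flushes only this one file with fsync(2) instead of sync(2).
 * Returns 0 on success, -1 if the lock exists or an error occurred. */
int create_lock_durable(const char *path)
{
    char buf[32];
    int fd, len;

    /* O_EXCL makes creation fail if the lock file already exists. */
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;

    len = snprintf(buf, sizeof buf, "%ld\n", (long)getpid());

    /* fsync() only waits for this file, not for every filesystem. */
    if (write(fd, buf, (size_t)len) != len || fsync(fd) != 0) {
        close(fd);
        unlink(path);
        return -1;
    }
    return close(fd);
}
```

Note that fsync(2) on the file does not by itself guarantee that the
directory entry has reached the disk; that would need an additional
fsync(2) on the containing directory. For lock files that are worthless
after a reboot, neither call should be necessary.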

I suggest simply removing the sync() call from lockfile() in
src/locking.c (that's why I set the "patch" tag.)

As a quick and easy workaround, we have overridden the sync(2) call
on the affected system by running smsd with the eatmydata shared library
preloaded. This has solved our latency issue.
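
For readers unfamiliar with the mechanism, the following is a rough
sketch of how a preloaded shared library can turn sync(2) into a no-op.
It is not the eatmydata source, which also intercepts other calls such
as fsync(2). The file name nosync.c, the library name and the counter
with its accessor nosync_skipped are made up for this example.

```c
/* nosync.c (hypothetical). Build as a shared library, for example:
 *   cc -shared -fPIC -o libnosync.so nosync.c
 */
#include <unistd.h>

static unsigned long skipped;   /* number of sync() calls swallowed */

/* Same signature as the declaration in <unistd.h>. When this library
 * is preloaded, the dynamic linker resolves sync() to this definition
 * instead of the one in libc, so no global flush is triggered. */
void sync(void)
{
    skipped++;
}

/* Hypothetical accessor, only here so the effect can be observed. */
unsigned long nosync_skipped(void)
{
    return skipped;
}
```

Assuming the library was built as above, a dynamically linked program
started with LD_PRELOAD=./libnosync.so in its environment would no
longer block in sync(2).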

Cheers,
-Hilko

