Package: lvm2
Version: 2.03.11-2.1

In my experience, lvm2 has at least one bug where it sometimes gets
the whole lvm system into a stuck state: lvchange, lvdisplay, etc.,
all hang.

Usually, I see this on my personal laptop when using schroot with lvm
snapshot chroots.  There, the problem seems to be correlated with
pressing ^C on autopkgtest or sbuild commands.  This problem has been
occurring for me for many many years.  I assumed there was something
wrong with my install or my usage patterns.

I am now reporting this because a very stuck state arose on my colo
machine, chiark, which has a very different setup.  Once again I'm not
sure what the trigger is.  But this lvm hang problem has coincided
with other problems, including at least exhaustion of system
resources leading to fork failing with EAGAIN.  There are also some
suggestions that the system might have exceeded its fd limit
(#1136218) but I have no direct evidence of that.

stracing lvdisplay showed this:

openat(AT_FDCWD, "/run/lock/lvm/V_vg-main", O_RDWR|O_CREAT|O_APPEND, 0777) = 4
flock(4, LOCK_SH [hangs]
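
For future recurrences, a non-blocking probe of the same lock file
would show whether the lock is held without the probe itself hanging.
The following is only a sketch: the function name lock_is_held is
made up for illustration, and it assumes the lock is a plain flock(2)
lock, as the strace output above suggests.  It briefly takes a shared
lock when the file is free, so it could delay another lvm command for
a moment, and it probably needs root to open files under /run/lock/lvm.

```python
import fcntl
import os


def lock_is_held(path):
    """Return True if a shared flock on `path` would block right now.

    Hypothetical helper: uses LOCK_NB so that, unlike the hung
    lvdisplay, it returns immediately instead of waiting.
    """
    fd = os.open(path, os.O_RDONLY)
    try:
        try:
            fcntl.flock(fd, fcntl.LOCK_SH | fcntl.LOCK_NB)
        except BlockingIOError:
            # Another open file description holds a conflicting
            # (exclusive) lock on this file.
            return True
        # We got the shared lock, so nobody holds it exclusively;
        # release it straight away.
        fcntl.flock(fd, fcntl.LOCK_UN)
        return False
    finally:
        os.close(fd)
```

For the case above this would be called as
lock_is_held("/run/lock/lvm/V_vg-main"), using the path from the
strace output.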

I used fuser on that file and found an lvcreate process from 5 days
ago.  strace on that showed it hanging in semtimedop.  (I never use
the sem* functions in my own programs because I wasn't able to figure
out a way to make usage of them leak- and race-free.)  I sent it a
SIGTERM which it ignored.  So I sent it a SIGKILL.

strace: Process 6243 attached
semtimedop(1278004, [{0, 0, 0}], 1, NULL) = ?
+++ killed by SIGKILL +++
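
The fuser step could be scripted so that the holder of the lock file
is recorded next time this happens.  What follows is a rough sketch of
the same idea, not what fuser actually does internally: it scans
/proc/<pid>/fd on Linux for descriptors pointing at the file.  The
function name pids_with_file_open is invented for this example, and
without root it will only see processes it is permitted to inspect.

```python
import os


def pids_with_file_open(path):
    """Return sorted PIDs that have `path` open, found via /proc/<pid>/fd."""
    target = os.path.realpath(path)
    found = set()
    for entry in os.listdir("/proc"):
        if not entry.isdigit():
            continue  # not a process directory
        fd_dir = os.path.join("/proc", entry, "fd")
        try:
            fds = os.listdir(fd_dir)
        except OSError:
            continue  # process exited, or we may not inspect it
        for fd in fds:
            try:
                if os.readlink(os.path.join(fd_dir, fd)) == target:
                    found.add(int(entry))
                    break
            except OSError:
                continue  # descriptor was closed while we were looking
    return sorted(found)
```

Running pids_with_file_open("/run/lock/lvm/V_vg-main") as root while
the hang is in progress should list the process holding the lock file
open, which can then be examined with strace as above.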

Weirdly, after that, I found that the LV that was supposed to be being
created existed, and was in use!  It's part of a backup system and
that could only happen if the backup system thought lvcreate had
succeeded.

Anyway, after that, I was able to kill the leftover processes, unmount
the snapshot, and remove the LV, and everything seems normal now.

I am filing this bug mostly to have somewhere to report recurrences,
and as an opportunity for you to suggest information-gathering steps
that might be useful to diagnose the problem.

I'm reporting the bug against bullseye's lvm2, but my laptop is
running trixie and also suffers from lockups.

Ian.

-- 
Ian Jackson <[email protected]>   These opinions are my own.  

Pronouns: they/he.  If I emailed you from @fyvzl.net or @evade.org.uk,
that is a private address which bypasses my fierce spamfilter.
