Control: clone -1 -2
Control: reassign -2 passwd
Control: retitle -2 groupdel quite slow with many users in /etc/passwd
thanks

Dear Shadow Maintainers,

this is a decade-old bug filed against adduser. It says that the check
done on group deletion, which finds out whether any user still has the
to-be-deleted group as primary group, takes quite a long time if the
system has many users.

While investigating this, Matt Barry from the adduser team found out
that groupdel in fact suffers from the same issue, and that even
removing the offending code from delgroup (relying entirely on the
identical check that groupdel does) speeds up group deletion by a mere
10-15%.

I am cloning the bug and reassigning the clone to passwd to give you
the possibility of investigating this in groupdel and possibly
implementing a more efficient way of doing this check.

Greetings
Marc

On Thu, Jul 19, 2012 at 04:24:21PM -0400, Daniel Papasian wrote:
> From: Daniel Papasian <dan...@google.com>
> Subject: Bug#682156: delgroup I/O requirements are O(n^2) with regards to
>  number of configured users
> To: sub...@bugs.debian.org
> Reply-To: Daniel Papasian <dan...@google.com>, 682...@bugs.debian.org
> Date: Thu, 19 Jul 2012 16:24:21 -0400
>
> Package: adduser
> Version: 3.112+nmu2
>
> delgroup is a wrapper to groupdel which performs additional
> validations. It checks to see whether any other user on the system
> has, as its primary group, the group that it is potentially deleting.
>
> It does so with the following code:
>
>     setpwent;
>     while ((my $acctname,my $primgrp) = (getpwent)[0,3]) {
>         if( $primgrp eq $gr_gid ) {
>             fail (7, gtx("`%s' still has `%s' as their primary group!\n"),$acctname,$group);
>         }
>     }
>     endpwent;
>
> Perl's implementation of getpwent will call getspnam() for each user
> to get the shadow password. On a default system (using /etc/passwd
> and /etc/shadow) this means, for each line of /etc/passwd,
> perl will open /etc/shadow, scan it until it finds the matching user,
> and close the file.
> Given adding users adds lines to /etc/passwd and
> /etc/shadow, this means the overall I/O complexity for deleting
> a group is O(n^2). On systems with ~100k users this quickly ends up
> being hundreds of gigabytes that needs to be read and processed in
> order to remove a group; given how often delgroup gets
> called from postrm scripts this can make a lot of operations rather expensive.
>
> groupdel performs the same check from C, using getpwent() without
> calling the getspnam(), so safety-wise this check is not needed. The
> only thing we gain from it is the opportunity to detect
> the error before printing "Removing group ..." and calling groupdel.
> groupdel has a return value specifically for this case (it will return
> 8) in the event we wanted to make any behavior conditional
> on this case.
>
> The simplest fix is to simply remove the offending lines of perl
> entirely. This will result in a slightly different output being
> printed when attempting to remove a group in use, but will otherwise
> behave the same, as so:
>
> With current delgroup:
>
> # delgroup root
> /usr/sbin/delgroup: `root' still has `root' as their primary group!
>
> If offending code were simply removed:
>
> # delgroup root
> Removing group `root' ...
> groupdel: cannot remove the primary group of user 'root'
> /usr/sbin/delgroup: `/usr/sbin/groupdel root' returned error code 8. Exiting.
>
> The bug was introduced in this commit:
> http://anonscm.debian.org/viewvc/adduser/trunk/deluser?r1=233&r2=234&
>
> and it's entirely plausible (I haven't checked) that groupdel didn't
> have any check at all at this point in time. If people believe it's
> unacceptable to simply remove the check and rely on groupdel to fail,
> this also suggests an alternate approach to fixing this bug -- calling
> out to grep via a subshell will be O(n) instead of O(n^2) and should
> work fine on systems with much larger numbers of users.
>
> Daniel
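
For illustration only, here is a minimal Python sketch of the kind of
single-pass, O(n) check the report asks for: it reads passwd-format
lines once and never consults /etc/shadow. The function name
users_with_primary_gid and the sample entries are hypothetical; this is
not how delgroup or groupdel is implemented.

```python
from typing import Iterable, List


def users_with_primary_gid(passwd_lines: Iterable[str], gid: int) -> List[str]:
    """Return login names whose primary GID equals `gid`.

    Each line is expected in passwd(5) format:
        name:password:UID:GID:GECOS:directory:shell
    Every line is looked at exactly once, so the cost is linear in the
    number of entries.
    """
    matches = []
    for line in passwd_lines:
        fields = line.strip().split(":")
        # Skip blank or malformed entries (fewer than four fields, or a
        # GID field that is not a plain number).
        if len(fields) < 4 or not fields[3].isdigit():
            continue
        if int(fields[3]) == gid:
            matches.append(fields[0])
    return matches


if __name__ == "__main__":
    # Hypothetical sample data, not taken from a real system.
    sample = [
        "root:x:0:0:root:/root:/bin/bash",
        "daemon:x:1:1:daemon:/usr/sbin:/usr/sbin/nologin",
        "alice:x:1000:100:Alice:/home/alice:/bin/bash",
    ]
    print(users_with_primary_gid(sample, 0))
```

Passing an open file object for /etc/passwd instead of the sample list
would check the local file, but note that reading the file directly
only sees local entries and bypasses other NSS sources such as LDAP.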