On 7/15/13, Philip Guenther <guent...@gmail.com> wrote: > On Sun, Jul 14, 2013 at 12:54 AM, Rafael Neves <rafaelne...@gmail.com> > wrote: >> The patch below fixes a bug on sysctl(8) introduced by revision 1.191 >> of sysctl.c. After rev1.191, `sysctl vfs' mangles information about >> filesystems (mounted instances of ffs are attributed to nfs, of nfs >> are atrributed to mfs, and so on). As a consequence, `sysctl >> vfs.mounts.nfs' reports 0 mounted instances on a diskless(8) setup, >> thus /etc/rc script (lines 335 to 342) doesn't add pf rules that allow >> NFS, and system hangs when it enables pf. > > First off: thank you for (a) noticing this, and (b) tracking down the > mismatch. > > ... >> --- sysctl.c 9 Jun 2013 12:54:38 -0000 1.192 >> +++ sysctl.c 14 Jul 2013 07:09:28 -0000 >> @@ -1175,8 +1175,8 @@ vfsinit(void) >> >> vfsname[0].ctl_name = "mounts"; >> vfsname[0].ctl_type = CTLTYPE_NODE; >> - vfsvars[0].list = vfsname + 1; >> - vfsvars[0].size = maxtypenum - 1; >> + vfsvars[0].list = vfsname; >> + vfsvars[0].size = maxtypenum; > > Soooooo close... > > While this fixes the observed problem, it's not 100% correct. The > glitch is that it fails a negative test: the command > sysctl vfs.mounts.mounts > should fail with the error > sysctl: third level name mounts in vfs.mounts.mounts is invalid >
Great catch! You're totally right. Thank you for point it out. > but with your patch it silently succeeds. The vfsname list is offset > by one in vfsvars[0].list to prevent that, so the fix that avoids the > unwanted match against vfsname[0] is to keep the offset, but undo it > in the lookup: Aaahhh, now I understand why that line of code and why the pointer calculation doesn't take a multiple of struct size or something like that. Many thanks, I never would figure it out alone. > > --- sysctl.c 9 Jun 2013 12:54:38 -0000 1.192 > +++ sysctl.c 15 Jul 2013 03:43:27 -0000 > @@ -1200,7 +1200,7 @@ sysctl_vfsgen(char *string, char **bufpp > > mib[1] = VFS_GENERIC; > mib[2] = VFS_CONF; > - mib[3] = indx; > + mib[3] = indx + 1; > size = sizeof vfc; > if (sysctl(mib, 4, &vfc, &size, (void *)0, (size_t)0) < 0) { > if (errno != EOPNOTSUPP) > > > That make sense? > Yes, it does. Actually, first I thought to change sysctl_vfsgen() to take account the new index scheme of vfsvars, but reading the other sections sysctl.c code I find that generally findname() result is used untouched (for example, as a input for sysctl(3)) in the code, and I thought that change might be much intrusive. I think that behavior is intentional, so all tricks needed to findname() results be meaningful has to be done when generating the list, and not when using the findname() returned value. If my hypothesis about code's design is correct the patch below might be more suitable. It brings your mib[3] fix back to vfsinit() (reverting rev1.191) and adds one more element on vfsvars and vfsname to account for the offset. The major drawback is that it allocates more memory than is actually needed and as I understand memory is very precious on some archs. Despite this drawback, it passes on the negative test, solves the filesystem accounting problem (even with FUSE enabled). I'm not sure about correctness of my hypothesis about code's design nor about that patch is better than simply adjust mib[3] entry, and I really appreciate opinions on both these issues. > > Philip Guenther > Many thanks for the review, and sorry for delay! My results with patch applied (custom kernel with FUSE enabled): mount(8) output: /dev/wd0a on / type ffs (local) /dev/wd0k on /home type ffs (local, nodev, nosuid) /dev/wd0d on /tmp type ffs (local, nodev, nosuid) /dev/wd0f on /usr type ffs (local, nodev) /dev/wd0g on /usr/X11R6 type ffs (local, nodev) /dev/wd0h on /usr/local type ffs (local, nodev) /dev/wd0j on /usr/obj type ffs (local, nodev, nosuid) /dev/wd0e on /var type ffs (local, nodev, nosuid) /dev/sd2i on /usr/src type ffs (local, nodev, nosuid, softdep) /dev/sd1a on /mnt type ext2fs (local) sysctl(8) output: vfs.mounts.ffs has 9 mounted instances vfs.mounts.ext2fs has 1 mounted instance vfs.ffs.doclusterread=1 vfs.ffs.doclusterwrite=1 vfs.ffs.doreallocblks=1 vfs.ffs.doasyncfree=1 vfs.ffs.max_softdeps=23704 vfs.ffs.sd_tickdelay=2 vfs.ffs.sd_worklist_push=0 vfs.ffs.sd_blk_limit_push=0 vfs.ffs.sd_ino_limit_push=0 vfs.ffs.sd_blk_limit_hit=0 vfs.ffs.sd_ino_limit_hit=0 vfs.ffs.sd_sync_limit_hit=0 vfs.ffs.sd_indir_blk_ptrs=0 vfs.ffs.sd_inode_bitmap=5 vfs.ffs.sd_direct_blk_ptrs=4 vfs.ffs.sd_dir_entry=2 vfs.ffs.dirhash_dirsize=2560 vfs.ffs.dirhash_maxmem=2097152 vfs.ffs.dirhash_mem=50559 vfs.nfs.iothreads=-1 vfs.fuse.fusefs_open_devices=0 vfs.fuse.fusefs_fbufs_in=0 vfs.fuse.fusefs_fbufs_wait=0 vfs.fuse.fusefs_pool_pages=0 Patch: Index: sysctl.c =================================================================== RCS file: /cvs/src/sbin/sysctl/sysctl.c,v retrieving revision 1.192 diff -u -p -r1.192 sysctl.c --- sysctl.c 9 Jun 2013 12:54:38 -0000 1.192 +++ sysctl.c 18 Jul 2013 03:47:31 -0000 @@ -1127,7 +1127,7 @@ vfsinit(void) buflen = 4; if (sysctl(mib, 3, &maxtypenum, &buflen, (void *)0, (size_t)0) < 0) return; - maxtypenum++; /* + generic (0) */ + maxtypenum += 2; /* generic (0) + account for vfsvars[0].list offset */ if ((vfs_typenums = calloc(maxtypenum, sizeof(int))) == NULL) return; if ((vfsvars = calloc(maxtypenum, sizeof(*vfsvars))) == NULL) { @@ -1142,7 +1142,7 @@ vfsinit(void) mib[2] = VFS_CONF; buflen = sizeof vfc; for (loc = lastused, cnt = 1; cnt < maxtypenum; cnt++) { - mib[3] = cnt; + mib[3] = cnt - 1; if (sysctl(mib, 4, &vfc, &buflen, (void *)0, (size_t)0) < 0) { if (errno == EOPNOTSUPP) continue;