On 7/15/13, Philip Guenther <guent...@gmail.com> wrote:
> On Sun, Jul 14, 2013 at 12:54 AM, Rafael Neves <rafaelne...@gmail.com>
> wrote:
>> The patch below fixes a bug on sysctl(8) introduced by revision 1.191
>> of sysctl.c. After rev1.191, `sysctl vfs' mangles information about
>> filesystems (mounted instances of ffs are attributed to nfs, of nfs
>> are atrributed to mfs, and so on). As a consequence, `sysctl
>> vfs.mounts.nfs' reports 0 mounted instances on a diskless(8) setup,
>> thus /etc/rc script (lines 335 to 342) doesn't add pf rules that allow
>> NFS, and system hangs when it enables pf.
>
> First off: thank you for (a) noticing this, and (b) tracking down the
> mismatch.
>
> ...
>> --- sysctl.c    9 Jun 2013 12:54:38 -0000       1.192
>> +++ sysctl.c    14 Jul 2013 07:09:28 -0000
>> @@ -1175,8 +1175,8 @@ vfsinit(void)
>>
>>         vfsname[0].ctl_name = "mounts";
>>         vfsname[0].ctl_type = CTLTYPE_NODE;
>> -       vfsvars[0].list = vfsname + 1;
>> -       vfsvars[0].size = maxtypenum - 1;
>> +       vfsvars[0].list = vfsname;
>> +       vfsvars[0].size = maxtypenum;
>
> Soooooo close...
>
> While this fixes the observed problem, it's not 100% correct.  The
> glitch is that it fails a negative test: the command
>      sysctl vfs.mounts.mounts
> should fail with the error
>      sysctl: third level name mounts in vfs.mounts.mounts is invalid
>

Great catch! You're totally right. Thank you for point it out.

> but with your patch it silently succeeds.  The vfsname list is offset
> by one in vfsvars[0].list to prevent that, so the fix that avoids the
> unwanted match against vfsname[0] is to keep the offset, but undo it
> in the lookup:

Aaahhh, now I understand why that line of code and why the pointer
calculation doesn't take a multiple of struct size or something like
that. Many thanks, I never would figure it out alone.

>
> --- sysctl.c    9 Jun 2013 12:54:38 -0000       1.192
> +++ sysctl.c    15 Jul 2013 03:43:27 -0000
> @@ -1200,7 +1200,7 @@ sysctl_vfsgen(char *string, char **bufpp
>
>         mib[1] = VFS_GENERIC;
>         mib[2] = VFS_CONF;
> -       mib[3] = indx;
> +       mib[3] = indx + 1;
>         size = sizeof vfc;
>         if (sysctl(mib, 4, &vfc, &size, (void *)0, (size_t)0) < 0) {
>                 if (errno != EOPNOTSUPP)
>
>
> That make sense?
>

Yes, it does. Actually, first I thought to change sysctl_vfsgen() to
take account the new index scheme of vfsvars, but reading the other
sections sysctl.c code I find that generally findname() result is used
untouched (for example, as a input for sysctl(3)) in the code, and I
thought that change might be much intrusive. I think that behavior is
intentional, so all tricks needed to findname() results be meaningful
has to be done when generating the list, and not when using the
findname() returned value.

If my hypothesis about code's design is correct the patch below might
be more suitable. It brings your mib[3] fix back to vfsinit()
(reverting rev1.191) and adds one more element on vfsvars and vfsname
to account for the offset. The major drawback is that it allocates
more memory than is actually needed and as I understand memory is very
precious on some archs. Despite this drawback, it passes on the
negative test, solves the filesystem accounting problem (even with
FUSE enabled).


I'm not sure about correctness of my hypothesis about code's design
nor about that patch is better than simply adjust mib[3] entry, and I
really appreciate opinions on both these issues.

>
> Philip Guenther
>

Many thanks for the review, and sorry for delay!


My results with patch applied (custom kernel with FUSE enabled):
mount(8) output:
        /dev/wd0a on / type ffs (local)
        /dev/wd0k on /home type ffs (local, nodev, nosuid)
        /dev/wd0d on /tmp type ffs (local, nodev, nosuid)
        /dev/wd0f on /usr type ffs (local, nodev)
        /dev/wd0g on /usr/X11R6 type ffs (local, nodev)
        /dev/wd0h on /usr/local type ffs (local, nodev)
        /dev/wd0j on /usr/obj type ffs (local, nodev, nosuid)
        /dev/wd0e on /var type ffs (local, nodev, nosuid)
        /dev/sd2i on /usr/src type ffs (local, nodev, nosuid, softdep)
        /dev/sd1a on /mnt type ext2fs (local)

sysctl(8) output:
        vfs.mounts.ffs has 9 mounted instances
        vfs.mounts.ext2fs has 1 mounted instance
        vfs.ffs.doclusterread=1
        vfs.ffs.doclusterwrite=1
        vfs.ffs.doreallocblks=1
        vfs.ffs.doasyncfree=1
        vfs.ffs.max_softdeps=23704
        vfs.ffs.sd_tickdelay=2
        vfs.ffs.sd_worklist_push=0
        vfs.ffs.sd_blk_limit_push=0
        vfs.ffs.sd_ino_limit_push=0
        vfs.ffs.sd_blk_limit_hit=0
        vfs.ffs.sd_ino_limit_hit=0
        vfs.ffs.sd_sync_limit_hit=0
        vfs.ffs.sd_indir_blk_ptrs=0
        vfs.ffs.sd_inode_bitmap=5
        vfs.ffs.sd_direct_blk_ptrs=4
        vfs.ffs.sd_dir_entry=2
        vfs.ffs.dirhash_dirsize=2560
        vfs.ffs.dirhash_maxmem=2097152
        vfs.ffs.dirhash_mem=50559
        vfs.nfs.iothreads=-1
        vfs.fuse.fusefs_open_devices=0
        vfs.fuse.fusefs_fbufs_in=0
        vfs.fuse.fusefs_fbufs_wait=0
        vfs.fuse.fusefs_pool_pages=0

Patch:
Index: sysctl.c
===================================================================
RCS file: /cvs/src/sbin/sysctl/sysctl.c,v
retrieving revision 1.192
diff -u -p -r1.192 sysctl.c
--- sysctl.c    9 Jun 2013 12:54:38 -0000       1.192
+++ sysctl.c    18 Jul 2013 03:47:31 -0000
@@ -1127,7 +1127,7 @@ vfsinit(void)
        buflen = 4;
        if (sysctl(mib, 3, &maxtypenum, &buflen, (void *)0, (size_t)0) < 0)
                return;
-       maxtypenum++;   /* + generic (0) */
+       maxtypenum += 2;        /* generic (0) + account for vfsvars[0].list 
offset */
        if ((vfs_typenums = calloc(maxtypenum, sizeof(int))) == NULL)
                return;
        if ((vfsvars = calloc(maxtypenum, sizeof(*vfsvars))) == NULL) {
@@ -1142,7 +1142,7 @@ vfsinit(void)
        mib[2] = VFS_CONF;
        buflen = sizeof vfc;
        for (loc = lastused, cnt = 1; cnt < maxtypenum; cnt++) {
-               mib[3] = cnt;
+               mib[3] = cnt - 1;
                if (sysctl(mib, 4, &vfc, &buflen, (void *)0, (size_t)0) < 0) {
                        if (errno == EOPNOTSUPP)
                                continue;

Reply via email to