Answers to this stuff should really come from Thomas. I am just speaking
directly out of my ass here, without even the source on hand to read.
> np->allocsize is the number of bytes allocated for the file, right? (rounded
> up to a block multiple).
I would not put it exactly that way, but yes. allocsize is the number of
bytes allocated on the disk for the file to consume some or all of. It is
not a "rounded size", but rather a size that indicates how many bytes are
necessarily consumed due to the filesystem format's allocation granularity.
For example, in ext2fs the filesystem blocksize (commonly 1k or 4k) is the
granularity of allocation; so on a 1k filesystem, a file that is 1025 bytes
long will have an allocsize of 2048. Another example is ffs (ufs),
which has smaller "fragments" (e.g. 8k blocksize and 1k frag size is
common); depending on the allocation choices made, a file whose size is
one byte over blocksize may have either a whole block less one byte or
only a frag less one byte allocated beyond its size.
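
To make the arithmetic concrete, here is a rough sketch. The helper name
toy_round_up is made up for illustration and is not anything from
libdiskfs, ext2fs, or ufs; it only shows how allocation granularity
turns a file size into an allocsize.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: smallest multiple of `unit' that is >= `size'.
   `unit' stands for the allocation granularity (block or frag size).  */
uint64_t
toy_round_up (uint64_t size, uint64_t unit)
{
  assert (unit > 0);
  return ((size + unit - 1) / unit) * unit;
}
```

With a 1k granularity, toy_round_up (1025, 1024) gives 2048. For the
ffs case with an 8193-byte file, the tail is rounded either to a frag
(toy_round_up (8193, 1024), i.e. 9216) or to a full block
(toy_round_up (8193, 8192), i.e. 16384), depending on what the
allocator chose.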
> Is it okay for np->allocsize to be initially as big as needed to contain the
> file, rather than the actual disk space allocated for the file? AFAICS, it
> is sufficient if diskfs_grow does the right thing in this case.
I believe the point of allocsize is to indicate how many bytes are
available without doing diskfs_grow. So if nodes initially appear to have
less space than they in fact do, then diskfs_grow will just do a "soft"
allocation that doesn't actually have to diddle any allocation metadata on
the disk. Seems fine.
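
Roughly what I have in mind, as a toy model only. struct toy_node,
toy_grow and the metadata_writes counter are all invented for this
sketch; the real diskfs_grow has a different signature and I have not
checked it against the source.

```c
#include <assert.h>
#include <stdint.h>

/* Toy stand-in for a node; not the real libdiskfs struct node.  */
struct toy_node
{
  uint64_t allocsize;        /* bytes usable without calling grow */
  uint64_t disk_allocated;   /* bytes really allocated on disk */
  unsigned metadata_writes;  /* counts "hard" allocations, for illustration */
};

/* Make at least `end' bytes available, in multiples of `unit'.  */
int
toy_grow (struct toy_node *np, uint64_t end, uint64_t unit)
{
  assert (unit > 0);
  if (end <= np->allocsize)
    return 0;                       /* nothing to do */

  uint64_t want = ((end + unit - 1) / unit) * unit;
  if (want > np->disk_allocated)
    {
      /* "Hard" allocation: on-disk metadata would have to change.  */
      np->disk_allocated = want;
      np->metadata_writes++;
    }
  /* Otherwise this is the "soft" case: the space was already there,
     so only the in-core allocsize is raised.  */
  np->allocsize = want;
  return 0;
}
```

A node that starts with allocsize 1024 but 4096 bytes really on disk
can grow to 3000 bytes without any metadata write; only growing past
4096 counts as a real allocation.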
> FAT does not make it easy to determine the length of the cluster chain. You
> have to read it in completely. As this is slow for big files (O(n)), we
> should avoid it in stat, and only do it in diskfs_grow, when it actually
> matters.
I think this approach is fine, though I don't have source handy here to
check all the uses of allocsize.
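
For what it's worth, the lazy chain walk could look something like the
sketch below. Everything here is hypothetical: the in-memory table, the
TOY_EOC marker (real FAT end-of-chain values differ by FAT type), and
the names toy_fat_node and toy_chain_clusters. The point is only that
the O(n) walk happens on first demand, e.g. from grow, and is cached
afterwards, so stat never pays for it.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_EOC 0xFFFFFFFFu   /* made-up end-of-chain marker */

/* Toy per-file state; not a real fatfs structure.  */
struct toy_fat_node
{
  uint32_t first_cluster;   /* TOY_EOC for an empty file */
  int chain_known;          /* nonzero once the chain has been walked */
  size_t chain_clusters;    /* cached chain length, valid if chain_known */
};

/* Return the number of clusters in the file's chain, walking the
   table only the first time it is needed.  */
size_t
toy_chain_clusters (struct toy_fat_node *np,
                    const uint32_t *fat, size_t fat_entries)
{
  if (!np->chain_known)
    {
      size_t n = 0;
      uint32_t c = np->first_cluster;
      while (c != TOY_EOC)
        {
          assert (c < fat_entries);   /* entry must lie inside the table */
          assert (n < fat_entries);   /* guard against a cyclic chain */
          n++;
          c = fat[c];
        }
      np->chain_clusters = n;
      np->chain_known = 1;
    }
  return np->chain_clusters;
}
```

Multiplying the result by the cluster size would give the real
allocated size to compare against in grow.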