--- On Sun 11/12/11, Kostik Belousov <kostik...@gmail.com> wrote:

>
> If you wanted to get responses from experts only, sorry in
> advance.
>

I am no fs expert but just thought I'd mention some things based
on my playing with the BSD ext2fs ...

> The fs (AKA UFS) uses clustering provided by the block cache.
> The clustering code, mainly located in kern/vfs_cluster.c,
> coalesces sequences of reads or writes that target consecutive
> blocks into a single physical read or write of at most MAXPHYS
> bytes. The current definition of MAXPHYS is 128KB.
>

The clustering code is really cool, and the idea is that it gives
UFS the advantages of an extent-based fs. I haven't seen benchmarks
for UFS2, but on ext2 it didn't seem to work as well as it should.
One issue is that ext2 doesn't support fragments, and as a
consequence ext2 will not use big block sizes. This is a limitation
in the ext2 design that UFS doesn't have, but Linux's ext2fs still
outperforms UFS in async mode (we do shine in sync mode). It was
never clear exactly why this happens, but it would appear there is
a bottleneck in geom when writing many contiguous blocks.

> Clustering allows the filesystem to improve the layout of the
> files by calling VOP_REALLOCBLKS() to redo the allocation, so
> that the blocks being written become sequential if they are not.
>
> Even if a file is not laid out ideally, or the i/o pattern is
> random, most scheduled writes are asynchronous, and for reads the
> system tries to schedule read-aheads for some limited number of
> blocks. This allows the lower layers, i.e. geom and disk drivers,
> to optimize the i/o queue and coalesce requests that are
> consecutive on disk, but not on the queue.
>
> BTW, some time ago I was interested in the effect of a
> semi-abandoned patch on UFS fragmentation, since it could make
> the fragmentation worse. I wrote a tool that calculated the
> percentage of non-consecutive spots in the whole filesystem.
> Apparently, even under the hard load consisting of writing a lot
> of files under the megabytes in size, UFS managed to keep the
> number of spots under 2-3% on a sufficiently free volume.
>

Yes, the realloc_blk code is very efficient at that. In fact it is
so good that it actually hides some inefficient operations in UFS.
Bruce had a patch for this that I cc'd to Kirk, but the difference
was not big because the realloc_blk code does its job in memory.

Zheng Liu did the reallocation thing for ext2fs and it gave better
results than preallocation, but the results are not as spectacular
as in UFS (the UFS code takes advantage of fragments there too).
I do expect to commit it (kern/159233) once my mentor reviews and
approves it.

cheers,

Pedro.
_______________________________________________
freebsd-current@freebsd.org mailing list
http://lists.freebsd.org/mailman/listinfo/freebsd-current
To unsubscribe, send any mail to "freebsd-current-unsubscr...@freebsd.org"
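
For readers who want something concrete, here is a rough toy model of the two ideas discussed in this thread: coalescing physically adjacent blocks into one transfer capped at MAXPHYS, and measuring how many block-to-block transitions in a layout are not sequential. It is only a sketch. It is not the code in kern/vfs_cluster.c and not the fragmentation tool mentioned above; the function names, the 16KB block size, and the exact definition of a "non-consecutive spot" are all assumptions made for illustration.

```python
# Toy model only: names, block size and the "spot" definition are assumed.

MAXPHYS = 128 * 1024    # bytes; the value quoted in the thread
BLOCK_SIZE = 16 * 1024  # bytes; assumed block size for this example


def coalesce(blocks, block_size=BLOCK_SIZE, maxphys=MAXPHYS):
    """Group a queue of physical block numbers into (start, count) runs.

    A run grows only while the next block is physically adjacent to the
    end of the run and the run still fits within maxphys bytes.
    """
    max_blocks = max(1, maxphys // block_size)
    runs = []
    for blk in blocks:
        if runs:
            start, count = runs[-1]
            if blk == start + count and count < max_blocks:
                runs[-1] = (start, count + 1)
                continue
        runs.append((blk, 1))
    return runs


def noncontiguous_percent(blocks):
    """Percentage of block-to-block transitions that are not sequential."""
    if len(blocks) < 2:
        return 0.0
    breaks = sum(1 for a, b in zip(blocks, blocks[1:]) if b != a + 1)
    return 100.0 * breaks / (len(blocks) - 1)


if __name__ == "__main__":
    # Hypothetical on-disk layout of one file, in logical block order.
    layout = [100, 101, 102, 103, 200, 201, 50]
    print(coalesce(layout))
    print(noncontiguous_percent(layout))
```

With the assumed 16KB blocks a single run can hold at most 8 blocks, so a longer contiguous stretch is split into several transfers. A layout that the reallocation code has made sequential would give a percentage near zero under this definition.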