On Tue, 2004-04-13 at 22:51, Paul Eggert wrote:
> Jim Meyering <[EMAIL PROTECTED]> writes:
>
> >>> > http://oss.oracle.com/projects/ocfs/dist/files/source/RHAT/RHAS3/coreutils-4.5.3-33.src.rpm
>
> I briefly looked at the following patches in that RPM:
>
>   coreutils-4.5.3-O_DIRECT-NFS.patch
>   coreutils-4.5.3-O_DIRECT-dd.patch
>   coreutils-4.5.3-O_DIRECT-valloc.patch
>   coreutils-4.5.3-o_direct-copy-valloc.patch
>   coreutils-4.5.3-o_direct.patch
>
> and I found the following differences between those ideas and what's
> in coreutils CVS right now:
>
> * Coreutils dd simply aligns the I/O buffers to getpagesize()
>   boundaries, 4.5.3-33 has a complicated alignment strategy that I
>   don't fully follow, but which seems to do the same thing.
>   (There may be some differences if I/O errors occur; is that
>   the point?)
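
For reference, aligning a buffer to a getpagesize() boundary comes down
to rounding an address up to the next multiple of the page size. A
minimal sketch in Python, for illustration only (it is not the C code in
coreutils CVS or in the 4.5.3-33 patches; the names align_up and PAGE
are made up for this example):

```python
import mmap

# Page size of the running system (what getpagesize() returns in C).
PAGE = mmap.PAGESIZE


def align_up(address: int, alignment: int = PAGE) -> int:
    """Round address up to the next multiple of alignment.

    alignment must be a positive power of two, which holds for page sizes.
    """
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    # Adding (alignment - 1) and masking off the low bits rounds up;
    # an address that is already aligned is returned unchanged.
    return (address + alignment - 1) & ~(alignment - 1)
```

In C the usual approach is to over-allocate by one page and apply the
same rounding to the returned pointer, or to use an allocator that
already returns aligned memory.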

Partly. There's a sub-annoyance I was trying to handle in RHAT's AS2.1
kernels: a kernel limitation regarding reads and FS types that did or
did not support O_DIRECT (this all dates from when O_DIRECT in the
kernel was quite new). I was trying to make this a non-issue. It's fixed
in RHAT's AS3, but we're tied to supporting AS2.1 as long as RHAT does,
joy... I'm not sure when RHAT's AS2.1 support dies, but I'm hoping it's
soon; then all the cruft I put in can be stripped out.

> * 4.5.3-33 aligns buffers to page size boundaries in copy.c.
>   This looks to me like it's worth doing (independently of O_DIRECT),
>   so I'll propose a patch along those lines via separate email
>   to bug-coreutils.
>
> * cp, mv, and md5sum have --o_direct options. I'm not convinced that
>   md5sum needs this (why not all the other commands that read files,
>   too, while you're at it? cat, say?) but perhaps cp and mv should
>   have it (what are the application areas here?). Also, option names
>   should not have underscores, so I'd suggest --direct (or perhaps
>   --direct-io) as a better name for this sort of option.

The reasoning behind md5sum was... the boss wanted it. Actually, it's
because in automated scripting, if you write a large file using
O_DIRECT, it may not be fully committed to disk after the write process
has finished. The upshot is that you will not be able to access the file
until it's fully committed, and md5sum could suddenly encounter a
'permission denied' condition. With O_DIRECT, the access is channelled
through the O_DIRECT mechanisms in the kernel and can read the data even
though it has not hit the disk, avoiding this 'permission denied' issue.

I didn't convert more than tar/mv/cp/dd/md5sum because I'm pressed for
time on other projects.

As for why cp/mv should have these options,
I'll direct your attention to the following chart:

  http://oss.oracle.com/~bryce/cp.gif

When you're talking about 4 TB databases, the savings in time and coffee
are substantial (8 hours vs 2.5 hours for a backup copy).

The --o_direct name was, well, I was stuck for another name to call it
at the time, and it kinda stuck. It's no biggie for it to be tossed on
the fire.

> * The dd options are spelled differently, e.g.:
>
>     dd ibs=512 obs=1024 iflags=direct oflags=direct   (coreutils CVS)
>     dd --o_direct=512,1024                            (4.5.3-33)
>
>   Here I prefer the coreutils CVS version as it's a bit more orthogonal.

Aye, again there is method in my madness (maybe). There are situations
where you can be asked to read from a non-O_DIRECT-capable FS/stream to
an O_DIRECT FS, or vice versa, e.g.

  dd --o_direct=8192,0 if=some_O_DIRECT_file | gzip -f > backup_file.gz

I was trying to sort that out by limiting the o_direct method to a long
option which modified the behavior:

  * --o_direct by itself would assume both sides were O_DIRECT capable
    and attempt to automatically determine the correct block size.
  * It could also be passed two arguments, the read and write block
    sizes, where a 0 would denote that the read or the write was to be
    done via non-O_DIRECT methods.
  * A -1 would try to determine what the O_DIRECT block size was.

The reasoning behind this was for systems where two O_DIRECT FSs were
in use that had DIFFERING block sizes. This created the need to be able
to, say, copy a 128K chunk over to an FS in 32K chunks, hence all that
fun code I wrote in full_write.c/safe_read.c, e.g.

  cp --o_direct=0,8192 /normal_fs/file /o_direct_fs/file

By comparison, my convolution of embedding the status in the arguments
is expressed by your iflags/oflags options, which is better. To be
honest, I was simply trying too hard to have the functionality embedded
in the one option.

OK, next step: what would you like from me?
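
To make the argument convention concrete (bare --o_direct, or two
comma-separated values where 0 means ordinary buffered I/O and -1 means
auto-detect), here is a minimal sketch in Python. It is an illustration
only, not the C code from the 4.5.3-33 patches: the names
parse_o_direct, Side and split_for_write are made up for this example,
and rejecting negative values other than -1 is an assumption.

```python
from typing import List, NamedTuple, Optional, Tuple

BUFFERED = "buffered"          # value 0: do not use O_DIRECT on this side
DIRECT_AUTO = "direct-auto"    # value -1 or bare option: detect block size
DIRECT_FIXED = "direct-fixed"  # positive value: use this block size


class Side(NamedTuple):
    mode: str
    block_size: Optional[int]  # only set for DIRECT_FIXED


def _parse_side(text: str) -> Side:
    value = int(text)
    if value == 0:
        return Side(BUFFERED, None)
    if value == -1:
        return Side(DIRECT_AUTO, None)
    if value > 0:
        return Side(DIRECT_FIXED, value)
    # Assumption: anything below -1 has no defined meaning.
    raise ValueError(f"invalid block size: {text!r}")


def parse_o_direct(arg: Optional[str]) -> Tuple[Side, Side]:
    """Return (read side, write side) for an --o_direct[=R,W] argument.

    arg is None when the option was given without a value.
    """
    if arg is None:
        auto = Side(DIRECT_AUTO, None)
        return auto, auto
    parts = arg.split(",")
    if len(parts) != 2:
        raise ValueError("expected READ,WRITE block sizes")
    return _parse_side(parts[0]), _parse_side(parts[1])


def split_for_write(chunk: bytes, write_size: int) -> List[bytes]:
    """Cut one chunk that was read into pieces of at most write_size bytes."""
    if write_size <= 0:
        raise ValueError("write_size must be positive")
    return [chunk[i:i + write_size] for i in range(0, len(chunk), write_size)]
```

With this convention, "8192,0" describes the dd-into-gzip case above
(O_DIRECT reads in 8192-byte blocks, ordinary writes), and a 128K chunk
passed to split_for_write with a 32K write size yields four pieces.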

Phil
