Well, thanks, that was pretty educational. For the benefit of other curious readers, I'm replying to your points with various examples, which you probably don't need.

On Mon, 13 Aug 2007 00:02:59 -0400
Joey Hess <[EMAIL PROTECTED]> wrote:

> A. Costa wrote:
> > 1) Inelegant.  The whole pipe paradigm is so you can stick
> > your filter anywhere in the pipe, and it'll do its thing.  At
> > present 'sponge' only works at the caboose, with
> > filename.
>
> This is a bug in /bin/sh.

Let's call it a feature!  Would it be desirable to have a pipeline work in series?

    % time seq 1 11111111 > /dev/null
    real    0m24.711s
    user    0m22.125s
    sys     0m0.372s

A program like 'head', for example, is fast with the current method:

    % time seq 1 11111111 | head > /dev/null
    real    0m0.068s
    user    0m0.028s
    sys     0m0.004s

...while 'tail' must wait for the EOF:

    % time seq 1 11111111 | tail > /dev/null
    real    0m29.774s
    user    0m24.174s
    sys     0m3.688s

So parallel processing is a great feature of pipes.  Still, suppose there were a switch or a "series pipe", something that did what 'sponge' does, anywhere.  If it had a symbol like "-|-", the 'head' example above would look like:

    % time seq 1 11111111 -|- head > /dev/null

...and 'head' would have to wait for EOF, just like 'tail'.  (Silly thing to make 'head' do, but it illustrates the point.)  Some other shell may already have something like that, for all I know.

Or maybe you're saying the bug is just in the output redirection operator '>', and not in the '|'.  That is, you'd like parallel pipes '|', but a delayed-action, serial output '>'?

> > 2) The docs don't mention that '... | sponge | ...' is
> > useless.  Better to return an error if there's no filename.
>
> It's not useless. It does what the man page says it does.
>
> Consider:
>
> svn diff | sponge | patch -R -p0

Sorry, not an easy example for me: IANADD and don't know 'svn' and 'patch' or their syntax.  Others might appreciate it more.

> Here the whole diff is generated before patch is allowed to modify the
> same files that are being diffed.

So the key to that 'svn' example being useful is that there's no output file given?
In which case 'sponge' is a util like 'sort', which also takes everything in first:

    % time seq 1 11111111 | sort | head > /dev/null
    real    0m22.277s
    user    0m11.929s
    sys     0m1.244s

...so 'head' has to wait, because 'sort' eats the whole pipe.

> > 3) 'man sponge' states "...sponge soaks up all its input
> > before opening the output file..."; which implies standard output,
>
> The shell doesn't care when sponge opens stdout. As soon as it sees a
> redirection to a file, it will open, and zero, the file.

For novices, I'd rephrase that:

    "The shell doesn't care when sponge opens stdout, because by then
    it's too late.  First it sees a redirection to a file, and that's
    when it will open, and zero, the file."

Before I got your reply, I wrote an ad hoc shell script (attached), which doesn't work with the '>', for exactly that reason.  Demo:

    % seq 1 3 | mysponge.sh
    1
    2
    3

But...

    % F=/tmp/nums ; seq 1 100000 > $F ; wc $F ; mysponge.sh < $F | sort > $F ; wc $F
    100000 100000 588895 /tmp/nums
    0 0 0 /tmp/nums

Summing up: I was wrong to call plain 'sponge' in a pipe "useless"; 'sponge' is an even better util than it first seemed!  That being said, this business about '>' isn't very obvious, so the docs could use some examples (simple ones, no DD 'svn' stuff), along with explanations of when the pipeline format is best, and when the caboose format is the way to go.

Thanks for the corrections and the util itself.

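
P.S. For readers who want to see the '>' behaviour on its own, without 'sponge' or my script in the way, here is a minimal sketch.  It assumes 'mktemp' and 'seq' are available; the variable names F, before and after are just mine.  It uses a single command rather than a pipeline, so the result doesn't depend on timing.

```shell
#!/bin/sh
# Sketch: the shell truncates a redirection target before the command runs.
F=$(mktemp) || exit 1        # scratch file; the name is arbitrary

seq 1 5 > "$F"
before=$(wc -l < "$F" | tr -d ' ')

# The shell opens (and zeroes) "$F" for writing first, and only then
# starts sort, which finds an empty file to read.
sort -r "$F" > "$F"
after=$(wc -l < "$F" | tr -d ' ')

echo "before: $before lines, after: $after lines"
rm -f "$F"
```

The file goes in with five lines and comes out empty, which is the same thing that happened to /tmp/nums above: no filter in the pipeline can protect a file that the shell has already zeroed.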
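#!/bin/sh
# mysponge.sh: soak up all of stdin, then copy it to stdout.

T=/tmp/mysponge$$    # temp file, named after this shell's PID

showhelp() { echo "Usage: \"... | $0 | ... \", or \" ... | $0 > foo\"" ; }

# Remove the temp file if it exists, then exit with the given status.
bail() { [ -f "$T" ] && rm "$T" ; exit "$1" ; }

if [ $# -gt 0 ]
then
    showhelp
    bail 1
fi

# Read until EOF before writing anything to stdout.
cat > "$T" || bail 1
wait

if cat "$T"
then
    bail 0
else
    bail 1
fi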
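
For comparison, here is a rough sketch of why the "caboose with filename" form can succeed where '>' fails: the file name is passed as an argument, so the helper itself decides when to open it, and it does so only after reading stdin to EOF.  The function name 'soak' is hypothetical and this is not how the real 'sponge' is implemented; it only imitates the behaviour the man page describes, and it assumes 'mktemp' is available.

```shell
#!/bin/sh
# soak FILE: hypothetical helper, NOT the real sponge implementation.
soak() {
    tmp=$(mktemp) || return 1
    # Read stdin to EOF first; FILE is untouched during this step.
    if cat > "$tmp"; then
        # Only now is FILE opened (and truncated) for writing.
        cat "$tmp" > "$1"
        status=$?
    else
        status=1
    fi
    rm -f "$tmp"
    return "$status"
}

F=$(mktemp) || exit 1
seq 1 5 > "$F"

# The file name is an argument here, not a '>' target, so the shell
# never zeroes it behind sort's back.
sort -r "$F" | soak "$F"

first=$(head -n 1 "$F")
count=$(wc -l < "$F" | tr -d ' ')
echo "$count lines, first is $first"
rm -f "$F"
```

Here the file keeps all five lines, now in reverse order, because nothing opens it for writing until 'sort' has finished reading it.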