On Wed, 8 Nov 2006 15:23:44 -0500 Joey Hess <[EMAIL PROTECTED]> wrote:
> > # usage: URLgrep pattern URL
> > # (ad hoc grep switches return first instance of 'pattern'
> > # in URL and next line, with numbered lines.)
> > URLgrep() { wget -o /dev/null --output-document=- "$2" |
> >     html2text -ascii -nobs | grep -in -A 1 -m 1 "$1" ; }
>
> The fact that this can be implemented as a simple shell pipeline (or
> more likely, as many different shell pipelines, depending on exact
> need) is a good indication that it's not a good candidate for
> moreutils.

I think you're half right, if not for the same reasons, but let's start
with the other half...

The premise that a util should not exist because it is comparatively
easy to implement seems questionable. Why have a 'head' or 'tail'
command if a 'sed' or 'awk' one-liner can do the same? It's not likely
that ordinary users would remember byzantine switchery like
'-o /dev/null --output-document=- "$2"' or '-ascii -nobs'. I can't, yet
I wrote it; it doesn't seem "simple" to me. It took a lot of trial and
error to find switches that worked. Even knowing that the 'wget' and
'html2text' utils exist, or that they can be piped together, isn't
something to take for granted.

On the other hand, you're right to notice that 'URLgrep' is too ad hoc.
That got me thinking: the flaw of 'URLgrep' is that it's not general
enough. A 'URLcat' would be much more general:

    URLcat() { wget -o /dev/null --output-document=- "$1" |
        html2text -ascii -nobs ; }

(I guess if it's called 'cat', there probably should be a loop so it
can take a list of URLs and actually 'cat' them; see the sketch
below...)

Then we can pipe that to 'grep' or 'wc' or just about anything, and
have all the command line switches of those utils for free. It's even
"simpler" than before and yet a much more versatile tool.
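Something like this might do for the looping version; an untested
sketch, assuming 'wget' and 'html2text' take the same switches as
above:

    URLcat() {
        # cat-like: render each URL argument as plain text and
        # concatenate the results on stdout
        local url
        for url in "$@" ; do
            wget -o /dev/null --output-document=- "$url" |
                html2text -ascii -nobs
        done
    }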
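With that in place, the old 'URLgrep' behavior falls out as a plain
pipe (the URL here is just a placeholder), e.g.:

    URLcat http://www.example.com/ | grep -in -A 1 -m 1 'pattern'

and the same trick gives word counts and the rest for free:

    URLcat http://www.example.com/ | wc -w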