on Tue, Mar 27, 2001 at 02:52:51PM -0500, William T Wilson ([EMAIL PROTECTED]) 
wrote:
> On Tue, 27 Mar 2001, Miguel S. Filipe wrote:
> 
> > >>>>> I need to delete a bunch of files, all of them of the form 
> > >>>>> *.doc, scattered into several subdirectories inside a given
> > >>>>> directory. What should I do?
> > >>>> 
> > >> <snip> 
> > >> 
> > >>> Several options:
> > >>>   - Create a script.  This *is* my preferred method.
> > >>> 
> > >>>      $ find . -type f -name \*.doc | sed -e '/.*/s//rm &/' > rmscript
> > >>>      # Edit the script to make sure it's got The Right Stuff
> > >>>      $ vi rmscript
> > >>>      # run it
> > >>>      $ chmod +x rmscript; ./rmscript
> 
> I missed the original question, but I'd like to point out that this is a
> great deal of unnecessary fuss.
> 
> You can do this all with one invocation of find, skipping the script, the
> sed, and all that entirely.  Here we go:
> 
> find . -name '*doc' -exec rm {} \; -print

You did miss the original post, 'coz I covered that.

The point being, that in removing a whole slew of stuff, it's sometimes
helpful to actually _look_ at what you're doing.  In which case the -ok
option to find is IMVAO worse than useless -- after the first three
confirms, the user's running on autopilot.
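
For anyone wanting to try the look-before-you-delete approach, here is a
minimal sketch of a generator that quotes each filename, so names with
spaces or apostrophes survive.  The function name gen_rmscript and its
two arguments are made up for illustration, and it assumes no filename
contains a newline.

```shell
# Hypothetical helper: write one quoted "rm" line per *.doc file to a
# script for review, instead of deleting anything right away.
#   $1 = directory to search
#   $2 = script file to write
gen_rmscript() {
    # First sed expression: turn each ' in a name into '\'' so the name
    # can sit safely inside single quotes.
    # Second sed expression: wrap the whole name as  rm -- 'name'
    find "$1" -type f -name '*.doc' |
        sed -e "s/'/'\\\\''/g" -e "s/.*/rm -- '&'/" > "$2"
}
```

Typical use would be something like "gen_rmscript . rmscript", then
reading rmscript in a pager or editor, and only then "sh ./rmscript".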

FWIW, I've been taking advantage of shell features to do some
modifications on a few files.

Modifications:

  - Rename them from *.txt to *.html.  For this I used an explicit
    listing with an entry for each file.  Pedantic to the extreme, but
    it works.

  - Strip away DOS artifacts (carriage returns).  Correct and format HTML
    via tidy.  Rewrite URLs to point to local relative reference rather
    than the dead original site.  Create an index of key entry points to
    the files.
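
The explicit rename listing can itself be generated.  What follows is
only a sketch of the idea, not the script actually used; gen_mvscript is
a made-up name, and it assumes filenames without embedded newlines.

```shell
# Hypothetical helper: print one "mv" line per *.txt file under $1,
# renaming foo.txt to foo.html.  Output goes to stdout so it can be
# redirected to a file and reviewed before anything is run.
gen_mvscript() {
    # First sed expression: escape any ' in the name as '\''
    # Second sed expression: emit  mv -- 'name.txt' 'name.html'
    find "$1" -type f -name '*.txt' |
        sed -e "s/'/'\\\\''/g" \
            -e "s/\(.*\)\.txt\$/mv -- '\1.txt' '\1.html'/"
}
```

Something like "gen_mvscript . > mvscript" gives one entry per file,
which can be read through and then run with "sh ./mvscript".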

Total:  52 lines of code.  Well, except for the rewriting script, which
ran about 125,000 lines.  You see, it was 124,656 files (a four year
archive of a website discussion board now defunct).  Heady use of lists,
find, and xargs, along with a touch of sed.  Some five or six hours of
runtime (laptop drives are slow).
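
As a rough illustration of the find-driven cleanup, and assuming the DOS
artifacts in question are carriage returns, one pass might look like the
sketch below.  Both function names are hypothetical, the tidy and URL
rewriting steps are left out, and filenames with embedded newlines are
not handled.

```shell
# Hypothetical helper: remove carriage returns from one file in place.
strip_cr() {
    tmp=$(mktemp) || return 1
    # Write the cleaned copy to a temp file, then copy it back over the
    # original so the original's permissions are kept.
    tr -d '\r' < "$1" > "$tmp" && cat "$tmp" > "$1"
    status=$?
    rm -f "$tmp"
    return $status
}

# Hypothetical driver: apply strip_cr to every *.html file under $1.
strip_cr_tree() {
    find "$1" -type f -name '*.html' | while IFS= read -r f; do
        strip_cr "$f"
    done
}
```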

Point:  Use tools appropriate to the job.  Sometimes interactively
editing your script *is* the right way to fly.  Though in general, I
prefer producing the entire script programmatically, eyeballing it very
hard several times, and committing it.  Why?  Because programmatic
generation means consistent breakage.  If there's going to be an error,
it's going to be either:

  - Consistent.  *Every* line's going to be fscked badly.
  - Inconsistent.  In which case, the mangled command lines will stand
    out as different in a list of otherwise largely similar commands.

Either way, you're pretty sure to catch errors before they're committed.
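
The eyeballing can get some mechanical help.  A small sketch: print
every line of the generated script that does not fit the shape you
expect, so the odd ones stand out.  The name odd_lines and the sample
pattern are assumptions for illustration, not part of any standard tool.

```shell
# Hypothetical helper: show, with line numbers, the lines of script $1
# that do NOT match the extended regular expression $2.
# Exit status follows grep: 0 if odd lines were found, 1 if none were.
odd_lines() {
    grep -n -v -E -e "$2" "$1"
}
```

For a script of quoted rm commands, something like
odd_lines rmscript "^rm -- '.*\.doc'\$" should print nothing if every
line looks the way it ought to.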

Cheers.

-- 
Karsten M. Self <kmself@ix.netcom.com>    http://kmself.home.netcom.com/
 What part of "Gestalt" don't you understand?       There is no K5 cabal
  http://gestalt-system.sourceforge.net/         http://www.kuro5hin.org
