Hi list,
I already sent this to Ben but forgot to copy the list. Below is a
small script for a *nix system that finds and pretty-prints a list of
likely duplicate files (same base name and same size). Maybe it will
help someone trying to do the same thing.
--Josh
---cut--- COMMAND LINE ---cut---
find ./perl-5.6.1 -type f -ls | perl dup_find.pl
---cut--- dup_find.pl ---cut---
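#!/usr/bin/perl -w
use strict;

# Keyed by "name/size". 'orig' is the first line seen with that
# signature; 'dups' collects every later line with the same one.
my %files;

#
# expects input from a 'find . -type f -ls' command like so:
#
# 2326616 4 -r--r--r-- 1 josh users 3619 Jan 9 11:46 ../perl-5.6.1/lib/XSLoader.pm
#
# Note: only base name and size are compared, never file contents.
#
while (<>) {
    chomp;
    my $line = $_;
    # The limit of 11 keeps any spaces in the path inside $f[10].
    my @f = split(' ', $line, 11);
    next unless defined $f[10];    # skip blank or short lines
    # Base name: everything after the last '/', or the whole path if none.
    my ($file_name) = $f[10] =~ m{([^/]+)$};
    next unless defined $file_name;
    # '/' cannot occur in a base name, so it separates name and size
    # unambiguously ("foo1" . 23 and "foo" . 123 no longer collide).
    my $sig = "$file_name/$f[6]";
    if (exists $files{$sig}) {
        push @{$files{$sig}{'dups'}}, $line;
    } else {
        $files{$sig}{'orig'} = $line;
        $files{$sig}{'dups'} = [];
    }
}

foreach my $sig (sort keys %files) {
    my $orig = $files{$sig}{'orig'};
    my @dups = @{$files{$sig}{'dups'}};
    next unless @dups;    # only report names that occur more than once
    # Trim leading and trailing whitespace before printing.
    foreach ($orig, @dups) {
        s/^\s+//;
        s/\s+$//;
    }
    print "File: $orig\n";
    print "Duplicate: ";
    print join("\nDuplicate: ", @dups);
    print "\n\n";
}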
---cut--- OUTPUT ---cut---
File: 1310726 56 -r--r--r-- 1 josh users 49651 Apr 6 2001 ./perl-5.6.1/ext/B/B/C.pm
Duplicate: 3653636 56 -r--r--r-- 1 josh users 49651 Apr 6 2001 ./perl-5.6.1/lib/B/C.pm

File: 1310727 60 -r--r--r-- 1 josh users 56243 Mar 19 2001 ./perl-5.6.1/ext/B/B/CC.pm
Duplicate: 3653641 60 -r--r--r-- 1 josh users 56243 Mar 19 2001 ./perl-5.6.1/lib/B/CC.pm

File: 1310728 28 -r--r--r-- 1 josh users 25562 Apr 8 2001 ./perl-5.6.1/ext/B/B/Concise.pm
Duplicate: 3653638 28 -r--r--r-- 1 josh users 25562 Apr 8 2001 ./perl-5.6.1/lib/B/Concise.pm

....
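
If your find doesn't support -ls, or you'd rather stay in Perl, here is a
rough sketch of the same idea using File::Find. It is not a drop-in
replacement for dup_find.pl: the sub name find_dups and the file name
find_dups.pl are made up for illustration, and like the script above it
only compares base name and size, not contents.

```perl
#!/usr/bin/perl -w
use strict;
use File::Find;

# Hypothetical helper (not part of dup_find.pl): walk the given
# directories and group regular files by base name plus size.
# Returns a hash ref: "name/size" => [ path, path, ... ],
# containing only the groups with two or more paths.
sub find_dups {
    my @dirs = @_;
    my %seen;
    find(sub {
        # Inside the callback $_ is the base name and
        # $File::Find::name is the path from the start directory.
        my @st = lstat $_;
        return unless @st && -f _;   # regular files only, skip symlinks
        my $size = $st[7];           # size in bytes
        # '/' cannot occur in a base name, so it is a safe separator
        push @{ $seen{"$_/$size"} }, $File::Find::name;
    }, @dirs);

    my %dups;
    foreach my $sig (keys %seen) {
        $dups{$sig} = $seen{$sig} if @{ $seen{$sig} } > 1;
    }
    return \%dups;
}

# Only report when directories are given on the command line,
# e.g.: perl find_dups.pl ./perl-5.6.1
if (@ARGV) {
    my $dups = find_dups(@ARGV);
    foreach my $sig (sort keys %$dups) {
        my ($orig, @rest) = sort @{ $dups->{$sig} };
        print "File: $orig\n";
        print "Duplicate: $_\n" foreach @rest;
        print "\n";
    }
}
```

File::Find descends into subdirectories on its own, so there is no need to
keep a separate list of directories to loop over. If name and size are
too loose a test for your files, you could compare a checksum of the
contents within each group before trusting the result.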
----- Original Message -----
From: "Ben Crane" <[EMAIL PROTECTED]>
To: <[EMAIL PROTECTED]>
Sent: Thursday, January 10, 2002 8:39 AM
Subject: traversing a file tree...one step further
> Hi list,
>
> I have a program running that opens a text file,
> finds file names within the text file and then runs
> File::Find to determine the location of these files
> (within the same directory). My next question is: if I
> want to write a list of all the files within more than
> one directory, how do I do it?
>
> I have the initial start directory; it locates all the
> files within it and prints them out. But within the
> directory are more subdirectories. What I want is to
> produce a list of every file within the main directory
> and the subdirectories.
>
> Why? The files in these directories change on a
> constant basis and I want a text file (updated every
> day) to determine what's there and what isn't. It's
> part of our corporate website and simple file
> management is becoming very hard.
>
> I was thinking of using file::depth but am not
> entirely sure if it's the right solution. My next idea
> was to put a list of subdirectories in an array and
> then loop through the array, opening each respective
> subdirectory and printing the files within. At the
> moment my text file returns a set of files within the
> directory and a list of subdirectories. If this
> method is simple, how do I dump info into an array for
> later use (the array will have data added to it whilst
> inside a foreach(..) loop)?
>
> Any/all help would be appreciated.
>
> --
> To unsubscribe, e-mail: [EMAIL PROTECTED]
> For additional commands, e-mail: [EMAIL PROTECTED]
>