I think some optimizations will always be possible, but you won't get
any dramatic improvements.
What I would do is something like this:
First, make sure that all the data in the file is sorted.
Then create a sequence array of all the required numbers; in your
example that was all numbers from 1..10.
#!/usr/bin/perl
use strict;
use warnings;

# Walk the sorted ids and the full expected sequence in step;
# any sequence number we have to skip over is missing from the data.
my @full_sequence = (1 .. 10);
my @missing;
my $i = 0;

while (<DATA>) {
    chomp;
    # catch the sequence up to this id, collecting everything skipped
    while ($full_sequence[$i] != $_) {
        push @missing, $full_sequence[$i++];
    }
    $i++;
}

print "MISSING ARE @missing\n";
exit 0;

__DATA__
1
2
3
4
5
9
10
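For the real 70..79 million range you probably don't want to build the
whole sequence array in memory at all. Here is a minimal streaming
sketch of the same idea that just counts the gaps; it assumes the dump
is sorted ascending with no duplicates, and the filename ms_ids.txt is
only a placeholder:

#!/usr/bin/perl
use strict;
use warnings;

# Streaming version: never builds the 9-million-element sequence,
# just counts the gaps between consecutive ids as they stream past.
# Assumes the dump is sorted ascending with no duplicate ids.
my $expected = 70_000_000;    # first id that should be present
my $missing  = 0;

open my $fh, '<', 'ms_ids.txt' or die "Couldn't open data file: $!\n";
while (my $id = <$fh>) {
    chomp $id;
    $missing += $id - $expected;    # everything skipped before this id
    $expected = $id + 1;            # next id we expect to see
}
close $fh;

# anything missing at the tail end of the range
$missing += 79_000_000 - $expected + 1 if $expected <= 79_000_000;

print "Missing: $missing\n";

It reads the file one line at a time, so memory use stays constant no
matter how many ids the file holds.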
Just try those out; I hope they work better for you. In fact, your
requirement is very data specific; it will hardly make any difference
whether you code it in Perl or in C.
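And if you do fall back to the SQL route you mention below, you may not
even need the second table just to count the missing ids: if the ids are
unique, the number missing is simply the size of the range minus the
rows actually present. A rough DBI sketch, where the DSN, the table name
ms_ids, and the column name id are all placeholders for your schema:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# All names here (DSN, credentials, table, column) are placeholders.
# With unique ids, missing = range size - rows present in the range.
my $dbh = DBI->connect('dbi:ODBC:backup_server', 'user', 'password',
                       { RaiseError => 1 });

my ($present) = $dbh->selectrow_array(q{
    SELECT COUNT(*) FROM ms_ids
    WHERE id BETWEEN 70000000 AND 79000000
});

my $missing = (79_000_000 - 70_000_000 + 1) - $present;
print "Missing: $missing\n";

$dbh->disconnect;

That only gives you the count, not which ids are missing, but it matches
your goal of getting a sense of the number of records missing.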
Bye
Ram
On Thu, 2004-05-13 at 04:09, Larry Wissink wrote:
> I have a problem that I thought would be perfect for Perl, except that I
> seem to be using all my system resources to run it. Of course this
> probably means I'm doing it the wrong way...
>
> The problem:
>
> We have a backup server that is missing records from the production
> server for a particular table. We know that it should have sequential
> records and that it is missing some records. We want to get a sense of
> the number of records missing. So, we know the problem started around
> the beginning of March at id 70,000,000 (rounded for convenience).
> Currently we are at 79,000,000. So, I dumped to a file all the ids
> between 70,000,000 and 79,000,000 (commas inserted here for
> readability). I need to figure out what numbers are missing. The way
> that seemed easiest to me was to create two arrays. One with every
> number between 70 and 79 million, the other with every number in our
> dump file. Then compare them as illustrated in the Perl Cookbook using
> a hash.
>
> The simple script I came up with works fine with a test file of just 10
> records.
>
> But, when I try to scale that to 9 million records, it doesn't work.
> This is probably because it is trying to do something like what db
> people call a Cartesian join (every record against every record).
>
> So, does anybody have a suggestion for a better way to do it in Perl?
>
> I'll probably end up doing it in SQL if I can't come up with a Perl
> solution. (Create a second table like the first array with every number
> between 70 and 79 million, and join the two tables.)
>
> Larry
>
> [EMAIL PROTECTED]
>
> script:
>
> use strict;
> use warnings;
>
> my %seen;
> my @list = ();
> my @missing;
> my @ids = ();
> my $lis;
> my $item;
>
> foreach $lis (1 .. 10) { # sample list of 10
>     push(@ids, $lis);
> }
>
> open(DATA, "< ms_ids_test.txt") or die "Couldn't open data file: $!\n";
> # create file like below
>
> while (<DATA>) {
>     chomp;
>     push(@list, $_);
> }
>
> # hash slice: mark every id that appears in the dump file as seen
> @seen{@list} = ();
>
> foreach $item (@ids) {
>     push(@missing, $item) unless exists $seen{$item};
> }
>
> print scalar(@missing);
>
> #sample file (without the pounds)
> #1
> #2
> #3
> #4
> #5
> #9
> #10
> # note missing 6,7,8
> # result is 3