I think some optimizations will always be possible, but you won't get
any dramatic improvements.
What I would do is something like this:
First, make sure that all the data in the file is sorted.
Then create a sequence array of all the required numbers; in your
example that was all numbers from 1..10.
#!/usr/bin/perl
use strict;
use warnings;

# Walk the sorted ids and the full expected sequence in step;
# any sequence number we have to skip over is missing from the data.
my @full_sequence = (1 .. 10);
my @missing;
my $i = 0;

while (<DATA>) {
    chomp;
    # catch the sequence up to this id, collecting everything skipped
    while ($full_sequence[$i] != $_) {
        push @missing, $full_sequence[$i++];
    }
    $i++;
}

print "MISSING ARE @missing\n";
exit 0;

__DATA__
1
2
3
4
5
9
10
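For the real 70..79 million range you probably don't want to build the
whole sequence array in memory at all. Here is a minimal streaming
sketch of the same idea that just counts the gaps; it assumes the dump
is sorted ascending with no duplicates, and the filename ms_ids.txt is
only a placeholder:

#!/usr/bin/perl
use strict;
use warnings;

# Streaming version: never builds the 9-million-element sequence,
# just counts the gaps between consecutive ids as they stream past.
# Assumes the dump is sorted ascending with no duplicate ids.
my $expected = 70_000_000;    # first id that should be present
my $missing  = 0;

open my $fh, '<', 'ms_ids.txt' or die "Couldn't open data file: $!\n";
while (my $id = <$fh>) {
    chomp $id;
    $missing += $id - $expected;    # everything skipped before this id
    $expected = $id + 1;            # next id we expect to see
}
close $fh;

# anything missing at the tail end of the range
$missing += 79_000_000 - $expected + 1 if $expected <= 79_000_000;

print "Missing: $missing\n";

It reads the file one line at a time, so memory use stays constant no
matter how many ids the file holds.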
Just try those out; I hope they work better for you. In fact, your
requirement is very data specific; it will hardly make any difference
whether you code it in Perl or in C.
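And if you do fall back to the SQL route you mention below, you may not
even need the second table just to count the missing ids: if the ids are
unique, the number missing is simply the size of the range minus the
rows actually present. A rough DBI sketch, where the DSN, the table name
ms_ids, and the column name id are all placeholders for your schema:

#!/usr/bin/perl
use strict;
use warnings;
use DBI;

# All names here (DSN, credentials, table, column) are placeholders.
# With unique ids, missing = range size - rows present in the range.
my $dbh = DBI->connect('dbi:ODBC:backup_server', 'user', 'password',
                       { RaiseError => 1 });

my ($present) = $dbh->selectrow_array(q{
    SELECT COUNT(*) FROM ms_ids
    WHERE id BETWEEN 70000000 AND 79000000
});

my $missing = (79_000_000 - 70_000_000 + 1) - $present;
print "Missing: $missing\n";

$dbh->disconnect;

That only gives you the count, not which ids are missing, but it matches
your goal of getting a sense of the number of records missing.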
Bye
Ram
On Thu, 2004-05-13 at 04:09, Larry Wissink wrote:
> I have a problem that I thought would be perfect for Perl, except that I
> seem to be using all my system resources to run it. Of course this
> probably means I'm doing it the wrong way...
>
> The problem:
>
> We have a backup server that is missing records from the production
> server for a particular table. We know that it should have sequential
> records and that it is missing some records. We want to get a sense of
> the number of records missing. So, we know the problem started around
> the beginning of March at id 70,000,000 (rounded for convenience).
> Currently we are at 79,000,000. So, I dumped to a file all the ids
> between 70,000,000 and 79,000,000 (commas inserted here for
> readability). I need to figure out what numbers are missing. The way
> that seemed easiest to me was to create two arrays. One with every
> number between 70 and 79 million, the other with every number in our
> dump file. Then compare them as illustrated in the Perl Cookbook using
> a hash.
>
> The simple script I came up with works fine with a test file of just 10
> records.
>
> But, when I try to scale that to 9 million records, it doesn't work.
> This is probably because it is trying to do something like what db
> people call a Cartesian join (every record against every record).
>
> So, does anybody have a suggestion for a better way to do it in Perl?
>
> I'll probably end up doing it in SQL if I can't come up with a Perl
> solution. (Create a second table like the first array with every number
> between 70 and 79 million, and join the two tables.)
>
> Larry
>
> [EMAIL PROTECTED]
>
> script:
>
> use strict;
> use warnings;
>
> my %seen;
> my @list = ();
> my @missing;
> my @ids = ();
> my $lis;
> my $item;
>
> foreach $lis (1 .. 10) { # sample list of 10
>     push(@ids, $lis);
> }
>
> open(DATA, "< ms_ids_test.txt") or die "Couldn't open data file: $!\n";
> # create file like below
>
> while (<DATA>) {
>     chomp;
>     push(@list, $_);
> }
>
> # hash slice: mark every id that appears in the dump file as seen
> @seen{@list} = ();
>
> foreach $item (@ids) {
>     push(@missing, $item) unless exists $seen{$item};
> }
>
> print scalar(@missing);
>
> #sample file (without the pounds)
> #1
> #2
> #3
> #4
> #5
> #9
> #10
> # note missing 6,7,8
> # result is 3