Hello PHP-list,

I'm building a script that provides a honeypot of invalid email
addresses for spambots. For this I want to offer a template macro of
the form %rand[x]-[y]%, where [x] and [y] are integers giving the
minimum and maximum length of the random string.
My first thoughts were about the best way to parse the %rand..% part,
but I have now reached the point where I could also use suggestions
on the random string generation.
The generator only does very basic word building, and the only
meaningful word I have spotted in its output was 'Java'.. *g

Generating a random string of length 1,000,000 takes about 13 seconds
on my XP 1600+. That's quite a lot, IMHO, so suggestions are very
welcome.
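
For comparison, here is a minimal sketch of a stripped-down generator,
assuming PHP 7 or later. The name random_letters() is made up for
illustration. It drops the punctuation and capitalization rules and only
keeps the "no two vowels in a row" idea, computing the table sizes once
and using mt_rand() instead of rand(). Whether this is actually faster
has not been measured here, so treat it as a starting point for
benchmarking only.

```php
<?php
// Hypothetical helper (not part of the script in this post):
// builds $length lowercase letters, never placing two vowels in a row.
function random_letters(int $length): string
{
    $vowels     = 'aeiou';
    $consonants = 'bcdfghjklmnpqrstvw';
    $vmax = strlen($vowels) - 1;      // computed once, not on every iteration
    $cmax = strlen($consonants) - 1;

    $out = '';
    $lastWasVowel = false;
    for ($i = 0; $i < $length; $i++) {
        if (!$lastWasVowel && mt_rand(0, 10) > 6) {
            $out .= $vowels[mt_rand(0, $vmax)];
            $lastWasVowel = true;
        } else {
            $out .= $consonants[mt_rand(0, $cmax)];
            $lastWasVowel = false;
        }
    }
    return $out;
}

echo random_letters(40), "\n";
```

Timing this against the full generator on the same length would show how
much of the 13 seconds comes from the per-character bookkeeping.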

the script goes here, ready to copy'n'paste:

--------------------------------------------------------------
// start timer ($this is not available outside a class, so use a plain variable)
list($low, $high) = explode(" ", microtime());
$timerstart = $high + $low;

function parserandstr($toparse){
 $debug = 0;

 $new = '';
 $ch = array(
    'punct' => array('.', '.', '.', '..', '!', '!!', '?!'),
    'sep' => array(', ', ' - '),
    'vocal' => array('a', 'e', 'i', 'o', 'u'),
    'cons_low' => array('x', 'y', 'z'),
    'cons_norm' => array('b', 'c', 'd', 'f', 'g', 'h', 'j', 'k',
                 'l', 'm', 'n', 'p', 'q', 'r', 's', 't', 'v', 'w')
 );
 while ( ($pos = strpos($toparse, '%rand')) !== FALSE){
  if ($debug) echo '<br><br>$pos: ' . $pos;
  $new .= substr($toparse, 0, $pos);
  if ($debug) echo '<br>$new: "' . $new . '"';

  $toparse = substr($toparse, $pos + 5);
  if ($debug) echo '<br>$toparse: "' . $toparse . '"';
  
  $posclose = strpos($toparse, '%', 1);
  if ($debug) echo '<br>$posclose: "' . $posclose . '"';
  if ($posclose){
   $rlength = substr($toparse, 0, $posclose);
   if ($debug) echo '<br>$rlength: "' . $rlength . '"';
   
   $possep = strpos($rlength, '-');
   // cast to int so that malformed bounds cannot reach rand() as strings
   $minrlen = (int) substr($rlength, 0, $possep);
   $maxrlen = (int) substr($rlength, $possep + 1);
   if ($debug) echo '<br>$minrlen: "' . $minrlen . '"';
   if ($debug) echo '<br>$maxrlen: "' . $maxrlen . '"';
   
   $rlen = rand($minrlen, $maxrlen);
   
   // generate random string
   $randstr = ''; $inword = 0; $insentence = 0; $lastchar = '';
   $lastwasvocal = false;  // was read before being set in the loop below
   for($j = 0; $j < $rlen; $j++){
      if ($inword > 3 && rand(0, 5) == 1) {  // punctuation chars
       if (rand(0,5) > 0) $char = ' ';
       else {
        $char = $ch['punct'][rand(0, count($ch['punct'])-1)] . ' ';
        $j += strlen($char)-1;
        $insentence = 0;
       }
       $inword = 0;
      }
      else {
       if (!$lastwasvocal && rand(0, 10) > 6) {  // vocals
        $char = $ch['vocal'][rand(0, count($ch['vocal'])-1)];
        $lastwasvocal = true;
       } else {
          do {
           if (rand(0, 30) > 0)  // normal priority consonants
            $char = $ch['cons_norm'][rand(0, count($ch['cons_norm'])-1)];
           else $char = $ch['cons_low'][rand(0, count($ch['cons_low'])-1)];
          } while ($char == $lastchar);
          $lastwasvocal = false;
       }
       $inword++;
       $insentence++;
    }
    
    if ($insentence == 1 || ($inword == 1 && rand(0, 30) < 10))
     $randstr .= strtoupper($char);
    else $randstr .= $char;
    $lastchar = $char;
   }
   
   $new .= $randstr;
   if ($debug) echo '<br>$new: ' . $new;

   $toparse = substr($toparse, $posclose + 1);
   if ($debug) echo '<br>$toparse: "' . $toparse . '"';
  } else $new .= '%rand';
 }
 return $new . $toparse;
}
 
function pre_dump($var, $desc=''){
 echo '<pre>::'.$desc.'::<br>'; var_dump($var); echo '</pre>';
}

#$s = parserandstr('random string comes here: '
#    . '%rand10-1000%. this is a fake %rand and should not be killed..');
$s = parserandstr('%rand200000-200000%');
echo '<br><br>' . $s;
echo '<br><br>' . strlen($s);

list($low, $high) = explode(" ", microtime());
$t    = $high + $low;
printf("<br>loaded in: %.4fs", $t - $timerstart);
------------------------------------------------------------
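
On the parsing side, one alternative to the strpos()/substr() loop is a
single regular expression with a callback. The following is a minimal
sketch, assuming PHP 7 or later. The function name expand_rand_macros()
and its $generate callback parameter are hypothetical and only stand in
for whatever produces the random text. Because the pattern requires
digits, a dash, digits and a closing %, a bare "%rand" in the template is
left untouched.

```php
<?php
// Hypothetical helper: replaces every %rand[x]-[y]% macro in $template
// with the result of $generate(n), where n is a random length in [x, y].
function expand_rand_macros(string $template, callable $generate): string
{
    $result = preg_replace_callback(
        '/%rand(\d+)-(\d+)%/',
        function (array $m) use ($generate): string {
            $a = (int) $m[1];
            $b = (int) $m[2];
            // tolerate reversed bounds such as %rand9-3%
            return $generate(mt_rand(min($a, $b), max($a, $b)));
        },
        $template
    );
    // preg_replace_callback() returns null on a regex error
    return $result === null ? $template : $result;
}

// Usage with a dummy generator so the result is predictable.
$out = expand_rand_macros(
    'x %rand3-3% y %rand and %rand5-5%',
    function (int $len): string { return str_repeat('a', $len); }
);
echo $out, "\n";  // prints "x aaa y %rand and aaaaa"
```

The generator is passed in as a callback so that the parsing can be
tested separately from the random string generation.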


-- 
shinE!
http://www.thequod.de ICQ#152282665
PGP 8.0 key: http://thequod.de/danielhahler.asc

-- 
PHP General Mailing List (http://www.php.net/)
To unsubscribe, visit: http://www.php.net/unsub.php
