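// Check for binaries: look for control characters 14 through 26 in $read
$ckbin = 14;
while ($ckbin <= 26) {
    $ck = chr($ckbin);
    $cbin = substr_count($read, $ck);
    if ($cbin > 0) {
        echo "Killing off binary file URL: $url\n";
        // Drop this URL from the search table
        $kill = mysql_unbuffered_query("DELETE FROM search WHERE url_id='$url_id'");
        // Move on to the next pass of the enclosing loop (not shown here)
        continue 2;
    }
    ++$ckbin;
}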
I know it looks kind of funky out of context, but it works really great.
Nick
Richard Davey wrote:
Hello Evan,
Monday, February 23, 2004, 8:57:43 PM, you wrote:
It would be wise to check for characters from 0 to 31; if they appear,
then it's almost certainly (but not guaranteed) binary.
EN> Assuming that's decimal, you're including 0x09, 0x0a and 0x0d, which are,
EN> respectively, tab, line feed, and carriage return. That's off the top of my
EN> head, which means two things: (1) I may be forgetting something, and (2) I
EN> need a life ;)
Let me rephrase - check for the existence of characters 0 through 31 and count how many there are. Pick a threshold percentage yourself and have your script decide whether that count is high enough to call the file binary.
The count_chars() function will be absolutely ideal for this.
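A rough sketch of that idea, to make it concrete. The function names control_char_ratio() and looks_binary() are made up for illustration, the 5% default threshold is an arbitrary guess rather than a recommended value, and leaving tab, line feed and carriage return out of the count is an assumption based on Evan's point above.

```php
<?php
// Hypothetical helper: fraction of bytes in $data that are control
// characters (0-31), not counting tab (9), line feed (10) or
// carriage return (13), which are normal in text files.
function control_char_ratio($data)
{
    $length = strlen($data);
    if ($length === 0) {
        return 0.0;
    }

    // Mode 1 returns only the byte values that actually occur,
    // as an array of byte value => number of occurrences.
    $counts = count_chars($data, 1);

    $control = 0;
    foreach ($counts as $byte => $count) {
        if ($byte < 32 && !in_array($byte, array(9, 10, 13), true)) {
            $control += $count;
        }
    }

    return $control / $length;
}

// Hypothetical wrapper: the 0.05 default is an assumption, tune it yourself.
function looks_binary($data, $threshold = 0.05)
{
    return control_char_ratio($data) > $threshold;
}
```

In Nick's loop this could stand in for the character-by-character scan, e.g. calling looks_binary($read) before deciding to delete the URL. Adjust the threshold to suit the kind of pages you are indexing.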