It's been 20+ years since I took a stats class...
I didn't enjoy that class, and I doubt I remember 1% of what was
covered.
Given an input Unix date like:
1132565360
And an array of Unix dates like:
array(3) {
[0]=>
int(1132565342)
[1]=>
int(1132565360)
[2]=>
int(1132565359)
}
I would like to return the input date *IF* it is "reasonable" in its
variance from the date values in the array.
E.g., the above would output: 1132565360
If, however, the input date was "0" for the same array of dates, I'd
want to get, errr... Well, okay, the values in the array...
One of those might *ALSO* be wildly wrong. :-(
So I want the "most likely candidate" for a correct date out of all
this mess, where any of the date values might be wrong.
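A minimal sketch of one way to pick that "most likely candidate", assuming the median of the array is a fair stand-in for "what most of the dates agree on". The names median() and pick_date() and the one-day tolerance are made up for illustration; none of them is a PHP built-in. It needs PHP 7 or later for intdiv() and the type declarations.

```php
<?php
// Sketch only: median() and pick_date() are hypothetical helpers,
// and the default tolerance of 86400 seconds (one day) is a guess.

/** Median of a non-empty list of integer timestamps. */
function median(array $values): float
{
    sort($values);
    $n = count($values);
    $mid = intdiv($n, 2);
    // Odd count: the middle value. Even count: mean of the two middle values.
    return ($n % 2) ? (float) $values[$mid]
                    : ($values[$mid - 1] + $values[$mid]) / 2;
}

/** Return $input if it is close to the median of $others, else the median. */
function pick_date(int $input, array $others, int $tolerance = 86400): int
{
    if (!$others) {
        return $input; // nothing to compare against
    }
    $m = median($others);
    if (abs($input - $m) <= $tolerance) {
        return $input;
    }
    return (int) round($m);
}
```

The median is used rather than the average because a single wild value (like 0) drags the average far away from everything, while the median stays put as long as more than half of the values are sane.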
Any ideas?
Is there a nice built-in "sort out this variance mess for me" function
in PHP? :-)
The somewhat obvious candidate, "stats_variance", is a bit
under-documented...
I'm not sure I'd even understand the numbers that came out of it, even
if I experimented with it and *THOUGHT* I understood the numbers
coming out of it.
And the sheer number of functions in the stats package is making my
head spin.
And I dunno if I could get the stats extension installed on the shared
server anyway.
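For what it's worth, plain variance can be computed in a few lines without any extension. The sketch below uses a hypothetical helper, population_variance(), and prints the standard deviation (the square root of the variance, in seconds) for the sample array and for the same array with one value replaced by 0.

```php
<?php
// Sketch only: population_variance() is a hypothetical helper,
// not part of the stats extension or any other library.

/** Population variance: the mean of the squared distances from the mean. */
function population_variance(array $values): float
{
    $n = count($values);
    if ($n === 0) {
        throw new InvalidArgumentException('need at least one value');
    }
    $mean = array_sum($values) / $n;
    $sumSquares = 0.0;
    foreach ($values as $v) {
        $sumSquares += ($v - $mean) ** 2;
    }
    return $sumSquares / $n;
}

$good = [1132565342, 1132565360, 1132565359];
$bad  = [0, 1132565360, 1132565359];

// Standard deviation is easier to read than variance: it is in seconds.
printf("good: %.1f seconds\n", sqrt(population_variance($good)));
printf("bad:  %.1f seconds\n", sqrt(population_variance($bad)));
```

The first figure comes out at a handful of seconds and the second at hundreds of millions, so the number does say "something is wrong here", but it does not say *which* value is the wrong one.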
Maybe I should explain the "Big Picture", eh?
Okay, so the "Big Picture" is 14,000 emails in an Inbox, that need to
be processed, and tagged with their "date".
[And a whole lot more, but not relevant to this post...]
Seems simple enough, with that Date: header.
Except when it's not there. :-(
Okay, so take the Sent: header if there's no Date: header.
Okay.
No, wait... Damn!
Some fools have their PC clock set to, like, 1970 or whatever. So
let's be generous and assume their CMOS battery has died, and they
haven't had a chance to change it. Fine. Deal with it.
Okay, so *NOW* the algorithm is to do this:
Take the Date: header, or Sent: header if no Date: header -> $whatdate
Parse the Received: headers for the MTA date-stamps -> $fromdates[]
Compare the values in $fromdates array with $whatdate.
If the variance is "too high", then ignore $whatdate, and take
the, errr, first? average? of the $fromdates[] values.
No, wait, maybe I should check the variance within $fromdates first, in
case some stupid MTA server has a bad clock?
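The steps above could be sketched roughly like this. Everything here is an assumption for illustration: the function names, the one-day tolerance, the idea that the date in a Received: header is whatever follows the last semicolon, and the choice of the earliest surviving MTA stamp as the fallback (on the theory that the first hop is closest to the actual send time).

```php
<?php
// Sketch only: all function names and the 86400-second tolerance are
// hypothetical choices, not an established API.

/** Parse the date after the last ';' of each Received: header value. */
function received_timestamps(array $receivedHeaders): array
{
    $stamps = [];
    foreach ($receivedHeaders as $header) {
        $pos = strrpos($header, ';');
        if ($pos === false) {
            continue; // no date part found
        }
        // Strip trailing comments such as "(UTC)" before parsing.
        $dateText = trim(preg_replace('/\([^)]*\)/', '', substr($header, $pos + 1)));
        $ts = strtotime($dateText);
        if ($ts !== false) {
            $stamps[] = $ts;
        }
    }
    return $stamps;
}

/** Keep only the stamps within $tolerance seconds of the middle stamp. */
function drop_outliers(array $stamps, int $tolerance = 86400): array
{
    if (count($stamps) < 3) {
        return $stamps; // too few values to tell which one is wrong
    }
    $sorted = $stamps;
    sort($sorted);
    $middle = $sorted[intdiv(count($sorted), 2)]; // upper middle for even counts
    return array_values(array_filter(
        $stamps,
        function ($ts) use ($middle, $tolerance) {
            return abs($ts - $middle) <= $tolerance;
        }
    ));
}

/** Trust $whatdate only if it agrees with the surviving MTA stamps. */
function best_date(?int $whatdate, array $fromdates, int $tolerance = 86400): ?int
{
    $trusted = drop_outliers($fromdates, $tolerance);
    if (!$trusted) {
        return $whatdate; // nothing to compare against
    }
    $earliest = min($trusted);
    if ($whatdate !== null && abs($whatdate - $earliest) <= $tolerance) {
        return $whatdate;
    }
    return $earliest;
}
```

Usage would be something like best_date($whatdate, received_timestamps($receivedHeaders)), where $receivedHeaders holds the raw Received: header values of one message. Messages where best_date() returns null, or where drop_outliers() throws away most of the stamps, are probably the ones to set aside for processing by hand.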
Any advice?
Anybody got a good "variance" function to do what I'm trying to do?
Am I on the entirely wrong path here?
Sheesh!
We may just ignore any obviously wrong dates, and process those by
hand...
--
PHP General Mailing List (http://www.php.net/)