Jim Lux
(818)354-2075 (office)
(818)395-2714 (cell)


-----Original Message-----
From: Beowulf [mailto:beowulf-boun...@beowulf.org] On Behalf Of Joe Landman
Sent: Tuesday, March 05, 2019 6:32 AM
To: beowulf@beowulf.org
Subject: Re: [Beowulf] Large amounts of data to store and process


On 3/4/19 8:00 PM, Lux, Jim (337K) via Beowulf wrote:
> I'm munging through not very much satellite telemetry (a few GByte), using 
> sqlite3..
> Here's some general observations:
> 1) if the data is recorded by multiple sensor systems, the clocks will *not* 
> align - sure they may run NTP, but....
> 2) Typically there's some sort of raw clock being recorded with the data (in 
> ticks of some oscillator, typically) - that's what you can use to put data 
> from a particular batch of sources into a time order.  And then you have the 
> problem of reconciling the different clocks.
> 3) Watch out for leap seconds in time stamps - some systems have them (UTC), 
> some do not (GPS, TAI) - a time of 23:59:60 may be legal.
> 4) you need to have a way to deal with "missing" data, whether it's time 
> tags, or actual measurements - as well as "gaps in the record"
> 5) Be aware of the need to de-dupe data - same telemetry records from 
> multiple sources.
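
As a rough illustration of points 4 and 5 above, here is a minimal sketch using sqlite3 from Python. The table layout, the column names (tick, value, source), and the rule that two rows are "the same record" when tick and value match are all assumptions made for the example, not anything taken from the actual telemetry.

```python
import sqlite3


def load_deduped(records):
    """Load (source, tick, value) rows into an in-memory table.

    Assumption: a record received via two sources carries the same raw
    tick and the same value, so UNIQUE (tick, value) keeps one copy.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE telemetry ("
        "tick INTEGER NOT NULL, value REAL NOT NULL, source TEXT, "
        "UNIQUE (tick, value))"
    )
    # INSERT OR IGNORE silently drops rows that violate the UNIQUE constraint.
    conn.executemany(
        "INSERT OR IGNORE INTO telemetry (source, tick, value) VALUES (?, ?, ?)",
        records,
    )
    conn.commit()
    return conn


def find_gaps(conn, expected_step):
    """Return (last_tick_before_gap, first_tick_after_gap) pairs."""
    ticks = [
        row[0]
        for row in conn.execute("SELECT DISTINCT tick FROM telemetry ORDER BY tick")
    ]
    return [(a, b) for a, b in zip(ticks, ticks[1:]) if b - a > expected_step]
```

With records arriving from two hypothetical ground stations, the duplicate is dropped on insert and find_gaps(conn, 10) reports any stretch where consecutive ticks are more than 10 apart.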

Being satellite data, I am assuming you have relativistic corrections to the 
time, depending upon orbit, accuracy of the clock, data analysis needs, etc. 
[1][2]

Missing data of various types can be handled in data frame packages. R, Julia, 
and I think Python can all handle this without too much pain.

[1] https://gssc.esa.int/navipedia/index.php/Relativistic_Clock_Correction

[2] http://www.astronomy.ohio-state.edu/~pogge/Ast162/Unit5/gps.html


--
No need to deal with relativity in these cases, but propagation delay, 
certainly.
Typically, the issue is reconciling multiple telemetry streams with recording 
rates under 1000 Hz, where the systems doing the recording do not have 
synchronized clocks.

A lot of spacecraft debugging is checking "did message A leave box B and arrive 
at box C", where Box B and Box C are timestamping the events.  So you look and 
see if you have a "Sending Message A" entry in the Box B log and a "Receiving 
Message A" entry in the Box C log, in the right order, and with the right 
delay - if Box B is on the Moon and Box C is on Earth, then one expects a delay 
of roughly 1.3 seconds (the one-way light time).
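
A minimal Python sketch of that check follows. The log format (a dict from message id to timestamp in seconds), the function names, and the 50 ms tolerance are assumptions for illustration. It also assumes both logs are already on a common timescale, which is the hard part in practice.

```python
C_KM_PER_S = 299_792.458  # speed of light in vacuum, km/s


def expected_delay_s(distance_km):
    """One-way light time over the given distance."""
    return distance_km / C_KM_PER_S


def check_delivery(send_log, recv_log, distance_km, tolerance_s=0.05):
    """Compare send and receive timestamps for each message id.

    send_log / recv_log: dict of message id -> timestamp in seconds
    (hypothetical format; both assumed to share one timescale).
    """
    expected = expected_delay_s(distance_km)
    report = {}
    for msg_id, t_sent in send_log.items():
        t_recv = recv_log.get(msg_id)
        if t_recv is None:
            report[msg_id] = "never received"
        elif t_recv < t_sent:
            report[msg_id] = "received before sent"
        elif abs((t_recv - t_sent) - expected) > tolerance_s:
            report[msg_id] = "unexpected delay"
        else:
            report[msg_id] = "ok"
    return report
```

For a mean Earth-Moon distance of about 384,400 km, expected_delay_s gives roughly 1.28 seconds. A "received before sent" result usually says more about the clocks than about the message.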

One would like to do operations like "give me the telemetry from 12:01:00Z to 
12:03:00Z" but there might not be a valid GPS time in that range, so you have 
to estimate it from the local clock, an estimate of the clock rate, and a known 
GPS time hack outside that range. The challenge comes when the boxes have 
clocks that run at different rates - a 50 ppm error is about 4.3 seconds/day.  
Or when there are multiple clocks (you might have a local clock and also a GPS, 
but the GPS doesn't always work).
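
For what it's worth, the estimate described above can be sketched in a few lines of Python. This is only a linear clock model under stated assumptions: the class and function names are made up for the example, the rate error is treated as constant between hacks, and the GPS-to-UTC leap second offset is ignored.

```python
from dataclasses import dataclass


@dataclass
class ClockModel:
    hack_tick: int               # local tick count at the GPS time hack
    hack_gps_s: float            # GPS time (seconds) at that hack
    nominal_hz: float            # nominal oscillator frequency
    rate_error_ppm: float = 0.0  # estimated frequency error, parts per million

    def tick_to_gps(self, tick):
        """Extrapolate GPS time from a local tick count."""
        actual_hz = self.nominal_hz * (1 + self.rate_error_ppm * 1e-6)
        return self.hack_gps_s + (tick - self.hack_tick) / actual_hz


def estimate_rate_error_ppm(hack1, hack2, nominal_hz):
    """Estimate the rate error from two (tick, gps_s) time hacks."""
    (tick1, gps1), (tick2, gps2) = hack1, hack2
    measured_hz = (tick2 - tick1) / (gps2 - gps1)
    return (measured_hz / nominal_hz - 1) * 1e6


def drift_seconds_per_day(ppm):
    """Accumulated time error per day for a given rate error."""
    return ppm * 1e-6 * 86_400


def select_range(model, ticks, start_gps_s, end_gps_s):
    """Ticks whose estimated GPS time falls inside [start, end]."""
    return [t for t in ticks if start_gps_s <= model.tick_to_gps(t) <= end_gps_s]
```

drift_seconds_per_day(50) works out to 4.32 seconds. With more than two hacks, a least-squares fit of tick against GPS time would be the natural next step.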

There's probably a Python library to handle this, but I've not found it yet.