Apologies if the question is a bit naïve; I am a novice in time series data 
handling in R.

I have the following type of data, in long format (as the spacetime vignette 
calls it; the table also contains spatial information, not shown here):

User |  Date | Otherdata |
A | 01/01/2014 | aa
A | 01/01/2014 | bb
A | 01/01/2014 | cc
B | 01/01/2014 | aa
B | 05/01/2014 | cc
A | 07/01/2014 | aa
C | 05/02/2014 | xx
C | 20/02/2014 | yy

Etc.
[A, B, C, …] are user IDs (strings).
Date is converted into a Date format (e.g. 2013-10-15).

The table is sorted by User and then by Date, and is over 800K records long. 
There are about 20K users.

User | Date | Otherdata |
A | 2014-01-01 | aa
A | 2014-01-01 | bb
A | 2014-01-01 | cc
A | 2014-01-07 | aa
B | 2014-01-01 | aa
B | 2014-01-05 | cc
C | 2014-02-05 | xx
C | 2014-02-20 | yy
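
The conversion and sorting described above can be sketched in base R as 
follows. This is only a minimal sketch: the data frame name df and the column 
names User, Date and Otherdata are assumptions taken from the table headers, 
and the dates are assumed to be day/month/year strings.

```r
# Toy data mirroring the first example table (names are assumptions)
df <- data.frame(
  User = c("A", "A", "A", "B", "B", "A", "C", "C"),
  Date = c("01/01/2014", "01/01/2014", "01/01/2014", "01/01/2014",
           "05/01/2014", "07/01/2014", "05/02/2014", "20/02/2014"),
  Otherdata = c("aa", "bb", "cc", "aa", "cc", "aa", "xx", "yy"),
  stringsAsFactors = FALSE
)

# Parse the day/month/year strings into Date objects
df$Date <- as.Date(df$Date, format = "%d/%m/%Y")

# Sort by User, then by Date within each user
df <- df[order(df$User, df$Date), ]
rownames(df) <- NULL
```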

I want to:
Get a frequency table (and ultimately a plot) of the counts of differences (in 
days) between records of a user. That is, I would first get the unique days 
recorded for each user:

A | 2014-01-01
A | 2014-01-07
B | 2014-01-01
B | 2014-01-05
C | 2014-02-05
C | 2014-02-20
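
One possible way to get these unique user/day pairs, assuming the data sit in 
a plain data frame (here called df, a hypothetical name) with a Date column 
of class Date, is unique() on the two columns of interest:

```r
# Sorted example data (df and its column names are assumptions)
df <- data.frame(
  User = c("A", "A", "A", "A", "B", "B", "C", "C"),
  Date = as.Date(c("2014-01-01", "2014-01-01", "2014-01-01", "2014-01-07",
                   "2014-01-01", "2014-01-05", "2014-02-05", "2014-02-20")),
  Otherdata = c("aa", "bb", "cc", "aa", "aa", "cc", "xx", "yy"),
  stringsAsFactors = FALSE
)

# Keep one row per distinct (User, Date) combination
days <- unique(df[, c("User", "Date")])

# Make sure the result is ordered by User and then Date
days <- days[order(days$User, days$Date), ]
rownames(days) <- NULL
```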

Then I want to compute the differences between timestamps within each group 
defined by the user, in days:
A | 6
B | 4
C | 15
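
A rough sketch of the per-user differences in base R, assuming that 
"difference" means the gap between consecutive unique days of the same user 
(an assumption; the example only has two days per user). The object names 
days and gaps_by_user are hypothetical.

```r
# Unique (User, Date) pairs from the example (names are assumptions)
days <- data.frame(
  User = c("A", "A", "B", "B", "C", "C"),
  Date = as.Date(c("2014-01-01", "2014-01-07", "2014-01-01",
                   "2014-01-05", "2014-02-05", "2014-02-20")),
  stringsAsFactors = FALSE
)

# Split the dates by user, then take consecutive differences in days.
# A user with a single recorded day yields an empty vector.
gaps_by_user <- lapply(
  split(days$Date, days$User),
  function(d) as.numeric(diff(sort(d)), units = "days")
)

# Pool the gaps of all users into one numeric vector
delta <- unlist(gaps_by_user, use.names = FALSE)
```

With about 20K users this loops over 20K small vectors, which base R should 
handle without trouble, though I have not timed it on data of that size.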

Given that I have tens of thousands of records, I then want a table with 
the counts of the differences (across all users). In our case the differences 
would be 6, 4 and 15, each with count = 1.
In the larger sample, something like this:
DeltaDays | Count
1 | 150
2 | 320
…
N | X
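
A frequency table of this shape could be built with table(), and a simple 
plot with barplot(). This is a sketch under the assumption that the pooled 
day differences are already in a numeric vector (called delta here, a 
hypothetical name, filled with the three values from the small example).

```r
# Pooled day differences across all users (values from the small example)
delta <- c(6, 4, 15)

# Count how often each difference occurs
freq <- as.data.frame(table(DeltaDays = delta),
                      responseName = "Count",
                      stringsAsFactors = FALSE)
freq$DeltaDays <- as.integer(freq$DeltaDays)

print(freq)

# Bar chart of the counts per difference
barplot(freq$Count, names.arg = freq$DeltaDays,
        xlab = "DeltaDays", ylab = "Count")
```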

I know there are all sorts of packages for time series analysis, but I could 
not find a simple function like this (including searching here: 
http://www.statoek.wiso.uni-goettingen.de/veranstaltungen/zeitreihen/sommer03/ts_r_intro.pdf
). I assume that something working on a simple data frame would be sufficient, 
but I am happy (and would perhaps prefer) to use TS. I would appreciate any 
hints. The ultimate analysis also involves space, so hints in the direction of 
space-time are welcome. Ultimately, I would like to separate the records for 
each user into a dataset that can be handled separately, but splitting them 
into a large number of files does not seem wise. Any hints on this are also 
appreciated.
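
For the per-user separation, one option that avoids writing many files is to 
keep everything in memory as a list of data frames via split(). A minimal 
sketch, assuming a data frame called df (hypothetical name) with the columns 
shown in the tables above:

```r
# Sorted example data (df and its column names are assumptions)
df <- data.frame(
  User = c("A", "A", "A", "A", "B", "B", "C", "C"),
  Date = as.Date(c("2014-01-01", "2014-01-01", "2014-01-01", "2014-01-07",
                   "2014-01-01", "2014-01-05", "2014-02-05", "2014-02-20")),
  Otherdata = c("aa", "bb", "cc", "aa", "aa", "cc", "xx", "yy"),
  stringsAsFactors = FALSE
)

# One list element (a data frame) per user, named by user ID
by_user <- split(df, df$User)

# A single user's records can then be handled on their own
user_a <- by_user[["A"]]

# Or a function can be applied to every user in turn,
# here the number of distinct days recorded per user
n_days <- sapply(by_user, function(u) length(unique(u$Date)))
```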

Thanks,
Martin




______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
