Actually, I just learned something myself: you can do this on the dataset *without* the additional step in Excel. I changed the format string in strptime to match the date format in the file (d'oh!) and voilà:

> x
  Subject      Date     Time Value
1       1 7/23/2003 13:05:00    84
2       1 7/23/2003 13:10:00    87
3       1 7/23/2003 13:15:00    95
4       2 9/25/2004 14:34:00    95
5       2 9/25/2004 14:39:00    81
6       2 9/25/2004 14:44:00    93
7       3  3/2/2004 16:34:00    72
8       3  3/2/2004 16:39:00    67
9       3  3/2/2004 16:44:00    83
> dates = as.POSIXct(strptime(paste(x[,2], x[,3], sep=" "), format="%m/%d/%Y %H:%M:%S"))
> dates
[1] "2003-07-23 13:05:00 EDT" "2003-07-23 13:10:00 EDT" "2003-07-23 13:15:00 EDT"
[4] "2004-09-25 14:34:00 EDT" "2004-09-25 14:39:00 EDT" "2004-09-25 14:44:00 EDT"
[7] "2004-03-02 16:34:00 EST" "2004-03-02 16:39:00 EST" "2004-03-02 16:44:00 EST"
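
One caveat, offered as a sketch rather than a tested recipe for the real file: the sample in the original post shows two-digit years (7/23/03), while my test file has four-digit years. If the real export keeps two-digit years, the format would need %y instead of %Y, since %Y expects a four-digit year and would not give 2003 here. The data frame `raw` and the choice of tz = "UTC" below are my own assumptions for illustration; as.POSIXct also accepts the format directly, so the strptime wrapper is optional.

```r
# Hypothetical rows in the shape of the original post (two-digit years).
raw <- data.frame(
  Subject = c(1, 1, 2),
  Date    = c("7/23/03", "9/25/04", "3/02/04"),
  Time    = c("13:05:00", "14:34:00", "16:34:00"),
  Value   = c(84, 95, 72),
  stringsAsFactors = FALSE
)

# %y parses a two-digit year (00-68 map to 20xx, 69-99 to 19xx).
# tz is set explicitly so the result does not depend on the local
# time zone or on daylight saving transitions.
stamp <- as.POSIXct(paste(raw$Date, raw$Time),
                    format = "%m/%d/%y %H:%M:%S", tz = "UTC")

# Print with a four-digit year to confirm the parse.
print(format(stamp, "%Y-%m-%d %H:%M:%S"))
```

Any row that fails to match the format comes back as NA, so checking anyNA(stamp) after parsing is a cheap sanity test.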

> data = xts(x[,c(1,4)], order.by=dates)
> data
                    Subject Value
2003-07-23 13:05:00       1    84
2003-07-23 13:10:00       1    87
2003-07-23 13:15:00       1    95
2004-03-02 16:34:00       3    72
2004-03-02 16:39:00       3    67
2004-03-02 16:44:00       3    83
2004-09-25 14:34:00       2    95
2004-09-25 14:39:00       2    81
2004-09-25 14:44:00       2    93
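
For the "summary statistics for each trace, and then for each subject" part of the question, here is a minimal base-R sketch. It assumes a trace is a run of rows for one subject in which consecutive readings are no more than 5 minutes apart; the toy data frame `d`, the column names `stamp` and `trace`, and the use of mean as the statistic are all my own choices for illustration, not something from the real file.

```r
# Toy data in the shape of the original post (two-digit years).
d <- data.frame(
  Subject = c(1, 1, 1, 1, 1, 1, 2, 2, 2),
  Date    = c(rep("7/23/03", 3), rep("9/25/04", 3), rep("3/02/04", 3)),
  Time    = c("13:05:00", "13:10:00", "13:15:00",
              "14:34:00", "14:39:00", "14:44:00",
              "16:34:00", "16:39:00", "16:44:00"),
  Value   = c(84, 87, 95, 95, 81, 93, 72, 67, 83),
  stringsAsFactors = FALSE
)
d$stamp <- as.POSIXct(paste(d$Date, d$Time),
                      format = "%m/%d/%y %H:%M:%S", tz = "UTC")

# Sort so that each subject's readings are contiguous and in time order.
d <- d[order(d$Subject, d$stamp), ]

# Gap to the previous row in minutes; the first row has no predecessor.
gap <- c(Inf, diff(as.numeric(d$stamp)) / 60)

# A new trace starts when the subject changes or the gap exceeds 5 minutes.
new_subject <- c(TRUE, d$Subject[-1] != d$Subject[-nrow(d)])
d$trace <- cumsum(new_subject | gap > 5)

# Summary statistics per trace and per subject.
per_trace   <- aggregate(Value ~ Subject + trace, data = d, FUN = mean)
per_subject <- aggregate(Value ~ Subject, data = d, FUN = mean)

# One data frame per trace, for iterating (e.g. plotting each one).
traces <- split(d, d$trace)

print(per_trace)
print(per_subject)
```

Swapping FUN = mean for another function (or looping over `traces` and calling summary on each Value column) gives other statistics, and a loop over `traces` is also a natural place to plot each trace on its own.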



hth,
c

PS: my first message didn't make it to the list... apparently I had a bad header?
=============================
Cedrick W. Johnson
aolim) cedrickjcvgr
www.cedrickjohnson.com
New York - Chicago


On 3/11/2010 3:34 PM, Cedrick W. Johnson (CJ) wrote:
Hi Clay-

You may want to look at the xts package, in addition to 'strptime'
and 'as.POSIXct'.

When I get datasets in Excel, what I normally do is change the date
(column) format to YYYY-mm-dd.. But that's due to my own shortcomings
with date formatting in R.

Here's a quick example:

 > x = read.csv('TestData.csv')
 > x
  Subject       Date     Time Value
1       1 2003-07-23 13:05:00    84
2       1 2003-07-23 13:10:00    87
3       1 2003-07-23 13:15:00    95
4       2 2004-09-25 14:34:00    95
5       2 2004-09-25 14:39:00    81
6       2 2004-09-25 14:44:00    93
7       3 2004-03-02 16:34:00    72
8       3 2004-03-02 16:39:00    67
9       3 2004-03-02 16:44:00    83

 > dates = as.POSIXct(strptime(paste(x[,2], x[,3], sep=" "),
                               format="%Y-%m-%d %H:%M:%S"))
 > dates
[1] "2003-07-23 13:05:00 EDT" "2003-07-23 13:10:00 EDT" "2003-07-23 13:15:00 EDT"
[4] "2004-09-25 14:34:00 EDT" "2004-09-25 14:39:00 EDT" "2004-09-25 14:44:00 EDT"
[7] "2004-03-02 16:34:00 EST" "2004-03-02 16:39:00 EST" "2004-03-02 16:44:00 EST"

 > data = xts(x[,c(1,4)], order.by=dates)
 > data
                    Subject Value
2003-07-23 13:05:00       1    84
2003-07-23 13:10:00       1    87
2003-07-23 13:15:00       1    95
2004-03-02 16:34:00       3    72
2004-03-02 16:39:00       3    67
2004-03-02 16:44:00       3    83
2004-09-25 14:34:00       2    95
2004-09-25 14:39:00       2    81
2004-09-25 14:44:00       2    93


HTH

-cedrick

=============================
Cedrick Johnson
aolim) cedrickjcvgr
www.cedrickjohnson.com
New York - Chicago


On 3/11/2010 3:13 PM, Clay Heaton wrote:
Hi, I'm trying to learn R for a project I'm working on. I know several
programming languages, so I'm comfortable with the syntax. What I
can't figure out is how to import the file of time series data that I
have and parse it into individual series. The data was given to me in
Excel, but I can output it to tab-delimited or csv. I've been able to
pull in the entire table with read.table(), but I can't figure out how
to parse it into distinct groups.

It looks like this:

Subject Date Time Value
1 7/23/03 13:05:00 84
1 7/23/03 13:10:00 87
1 7/23/03 13:15:00 95
....
1 9/25/04 14:34:00 95
1 9/25/04 14:39:00 81
1 9/25/04 14:44:00 93
...
2 3/02/04 16:34:00 72
2 3/02/04 16:39:00 67
2 3/02/04 16:44:00 83
...
2 3/21/05 11:15:00 121
2 3/21/05 11:20:00 125
2 3/21/05 11:25:00 120
...

There are ~100,000 rows of data covering 86 subjects, each with one or
more traces: some subjects have multiple traces, some have a single
trace. Within each trace, the times are in uniform increments of 5
minutes. Some traces include up to 500 values and others only 40.

For now, what I'm looking to do is to be able to generate summary
statistics for each trace, and then for each subject. Hence, I need a
way to aggregate by value or subject, where the criteria for
aggregating traces are that the values were collected on the same day
and all are within 5 minutes of each other. I would like to be able to
iterate through the data to plot each trace independently.

Any suggestions to help me get started would be appreciated. I'm
looking to learn, so I'd appreciate pointers to good tutorials or code
examples of dealing with time series data.

Thanks!
Clay
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
