Re: [R] reshape

jim holtman Wed, 20 Jun 2012 09:53:34 -0700

Is this closer using the same data:

> dcast(x, year+month+day+hhmm+group+number ~ type, mean)
    year month day hhmm group number         Tair           VWC
1   2007     1   1 1100     B      1  0.285404913  1.4337114478
2   2007     1   1 1100     B      2 -0.899893189  1.1004122756
3   2007     1   1 1100     T      1 -0.054938988  0.0565570314
4   2007     1   1 1100     T     10          NaN -0.4645357924
5   2007     1   1 1100     T      2  0.235724005  0.1355795323
6   2007     1   1 1101     B      1  0.008141599  0.2010562273
7   2007     1   1 1101     B      2 -0.094439009  0.7713080532
8   2007     1   1 1101     T      1  1.111981883 -0.5022737361
9   2007     1   1 1101     T     10          NaN  0.3061178598
10  2007     1   1 1101     T      2  0.101879382 -0.1886613029
11  2007     1   2 1100     B      1 -2.189530590  0.0741563102
12  2007     1   2 1100     B      2  0.290204966  1.4341878509
13  2007     1   2 1100     T      1  0.788716089  1.9156394486
14  2007     1   2 1100     T     10          NaN  1.9814202666
15  2007     1   2 1100     T      2  0.255978008  0.0604626563
16  2007     1   2 1101     B      1 -0.479254067  0.7770438119
17  2007     1   2 1101     B      2  0.526466448 -0.1691557783
18  2007     1   2 1101     T      1  2.070198634  0.1129769772
19  2007     1   2 1101     T     10          NaN  1.5263077972
20  2007     1   2 1101     T      2 -0.883571883 -0.5357702486



On Wed, Jun 20, 2012 at 12:33 PM, nrm2010 <nrm2...@zoho.com> wrote:
>
> Thank you, but it doesn't quite.  It still needs a new column for each 
> measurement type,
> i.e., a VWC column that contains the VWC measurements for each subject within
> group at each time point and a Tair column that contains the
> Tair measurements for each subject within group at each time point.
>
> Toby
>
>
>
>
> ---- On Wed, 20 Jun 2012 09:25:08 -0700 jim holtman  wrote ----
>
>>will this do it for you:
>>
>>
>>> require(reshape2)
>>> toy <- data.frame(year = rep(2007:2008,each = 20), month = rep(1:5,each = 
>>> 4, length = 40),
>>+ day = rep(1:2,each = 2,length = 40), hhmm = rep(1100:1101,length =
>>40), plot = rep(1,40),
>>+ trt=rep(4,40), VWC.B1 = rnorm(40), VWC.B2 = rnorm(40), VWC.T1 = rnorm(40),
>>+ VWC.T2 = rnorm(40), VWC.T10 = rnorm(40), Tair.B1 = rnorm(40),Tair.B2
>>= rnorm(40),
>>+ Tair.T1 = rnorm(40), Tair.T2 = rnorm(40))
>>>
>>> x <- melt(toy, id = c('year', 'month', 'day', 'hhmm', 'plot', 'trt'))
>>>
>>> # parse the type, group and number
>>> x$type <- sub("\..*$", "", x$variable)
>>> x$group <- sub(".*\.(.).*", "\1", x$variable)
>>> x$number <- sub(".*\..", "", x$variable)
>>>
>>> head(x)
>> year month day hhmm plot trt variable value type group number
>>1 2007 1 1 1100 1 4 VWC.B1 0.35680152 VWC B 1
>>2 2007 1 1 1101 1 4 VWC.B1 -0.01787670 VWC B 1
>>3 2007 1 2 1100 1 4 VWC.B1 -0.07361414 VWC B 1
>>4 2007 1 2 1101 1 4 VWC.B1 0.90117558 VWC B 1
>>5 2007 2 1 1100 1 4 VWC.B1 0.64824677 VWC B 1
>>6 2007 2 1 1101 1 4 VWC.B1 0.26769079 VWC B 1
>>>
>>
>>
>>On Wed, Jun 20, 2012 at 12:08 PM, nrm2010  wrote:
>>> Hello, helpeRs,
>>>
>>> I am attempting to reshape (either base R or package reshape) multiple .csv
>>> spreadsheets from a very unfortunate wide format to long format.  Each 
>>> spreadsheet
>>> looks something like this, after being read in to R:
>>>
>>> toy <- data.frame(year = rep(2007:2008,each = 20), month = rep(1:5,each = 
>>> 4, length = 40),
>>> day = rep(1:2,each = 2,length = 40), hhmm = rep(1100:1101,length = 40), 
>>> plot = rep(1,40),
>>> trt=rep(4,40), VWC.B1 = rnorm(40), VWC.B2 = rnorm(40), VWC.T1 = rnorm(40),
>>> VWC.T2 = rnorm(40), VWC.T10 = rnorm(40), Tair.B1 = rnorm(40),Tair.B2 = 
>>> rnorm(40),
>>> Tair.T1 =  rnorm(40), Tair.T2 = rnorm(40))
>>>
>>> Taken together, the columns year, month, day, and hhmm define a timepoint
>>> at which at a measurement was taken.  Plot and treatment do not change 
>>> within
>>> any one spreadsheet.  Parsing the column names, VWC and Tair are 2 different
>>> types of measurements; in the full dataset, there are more than 2 types of
>>> measurements.  The B and T in the column names are grouping variables.  The
>>> numbers following the B or T are subjects within groups.  So, if a column 
>>> name
>>> is VWC.B1, it is the VWC measurement taken on subject 1 within group B.
>>>
>>> I am trying to reshape to long format. The long format should have one new 
>>> column
>>> for each measurement type (VWC and Tair in the toy example), a new column
>>> indicating the grouping variable (B or T), and a new column identifyng the
>>> subject (the number that follow the B or T). I've been unable to come up 
>>> with
>>> a solution that produces the group and subject within group columns as well
>>> as both types of measurements.
>>>
>>> Any assistance would be greatly appreciated.
>>>
>>> Thank you.
>>>
>>> Toby
>>>
>>> Toby Gass, Ph.D.
>>> Visiting Lecturer, Forestry and Wildland Resources
>>> Humboldt State University
>>> email: toby.gass  humboldt  edu
>>>
>>> ______________________________________________
>>> R-help@r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
>>> and provide commented, minimal, self-contained, reproducible code.
>>
>>
>>
>>--
>>Jim Holtman
>>Data Munger Guru
>>
>>What is the problem that you are trying to solve?
>>Tell me what you want to do, not how you want to do it.
>>
>



-- 
Jim Holtman
Data Munger Guru

What is the problem that you are trying to solve?
Tell me what you want to do, not how you want to do it.

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.

Re: [R] reshape

Reply via email to