Re: [R] Data.frame manipulation

Dennis Murphy Thu, 28 Jan 2010 09:59:41 -0800

Hi:


On Thu, Jan 28, 2010 at 8:40 AM, AC Del Re <de...@wisc.edu> wrote:

> Thank you, Dennis and Petr.
>
> One more question:  when aggregating to one es per id, how would I go about
> keeping the other variables in the data.frame (e.g., keeping the value for
> the first row of the other variables, such as mod2) e.g.:
>
> # Dennis provided this example (notice how mod2 is removed from the
> output):
>
> > with(x, aggregate(list(es = es), by = list(id = id, mod1 = mod1), mean))
>   id mod1   es
> 1  3    1 0.20
> 2  1    2 0.30
> 3  2    4 0.15
>
> # How can I get this output (taking the first row of the other variable in
> the data.frame):
>
> id  es   mod1  mod2
>
> 1  .30     2        wai
> 2  .15     4        other
> 3  .20     1         itas
>

Using ddply from the plyr package:

> library(plyr)
> ddply(x, .(id, mod1), summarize, es = mean(es), mod2 = head(mod2, 1))
  id mod1   es  mod2
1  1    2 0.30   wai
2  2    4 0.15 other
3  3    1 0.20  itas

mod2 = head(...)  selects the first instance of mod2 in each id/mod1
combination.
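
For anyone without plyr installed, here is a rough base R sketch of the same
idea (split by id/mod1, then take the mean of es and the first mod2 in each
piece). The data frame x below is an assumption: it is rebuilt from the table
printed in the original post, not from the posted code, and the names pieces
and res are only illustrative.

```r
# Assumed data: values copied from the table printed in the original post.
x <- data.frame(
  id   = c(1, 2, 2, 3, 3, 3),
  es   = c(0.3, 0.1, 0.2, 0.1, 0.2, 0.3),
  mod1 = c(2, 4, 4, 1, 1, 1),
  mod2 = c("wai", "other", "calpas", "itas", "wai", "wai"),
  stringsAsFactors = FALSE
)

# One piece per id/mod1 combination; drop = TRUE discards empty combinations.
pieces <- split(x, list(x$id, x$mod1), drop = TRUE)

# For each piece: mean of es, first value of mod2.
res <- do.call(rbind, lapply(pieces, function(d) {
  data.frame(id = d$id[1], mod1 = d$mod1[1],
             es = mean(d$es), mod2 = head(d$mod2, 1),
             stringsAsFactors = FALSE)
}))

# Sort by id and drop the row names generated by split().
res <- res[order(res$id), ]
rownames(res) <- NULL
res
```

This is more verbose than the ddply call but needs no extra packages.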

It appears from the help page that aggregate only allows one summary function
per call; if so, it couldn't compute the mean of es and the first value of
mod2 in a single call. You could, however, do this in the doBy package with a
custom summary function.
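
If one summary function per call is indeed the limit, a possible workaround
is to call aggregate once per summary and merge the results. This is only a
sketch: x is again assumed to hold the values from the table printed in the
original post, and means, firsts and out are illustrative names.

```r
# Assumed data: values copied from the table printed in the original post.
x <- data.frame(
  id   = c(1, 2, 2, 3, 3, 3),
  es   = c(0.3, 0.1, 0.2, 0.1, 0.2, 0.3),
  mod1 = c(2, 4, 4, 1, 1, 1),
  mod2 = c("wai", "other", "calpas", "itas", "wai", "wai"),
  stringsAsFactors = FALSE
)

# First call: mean effect size per id/mod1 combination.
means <- aggregate(x["es"], by = x[c("id", "mod1")], FUN = mean)

# Second call: first value of mod2 per id/mod1 combination.
firsts <- aggregate(x["mod2"], by = x[c("id", "mod1")],
                    FUN = function(v) v[1])

# Join the two summaries on the grouping columns (merge sorts by the keys).
out <- merge(means, firsts, by = c("id", "mod1"))
out
```

The cost is one pass over the data per summary, which hardly matters for a
data frame of this size.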

HTH,
Dennis

>
>
> Thank you,
>
> AC
>
>
> On Thu, Jan 28, 2010 at 1:29 AM, Petr PIKAL <petr.pi...@precheza.cz> wrote:
>
>> Hi
>>
>> r-help-boun...@r-project.org wrote on 28.01.2010 04:35:29:
>>
>> > > Hi All,
>> > >
>> > > I'm conducting a meta-analysis and have taken a data.frame with
>> > > multiple rows per study (for each effect size) and performed a
>> > > weighted average of effect size for each study. This results in a
>> > > reduced # of rows. I am particularly interested in simply reducing
>> > > the additional variables in the data.frame to the first row of the
>> > > corresponding id variable. For example:
>> > >
>> > > id<-c(1,2,2,3,3,3)
>> > > es<-c(.3,.1,.3,.1,.2,.3)
>> > > mod1<-c(2,4,4,1,1,1)
>> > > mod2<-c("wai","other","calpas","wai","itas","other")
>> > > data<-as.data.frame(cbind(id,es,mod1,mod2))
>>
>> Do not use cbind. Its output is a matrix, and in this case a character
>> matrix. The resulting data frame will consist of factors, as you can check with
>>
>>
>> str(data)
>>
>> data<-data.frame(id=id,es=es,mod1=mod1,mod2=mod2)
>>
>>
>> > >
>> > > data
>> > >
>> > >    id   es    mod1 mod2
>> > > 1  1   0.3    2     wai
>> > > 2  2   0.1    4     other
>> > > 3  2   0.2    4     calpas
>> > > 4  3   0.1    1     itas
>> > > 5  3   0.2    1     wai
>> > > 6  3   0.3    1     wai
>> > >
>> > > # I would like to reduce the entire data.frame like this:
>>
>> E.g. aggregate
>>
>> aggregate(data[, -(3:4)], data[,3:4], mean)
>>  mod1   mod2 id  es
>> 1    4 calpas  2 0.3
>> 2    1   itas  3 0.2
>> 3    1  other  3 0.3
>> 4    4  other  2 0.1
>> 5    1    wai  3 0.1
>> 6    2    wai  1 0.3
>>
>> Alternatives include doBy, tapply, or ddply from the plyr package, among others.
>>
>> Regards
>> Petr
>>
>> > >
>> > > id  es   mod1  mod2
>> > >
>> > > 1  .30     2        wai
>> > > 2  .15     4        other
>> > > 3  .20     1         itas
>> > >
>> > > # If possible, I would also like the option of this (collapsing on id
>> > > # and mod2):
>> > >
>> > > id  es   mod1  mod2
>> > > 1  .30      2        wai
>> > > 2   0.1     4       other
>> > > 2   0.2      4        calpas
>> > > 3   0.1     1         itas
>> > > 3   0.25    1         wai
>> > >
>> > > Any help is much appreciated!
>> > >
>> > > AC Del Re
>> > >
>> >
>> >    [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-help@r-project.org mailing list
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
>>
>>
>

