Re: [R] grouping

David Winsemius Tue, 03 Apr 2012 07:17:29 -0700

On Apr 3, 2012, at 10:11 AM, Val wrote:

> David W and all,
>
> Thank you very much for your help.
>
> Here is the final output that I want, in the form of a data frame. The
> data frame should contain x, group and group_mean in the following
> way
>
> x       group   group mean
> 46       1        42.3
> 125     2        89.6
> 36       1        42.3
> 193     3        235.25
> 209     3        235.25
> 78       2        89.6
> 66       2        89.6
> 242     3        235.25
> 297     3        235.25
> 45       1        42.3


If you want group means in a vector the same length as x, then instead
of using tapply as in the earlier solutions, you should use `ave`.
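
A minimal sketch of that, reusing the findInterval()/quantile() grouping
Michael posted further down. The names grp and res and the column name
group_mean are only illustrative choices, and the "+ 1" is there because
findInterval() numbers the lowest bin 0:

```r
x <- c(46, 125, 36, 193, 209, 78, 66, 242, 297, 45)

# Assign each value to one of three bins using the 33% and 66% quantiles
# as cut points; findInterval() counts from 0, so shift to 1..3.
grp <- findInterval(x, quantile(x, c(0.33, 0.66))) + 1

# ave() returns the group mean repeated for every element, in the
# original order of x, so it can go straight into the data frame.
res <- data.frame(x = x, group = grp, group_mean = ave(x, grp))
print(res)
```

tapply(x, grp, mean) would instead return one value per group (three
numbers here), which then has to be merged back onto x.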

-- 
DW


>
> Thanks a lot
>
> On Tue, Apr 3, 2012 at 9:51 AM, David Winsemius <[email protected]> wrote:
>
> On Apr 3, 2012, at 9:32 AM, R. Michael Weylandt wrote:
>
> Use cut2 as I suggested and David demonstrated.
>
> Agree that Hmisc::cut2 is extremely handy, and I also like the fact
> that the closed ends of intervals are on the left side (which is not
> the same behavior as cut()), which has the other effect of setting
> include.lowest = TRUE, which is not the default for cut() either (to
> my continued amazement).
>
> But let me add the method I use when doing it "by hand":
>
> cut(x, quantile(x, probs = seq(0, 1, length.out = ngrps + 1)), include.lowest = TRUE)
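
As a rough illustration of the "by hand" method for an arbitrary number of
groups, here is a sketch on the data from the original post. Only ngrps
needs changing for 10 or 15 groups; breaks and grp are illustrative names,
and labels = FALSE is used just to get integer group codes:

```r
x <- c(46, 125, 36, 193, 209, 78, 66, 242, 297, 45)
ngrps <- 3

# Equally spaced probabilities from 0 to 1 give ngrps + 1 break points.
breaks <- quantile(x, probs = seq(0, 1, length.out = ngrps + 1))

# include.lowest = TRUE keeps the minimum of x in the first interval;
# labels = FALSE returns integer codes instead of interval labels.
grp <- cut(x, breaks, include.lowest = TRUE, labels = FALSE)

table(grp)            # group sizes
tapply(x, grp, mean)  # one mean per group
```

With cut()'s default right = TRUE, a value that lands exactly on a break
goes into the lower group, so the group sizes need not match what a
left-closed rule such as cut2's gives; adding right = FALSE switches cut()
to left-closed intervals.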
>
> -- 
> David.
>
>
>
>
> Michael
>
> On Tue, Apr 3, 2012 at 9:31 AM, Val <[email protected]> wrote:
> Thank you all (David, Michael, Giovanni)  for your prompt response.
>
> First, there was a typo in the group mean: it should be 89.6, not 87.
>
> For a small data set and few groupings I can use prob=c(0, .333, .66, 1) to
> group into three groups in this case. However, if I want to extend the
> number of groupings to, say, 10 or 15, do I then have to work out the
> probabilities by hand for
>  split(x, cut(x, quantile(x, prob=c(0, .333, .66, 1))))
>
> Is there a short cut for that?
>
>
> Thanks
>
> On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt
> <[email protected]> wrote:
>
> Ignoring the fact your desired answers are wrong, I'd split the
> separating part and the group means parts into three steps:
>
> i) quantile() can help you get the split points,
> ii)  findInterval() can assign each y to a group
> iii) then ave() or tapply() will do group-wise means
>
> Something like:
>
> y <- c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a "c" here.
> ave(y, findInterval(y, quantile(y, c(0.33, 0.66))))
> tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean)
>
> You could also use cut2 from the Hmisc package to combine findInterval
> and quantile into a single step.
>
> Depending on your desired output.
>
> Hope that helps,
> Michael
>
> On Tue, Apr 3, 2012 at 8:47 AM, Val <[email protected]> wrote:
> Hi all,
>
> Assume that I have the following 10 data points.
>  x=c(  46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)
>
> sort x  and get the following
>  y= (36 , 45 , 46,  66, 78,  125,193, 209, 242, 297)
>
> I want to group the sorted data points (y) into groups with an equal
> number of observations per group. In this case there will be three
> groups. The first two groups will have three observations each and the
> third will have four observations.
>
> group 1  = 34, 45, 46
> group 2  = 66, 78, 125
> group 3  = 193, 209, 242,297
>
> Finally I want to calculate the group mean
>
> group 1  =  42
> group 2  =  87
> group 3  =  234
>
> Can anyone help me out?
>
> In SAS I used to do it using proc rank.
>
> thanks in advance
>
> Val
>
>       [[alternative HTML version deleted]]
>
>
> ______________________________________________
> [email protected] mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> West Hartford, CT
>
>

David Winsemius, MD
West Hartford, CT

