Re: [R] grouping

David Winsemius Tue, 03 Apr 2012 06:55:13 -0700


On Apr 3, 2012, at 9:32 AM, R. Michael Weylandt wrote:

Use cut2 as I suggested and David demonstrated.

Agreed that Hmisc::cut2 is extremely handy. I also like the fact that the closed ends of its intervals are on the left side (which is not the same behavior as cut()). That has the other effect of setting include.lowest = TRUE, which is not the default for cut() either (to my continued amazement).
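
As a rough illustration of the interval-closure point, here is a base-R sketch using made-up values and breaks (v and b are assumptions for this example, not data from the thread). It only exercises cut(); cut2 itself is not called.

```r
# Illustrative values and breaks (assumed for this example only)
v <- c(1, 2, 3)
b <- c(1, 2, 3)

# Default: intervals closed on the right, lowest break excluded,
# so the value 1 falls outside (1,2] and becomes NA
d1 <- cut(v, b)

# Still closed on the right, but the lowest break is now included
d2 <- cut(v, b, include.lowest = TRUE)   # levels "[1,2]" "(2,3]"

# Closed on the left; include.lowest then closes the top interval instead
d3 <- cut(v, b, right = FALSE, include.lowest = TRUE)  # levels "[1,2)" "[2,3]"
```

With right = FALSE, include.lowest = TRUE applies to the highest break, which is why 3 is kept in d3.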


But let me add the method I use when doing it "by hand":

cut(x, quantile(x, prob=seq(0, 1, length=ngrps+1)), include.lowest=TRUE)
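
A minimal sketch of that one-liner applied to the data from the original post, assuming ngrps = 3. The prob= and length= spellings above rely on R's partial argument matching; they are written out here as probs and length.out.

```r
x <- c(46, 125, 36, 193, 209, 78, 66, 242, 297, 45)  # data from the original post
ngrps <- 3                                            # assumed number of groups

# ngrps + 1 evenly spaced probabilities give ngrps quantile-based intervals
brks <- quantile(x, probs = seq(0, 1, length.out = ngrps + 1))
grp  <- cut(x, brks, include.lowest = TRUE)

table(grp)            # observations per group
tapply(x, grp, mean)  # group means
```

The group sizes need not come out as 3/3/4: with cut()'s default right-closed intervals, a value that coincides with a break point falls into the interval to its left.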

--
David.


Michael

On Tue, Apr 3, 2012 at 9:31 AM, Val <valkr...@gmail.com> wrote:

Thank you all (David, Michael, Giovanni)  for your prompt response.

First, there was a typo in the group mean: it should be 89.6, not 87.

For a small data set and few groupings I can use prob=c(0, .333, .66, 1) to
group into three groups in this case. However, if I want to extend the
number of groupings to, say, 10 or 15, do I have to work out the
probabilities by hand for
  split(x, cut(x, quantile(x, prob=c(0, .333, .66, 1))))

Is there a shortcut for that?


Thanks

On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt
<michael.weyla...@gmail.com> wrote:


Ignoring the fact your desired answers are wrong, I'd split the
separating part and the group means parts into three steps:

i) quantile() can help you get the split points,
ii)  findInterval() can assign each y to a group
iii) then ave() or tapply() will do group-wise means

Something like:

y <- c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a "c" here.

ave(y, findInterval(y, quantile(y, c(0.33, 0.66))))
tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean)
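
The same three steps can be written out one at a time, which makes the intermediate results easier to inspect. This is only a sketch; the names cuts and grp are introduced here for illustration.

```r
y <- c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297)

cuts <- quantile(y, c(0.33, 0.66))   # i) two interior split points
grp  <- findInterval(y, cuts)        # ii) 0, 1 or 2: how many split points are <= y

table(grp)                           # group sizes: 3, 3, 4
tapply(y, grp, mean)                 # iii) one mean per group
ave(y, grp)                          # same means, repeated for each observation
```

For this data the group means are about 42.3, 89.7 and 235.25.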

You could also use cut2 from the Hmisc package to combine findInterval
and quantile into a single step.

Depending on your desired output.
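
A hedged sketch of the cut2 route, assuming the Hmisc package is installed (the code skips the call otherwise). The g argument asks cut2 for that many quantile groups; no output is shown because the exact interval labels depend on how cut2 picks its cut points.

```r
y <- c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297)

grp <- NULL
if (requireNamespace("Hmisc", quietly = TRUE)) {
  grp <- Hmisc::cut2(y, g = 3)   # g = number of quantile groups
  print(tapply(y, grp, mean))    # group means
}
```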

Hope that helps,
Michael

On Tue, Apr 3, 2012 at 8:47 AM, Val <valkr...@gmail.com> wrote:

Hi all,

Assume that I have the following 10 data points.
 x=c(  46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)

sort x  and get the following
 y= (36 , 45 , 46,  66, 78,  125,193, 209, 242, 297)

I want to group the sorted data points (y) into groups with an equal number
of observations per group. In this case there will be three groups. The
first two groups will have three observations each and the third will have
four observations.

group 1  = 34, 45, 46
group 2  = 66, 78, 125
group 3  = 193, 209, 242,297

Finally I want to calculate the group mean

group 1  =  42
group 2  =  87
group 3  =  234

Can anyone help me out?

In SAS I used to do it using proc rank.

thanks in advance

Val



______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.



David Winsemius, MD
West Hartford, CT
