On Apr 3, 2012, at 9:32 AM, R. Michael Weylandt wrote:
Use cut2 as I suggested and David demonstrated.
I agree that Hmisc::cut2 is extremely handy. I also like the fact
that its intervals are closed on the left (which is not the default
behavior of cut()), which has the side effect of including the lowest
value, as if include.lowest = TRUE had been set. That is not the
default for cut() either (to my continued amazement).
But let me add the method I use when doing it "by hand":
cut(x, quantile(x, probs=seq(0, 1, length.out=ngrps+1)), include.lowest=TRUE)
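
Spelled out on the data from the original post, the "by hand" approach might look like the sketch below. It assumes ngrps holds the number of groups wanted, and the helper names brks and grp are only for illustration.

```r
x <- c(46, 125, 36, 193, 209, 78, 66, 242, 297, 45)  # data from the original post
ngrps <- 3                                           # number of groups wanted

# ngrps + 1 equally spaced probabilities give the group boundaries
brks <- quantile(x, probs = seq(0, 1, length.out = ngrps + 1))

# include.lowest = TRUE keeps min(x) inside the first interval
grp <- cut(x, breaks = brks, include.lowest = TRUE)

table(grp)            # observations per group
tapply(x, grp, mean)  # group-wise means
```

Changing ngrps to 10 or 15 should be the only edit needed. Two caveats: cut() stops with an error if ties make two break points equal, and a value sitting exactly on a break goes to the interval on its left here, so group sizes can differ from what cut2 or findInterval() would give.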
--
David.
Michael
On Tue, Apr 3, 2012 at 9:31 AM, Val <valkr...@gmail.com> wrote:
Thank you all (David, Michael, Giovanni) for your prompt response.
First, there was a typo in the group mean: it was 89.6, not 87.
For a small data set and a few groupings I can use
prob=c(0, .333, .66, 1) to group into three groups in this case.
However, if I want to extend the number of groupings to, say, 10 or 15,
do I have to work out the probabilities by hand, as in
split(x, cut(x, quantile(x, prob=c(0, .333, .66, 1))))
Is there a shortcut for that?
Thanks
On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt
<michael.weyla...@gmail.com> wrote:
Ignoring the fact your desired answers are wrong, I'd split the
separating part and the group means parts into three steps:
i) quantile() can help you get the split points,
ii) findInterval() can assign each y to a group
iii) then ave() or tapply() will do group-wise means
Something like:
y <- c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a "c" here.
ave(y, findInterval(y, quantile(y, c(0.33, 0.66))))
tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean)
You could also use cut2 from the Hmisc package to combine findInterval
and quantile into a single step.
Whether ave() or tapply() fits better depends on your desired output.
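
For more than three groups, the three steps above could be wrapped up roughly as follows. This is only a sketch: group_means() is a hypothetical helper name, not something from base R or Hmisc, and it uses exact i/ngrps probabilities rather than the rounded 0.33 and 0.66.

```r
# group_means() is a hypothetical helper, not part of base R or Hmisc
group_means <- function(y, ngrps) {
  # interior cut points only: ngrps - 1 probabilities strictly between 0 and 1
  probs <- seq_len(ngrps - 1) / ngrps
  # findInterval() labels groups from 0, so shift the labels to start at 1
  grp <- findInterval(y, quantile(y, probs)) + 1
  tapply(y, grp, mean)  # one mean per group
}

y <- c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297)
group_means(y, 3)
```

Swapping tapply() for ave() inside the helper would instead return a vector as long as y, with each observation replaced by its group mean.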
Hope that helps,
Michael
On Tue, Apr 3, 2012 at 8:47 AM, Val <valkr...@gmail.com> wrote:
Hi all,
Assume that I have the following 10 data points.
x=c( 46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)
sort x and get the following
y= (36 , 45 , 46, 66, 78, 125,193, 209, 242, 297)
I want to group the sorted data points (y) into groups with an equal
number of observations per group. In this case there will be three
groups. The first two groups will have three observations and the
third will have four observations.
group 1 = 34, 45, 46
group 2 = 66, 78, 125
group 3 = 193, 209, 242,297
Finally I want to calculate the group mean
group 1 = 42
group 2 = 87
group 3 = 234
Can anyone help me out?
In SAS I used to do it using proc rank.
thanks in advance
Val
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide
http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
David Winsemius, MD
West Hartford, CT