Hi All,

On the same data points

x=c(46, 125 , 36 ,193, 209, 78, 66, 242 , 297,45 )

I want to have the following output as a data frame:

x      group   group mean
46     1       42.3
125    2       89.6
36     1       42.3
193    3       235.25
209    3       235.25
78     2       89.6
66     2       89.6
242    3       235.25
297    3       235.25
45     1       42.3

I tried the following code:

dat <- data.frame(xc=split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1))))
gxc <- with(dat, tapply(xc, group, mean))
dat$gxc <- gxce[as.character(dat$group)]
txc=dat$gxc

It did not work for me.

On Tue, Apr 3, 2012 at 10:15 AM, David Winsemius <dwinsem...@comcast.net> wrote:

>
> On Apr 3, 2012, at 10:11 AM, Val wrote:
>
> David W and all,
>
> Thank you very much for your help.
>
> Here is the final output that I want in the form of a data frame. The data
> frame should contain x, group and group_mean in the following way:
>
> x      group   group mean
> 46     1       42.3
> 125    2       89.6
> 36     1       42.3
> 193    3       235.25
> 209    3       235.25
> 78     2       89.6
> 66     2       89.6
> 242    3       235.25
> 297    3       235.25
> 45     1       42.3
>
> If you want group means in a vector the same length as x, then instead of
> using tapply as done in earlier solutions you should use `ave`.
>
> --
> DW
>
> Thanks a lot
>
> On Tue, Apr 3, 2012 at 9:51 AM, David Winsemius <dwinsem...@comcast.net> wrote:
>
>>
>> On Apr 3, 2012, at 9:32 AM, R. Michael Weylandt wrote:
>>
>>> Use cut2 as I suggested and David demonstrated.
>>>
>>
>> Agree that Hmisc::cut2 is extremely handy, and I also like the fact that
>> the closed ends of intervals are on the left side (which is not the same
>> behavior as cut()), which has the other effect of setting include.lowest =
>> TRUE, which is not the default for cut() either (to my continued amazement).
>>
>> But let me add the method I use when doing it "by hand":
>>
>> cut(x, quantile(x, prob=seq(0, 1, length=ngrps+1)), include.lowest=TRUE)
>>
>> --
>> David.
>>
>>> Michael
>>>
>>> On Tue, Apr 3, 2012 at 9:31 AM, Val <valkr...@gmail.com> wrote:
>>>
>>>> Thank you all (David, Michael, Giovanni) for your prompt response.
>>>>
>>>> First, there was a typo in the group mean: it was 89.6, not 87.
>>>>
>>>> For a small data set and few groupings I can use prob=c(0, .333, .66, 1)
>>>> to group into three groups, as in this case. However, if I want to extend
>>>> the number of groupings to, say, 10 or 15, do I have to work out the
>>>> probabilities by hand for
>>>> split(x, cut(x, quantile(x, prob=c(0, .333, .66 ,1))))
>>>>
>>>> Is there a shortcut for that?
>>>>
>>>> Thanks
>>>>
>>>> On Tue, Apr 3, 2012 at 9:13 AM, R. Michael Weylandt
>>>> <michael.weyla...@gmail.com> wrote:
>>>>
>>>>>
>>>>> Ignoring the fact your desired answers are wrong, I'd split the
>>>>> separating part and the group means parts into three steps:
>>>>>
>>>>> i) quantile() can help you get the split points,
>>>>> ii) findInterval() can assign each y to a group
>>>>> iii) then ave() or tapply() will do group-wise means
>>>>>
>>>>> Something like:
>>>>>
>>>>> y <- c(36, 45, 46, 66, 78, 125, 193, 209, 242, 297) # You need a "c" here.
>>>>> ave(y, findInterval(y, quantile(y, c(0.33, 0.66))))
>>>>> tapply(y, findInterval(y, quantile(y, c(0.33, 0.66))), mean)
>>>>>
>>>>> You could also use cut2 from the Hmisc package to combine findInterval
>>>>> and quantile into a single step.
>>>>>
>>>>> Depending on your desired output.
>>>>>
>>>>> Hope that helps,
>>>>> Michael
>>>>>
>>>>> On Tue, Apr 3, 2012 at 8:47 AM, Val <valkr...@gmail.com> wrote:
>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> Assume that I have the following 10 data points.
>>>>>> x=c( 46, 125 , 36 ,193, 209, 78, 66, 242 , 297 , 45)
>>>>>>
>>>>>> sort x and get the following
>>>>>> y= (36 , 45 , 46, 66, 78, 125,193, 209, 242, 297)
>>>>>>
>>>>>> I want to group the sorted data points (y) into an equal number of
>>>>>> observations per group. In this case there will be three groups.
>>>>>> The first two groups will have three observations and the third will
>>>>>> have four observations
>>>>>>
>>>>>> group 1 = 34, 45, 46
>>>>>> group 2 = 66, 78, 125
>>>>>> group 3 = 193, 209, 242,297
>>>>>>
>>>>>> Finally I want to calculate the group mean
>>>>>>
>>>>>> group 1 = 42
>>>>>> group 2 = 87
>>>>>> group 3 = 234
>>>>>>
>>>>>> Can anyone help me out?
>>>>>>
>>>>>> In SAS I used to do it using proc rank.
>>>>>>
>>>>>> thanks in advance
>>>>>>
>>>>>> Val
>>>>>>
>>
>> David Winsemius, MD
>> West Hartford, CT
>>
>
> David Winsemius, MD
> West Hartford, CT

[[alternative HTML version deleted]]

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
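
Putting the suggestions from this thread together, here is a minimal sketch of one way to build the requested data frame with quantile(), cut() and ave(). It assumes left-closed intervals (the tie handling David describes for Hmisc::cut2) are what is wanted; the names breaks, group and group_mean are only illustrative, and ngrps is borrowed from David's "by hand" line.

```r
x <- c(46, 125, 36, 193, 209, 78, 66, 242, 297, 45)
ngrps <- 3  # number of groups; could be 10 or 15 for larger data

# Break points at evenly spaced quantiles (0, 1/3, 2/3, 1 when ngrps = 3)
breaks <- quantile(x, probs = seq(0, 1, length.out = ngrps + 1))

# right = FALSE closes each interval on the left (an assumption about the
# desired tie handling); include.lowest = TRUE then keeps max(x) in the
# last interval; labels = FALSE returns integer group codes
group <- cut(x, breaks, include.lowest = TRUE, right = FALSE, labels = FALSE)

# ave() returns the group mean for every element, so it has the same
# length as x and can go straight into the data frame
dat <- data.frame(x = x, group = group, group_mean = ave(x, group))
dat
```

For these ten values the inner break points fall on 66 and 193, so the left-closed intervals give groups of 3, 3 and 4 observations. With heavily tied data the quantiles can coincide, in which case cut() will complain that the breaks are not unique, so this should be treated as a starting point rather than a general solution.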