Hmm, "smooth the chart" makes me think you are trying to find the trends:

require(ggplot2)
ggplot(mtcars, aes(mpg, hp)) + geom_point() + stat_smooth()

Try it out and see what you think. It adds a locally smoothed line that
does something like trace the means (that is a big oversimplification,
but it is the gist of it).

Cheers,

Josh

On Fri, Mar 9, 2012 at 7:00 PM, Michael <comtech....@gmail.com> wrote:
> The origin of this problem was that a plain scatter plot with too many
> points with high dispersion generated too many points flying all over
> the place.
>
> We are trying to smooth the charts a bit...
>
> Any good recommendations?
>
> Thanks a lot!
>
> On Fri, Mar 9, 2012 at 8:59 PM, Michael <comtech....@gmail.com> wrote:
>
>> Sorry for the confusion Michael.
>>
>> I myself am trying to figure out what my boss is requesting:
>>
>> I am certain that I need to "plot the quantiles of each bin." ...
>>
>> But how are the quantiles plotted? Shall I specify 50% quantile, etc?
>>
>> Being a diligent guy I am trying hard to do some homework and figure it
>> out myself...
>>
>> I thought there is a standard statistical procedure that everybody knows...
>>
>> Any more thoughts?
>>
>> Thanks a lot!
>>
>>
>> On Fri, Mar 9, 2012 at 8:51 PM, R. Michael Weylandt
>> <michael.weyla...@gmail.com> wrote:
>>
>>> On Fri, Mar 9, 2012 at 9:28 PM, Michael <comtech....@gmail.com> wrote:
>>> > Thanks a lot Mike!
>>> >
>>>
>>> Michael if you don't mind. (Though admittedly it leads to some degree
>>> of confusion in a conversation like this)
>>>
>>> > Could you please explain your code a bit?
>>>
>>> Which part?
>>>
>>> >
>>> > My imagination is that for each bin, I am plotting a line which is the
>>> > quantile of the y-values in that bin?
>>>
>>> Oh, so you want a qqnorm()-esque line? How is that like a scatterplot?
>>>
>>> ....yes, that's something else entirely (and not clear from your first
>>> post -- to my ear the "quantile" is a statistic tied to the [e]cdf)
>>> This is actually much easier in ggplot (and certainly doable in base
>>> as well)
>>>
>>> Try this,
>>>
>>> DAT <- data.frame(x = runif(1000, 0, 20), y = rnorm(1000)) # Not so volatile this time
>>> DAT$xbin <- with(DAT, cut(x, seq(0, 20, 5)))
>>>
>>> library(ggplot2)
>>> p <- ggplot(DAT) + facet_wrap( ~ xbin) + stat_qq(aes(sample = y))
>>>
>>> print(p)
>>>
>>> If this isn't what you want, please spend some time to show an example
>>> of the sort of graph you desire (it can be a bit of code or a link to
>>> a picture or even a hand sketch hosted somewhere online)
>>>
>>> Out on a limb, I think you might really be thinking of something more
>>> like this:
>>>
>>> p <- ggplot(DAT) + facet_wrap( ~ xbin) + geom_step(aes(x = seq_along(y), y = sort(y)))
>>>
>>> and see this for more: http://had.co.nz/ggplot2/geom_step.html
>>>
>>> Michael Weylandt
>>>
>>> >
>>> > I ran your program but couldn't figure out the meaning of the dots in your
>>> > plot?
>>> >
>>> > Thanks again!
>>> >
>>> > On Fri, Mar 9, 2012 at 7:07 PM, R. Michael Weylandt
>>> > <michael.weyla...@gmail.com> wrote:
>>> >>
>>> >> That doesn't really seem to make sense to me as a graphical
>>> >> representation (transforming adjacent y values differently), but if
>>> >> you really want to do so, here's what I'd do if I understand your goal
>>> >> (the preprocessing is independent of the graphics engine):
>>> >>
>>> >> DAT <- data.frame(x = runif(1000, 0, 20), y = rcauchy(1000)^2) # Nice and volatile!
>>> >>
>>> >> # split y based on some x binning and assign empirical quantiles of each group
>>> >>
>>> >> DAT$yquant <- with(DAT, ave(y, cut(x, seq(0, 20, 5)), FUN = function(x) ecdf(x)(x)))
>>> >>
>>> >> # BASE
>>> >> plot(yquant ~ x, data = DAT)
>>> >>
>>> >> # ggplot2
>>> >> library(ggplot2)
>>> >>
>>> >> p <- ggplot(DAT, aes(x = x, y = yquant)) + geom_point()
>>> >> print(p)
>>> >>
>>> >> Michael Weylandt
>>> >>
>>> >> PS -- I see Josh Wiley just responded pointing out your requirements
>>> >> #1 and #2 are incompatible: I've used 1 here.
>>> >>
>>> >> On Fri, Mar 9, 2012 at 7:37 PM, Michael <comtech....@gmail.com> wrote:
>>> >> > Hi all,
>>> >> >
>>> >> > I am trying hard to do the following and have already spent a few hours
>>> >> > in vain:
>>> >> >
>>> >> > I wanted to do the scatter plot.
>>> >> >
>>> >> > But given the high dispersion on those dots, I would like to bin the
>>> >> > x-axis and then for each bin of the x-axis, plot the quantiles of the
>>> >> > y-values of the data points in each bin:
>>> >> >
>>> >> > 1. Uniform bin size on the x-axis;
>>> >> > 2. Equal number of observations in each bin;
>>> >> >
>>> >> > How to do that in R? I guess for the sake of prettiness, I'd better do
>>> >> > it in ggplot2?
>>> >> >
>>> >> > Thank you!
>>> >> >
>>> >> > [[alternative HTML version deleted]]
>>> >> >
>>> >> > ______________________________________________
>>> >> > R-help@r-project.org mailing list
>>> >> > https://stat.ethz.ch/mailman/listinfo/r-help
>>> >> > PLEASE do read the posting guide
>>> >> > http://www.R-project.org/posting-guide.html
>>> >> > and provide commented, minimal, self-contained, reproducible code.

--
Joshua Wiley
Ph.D. Student, Health Psychology
Programmer Analyst II, Statistical Consulting Group
University of California, Los Angeles
https://joshuawiley.com/
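
The original question in this thread (quantiles of the y-values within each x bin, drawn over the scatter) can be read in more than one way. What follows is only a minimal sketch of one reading, not anything posted by the participants. It assumes simulated data like Michael Weylandt's, uniform-width bins (requirement 1), and the 25%, 50%, and 75% quantiles. The seed, the probabilities, and the names `probs`, `q`, and `mids` are illustrative choices.

```r
set.seed(1)  # arbitrary seed, only so the simulated data is reproducible
DAT <- data.frame(x = runif(1000, 0, 20), y = rnorm(1000))

# Requirement 1: uniform-width bins on x (breaks every 5 units)
DAT$xbin <- cut(DAT$x, breaks = seq(0, 20, 5), include.lowest = TRUE)

# Quantiles of y within each bin; these probabilities are an assumption
probs <- c(0.25, 0.5, 0.75)
q <- t(sapply(split(DAT$y, DAT$xbin), quantile, probs = probs))
# q has one row per bin and one column per quantile

# Grey scatter in the background, one line per quantile across bin midpoints
mids <- seq(2.5, 17.5, 5)
plot(y ~ x, data = DAT, col = "grey80", pch = 16, cex = 0.5)
matlines(mids, q, lty = 1, lwd = 2, col = c("blue", "red", "blue"))
```

For equal-count bins (requirement 2) instead, the breaks could be taken from quantile(DAT$x, seq(0, 1, 0.25)), in which case mids would need to be recomputed from those breaks. As noted in the thread, the two requirements cannot generally both hold at once.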