Re: [R] Boxplot

David Winsemius Sat, 26 Nov 2011 21:27:07 -0800


On Nov 27, 2011, at 12:15 AM, Jeffrey Joh wrote:

I'm trying to do the second case among Jim's suggestions. I used Bert's suggestion and it works great.
I would also like to ask if anyone is familiar with a package for making box-plots. I would like to bin my datapoints at defined X intervals and display a boxplot for each bin on the same chart.

Combining `cut` (to define the intervals) and `boxplot` should be fairly straightforward.
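
A minimal sketch of that combination, using made-up data (the vectors `x` and `y` and the interval width of 10 are assumptions for illustration, not taken from the thread):

```r
# Illustrative data only; 'x' and 'y' are hypothetical names.
set.seed(1)
x <- runif(10000, min = 0, max = 100)
y <- 2 * x + rnorm(10000, sd = 20)

# cut() turns x into a factor with one level per interval.
bins <- cut(x, breaks = seq(0, 100, by = 10), include.lowest = TRUE)

# The formula interface then draws one box per interval on the same chart.
boxplot(y ~ bins, xlab = "x interval", ylab = "y", las = 2)
```

Any vector of increasing cutpoints can be passed as `breaks`, so the intervals do not have to be equally wide.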

In Stata, there is a tool for making these, and it varies the width of the boxplot based on the number of points in each plot.

We have a tool for that, too. Study `quantile` a bit, to automatically pick cutpoints that will divide into approximately equal groups.

(I use the `cut2` function in the Hmisc package, because it is integrated with `rms` that I use all the time, and because its defaults for cut()-ting are more to my liking. It also has a "g=" parameter that automates the cut( ..., quantile(...)) processing.)
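
A hedged sketch of both ideas in base R, again on made-up data (`x`, `y`, and the choice of ten groups are assumptions). One way to get count-dependent widths is the `varwidth` argument of `boxplot`, which makes box widths proportional to the square roots of the group sizes; `quantile` supplies cutpoints for roughly equal-sized groups.

```r
# Illustrative, skewed data so that fixed-width bins hold unequal counts.
set.seed(2)
x <- rexp(10000, rate = 1 / 20)
y <- sqrt(x) + rnorm(10000)

# (a) Fixed-width bins; box width reflects how many points fall in each bin.
fixed <- cut(x, breaks = pretty(x, n = 10), include.lowest = TRUE)
boxplot(y ~ fixed, varwidth = TRUE, las = 2, xlab = "x interval", ylab = "y")

# (b) Decile cutpoints from quantile(), giving about 1000 points per bin.
qbreaks <- quantile(x, probs = seq(0, 1, by = 0.1))
qbins <- cut(x, breaks = qbreaks, include.lowest = TRUE)
boxplot(y ~ qbins, las = 2, xlab = "x decile", ylab = "y")

# With Hmisc installed, cut2(x, g = 10) is the one-call form of step (b).
```

With quantile-based bins the boxes come out nearly the same width under `varwidth = TRUE`, so that argument is mainly useful with the fixed-interval binning in (a).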

I am hoping there is a similar tool for R.

Thank you,
Jeffrey

----------------------------------------
Date: Tue, 22 Nov 2011 18:51:05 +1100
From: j...@bitwrit.com.au
To: johjeff...@hotmail.com
CC: r-help@r-project.org
Subject: Re: [R] Binned line plot

On 11/22/2011 04:29 PM, Jeffrey Joh wrote:
I have a scatter plot with 10000 points. I would like to add a line that bins every 50 points and connects the average of each bin. I'm looking for something similar to line type "m" in Stata.
With this dataset of 10000 points, I would also like to bin the data and make boxplots at certain intervals, so that I have a set of boxplots to represent each bin. I would also like the width of each box to be proportional to the number of points in each bin.
How can I make these plots? Is there a simple package to use?
Hi Jeffrey,
There are three possibilities that come to mind:

1) You want to bin the points based on their order in the data frame.
2) You want to bin the points based on the x or y values of the coordinates.
3) You want to bin the points based on the x _and_ y values of the
coordinates.
Number 1 is trivial and has already been answered (assume a two-column
data frame of coordinates named "xypoints").

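# Case 1: loop over the 200 consecutive bins of 50 rows each
# and collect the mean x and mean y of every bin.
meanx <- rep(0, 200)
meany <- rep(0, 200)
for (index in 1:200) {
  start <- 1 + 50 * (index - 1)   # first row of this bin
  meanx[index] <- mean(xypoints[start:(start + 49), "x"])
  meany[index] <- mean(xypoints[start:(start + 49), "y"])
}
# Connect the bin averages with a line.
plot(meanx, meany, type = "l")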

Number 2 requires that you sort the pairs based on the value of the one
you want, then apply the same process as 1 to the sorted pairs.

Number 3 is somewhat more difficult.
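
To make Number 2 concrete before moving on, here is a minimal sketch that sorts on x and then averages consecutive runs of 50 rows. The contents of `xypoints` are made up here, and `tapply` stands in for the explicit loop used for Number 1.

```r
# Illustrative two-column data frame; the values are invented.
set.seed(3)
xypoints <- data.frame(x = runif(10000), y = rnorm(10000))

# Sort the pairs by x, then label consecutive runs of 50 rows as bins 1..200.
sorted <- xypoints[order(xypoints$x), ]
group <- rep(seq_len(nrow(sorted) / 50), each = 50)

# Mean x and mean y within each bin.
meanx <- as.numeric(tapply(sorted$x, group, mean))
meany <- as.numeric(tapply(sorted$y, group, mean))

# Scatter plot with the binned-average line drawn over it.
plot(xypoints$x, xypoints$y, pch = ".", col = "grey", xlab = "x", ylab = "y")
lines(meanx, meany, col = "red")
```

Sorting on `xypoints$y` instead gives the same kind of bins along the other axis.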

I don't do this much, and some of the people who do map analysis will
probably come up with a much better method.

Find the most extreme point.
Find the 49 points closest to that point to constitute group 1.
Remove those points from the data frame.
Go back to the first step if there are any points left.

You will end up with 200 groups of points that are spatially grouped.
Get the centroids and plot as above.
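
The steps above can be sketched as follows. This is only one reading of them: "most extreme" is taken here to mean farthest from the centroid of the points still remaining, which is an assumption, and the data are made up.

```r
# Illustrative two-column data frame; the values are invented.
set.seed(4)
xypoints <- data.frame(x = rnorm(10000), y = rnorm(10000))

remaining <- xypoints
centroids <- matrix(NA_real_, nrow = 200, ncol = 2,
                    dimnames = list(NULL, c("x", "y")))

for (g in 1:200) {
  # Assumption: the "most extreme" point is the one farthest from the
  # centroid of the points that are still left.
  d_centre <- (remaining$x - mean(remaining$x))^2 +
              (remaining$y - mean(remaining$y))^2
  seed <- which.max(d_centre)

  # Squared distance from every remaining point to that seed point.
  d_seed <- (remaining$x - remaining$x[seed])^2 +
            (remaining$y - remaining$y[seed])^2

  # The seed itself (distance 0) plus its 49 nearest neighbours.
  members <- order(d_seed)[1:50]

  centroids[g, ] <- colMeans(remaining[members, ])
  remaining <- remaining[-members, ]   # remove the group and repeat
}

# Show the 200 group centroids on top of the raw points.
plot(xypoints$x, xypoints$y, pch = ".", col = "grey", xlab = "x", ylab = "y")
points(centroids, col = "red", pch = 19, cex = 0.6)
```

Because the grouping is greedy, the last few groups collect whatever points are left over and may be less compact than the early ones.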

Another wild guess from

Jim
                        


David Winsemius, MD
West Hartford, CT

______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help
PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
and provide commented, minimal, self-contained, reproducible code.
