On 04/11/2011 2:04 PM, Jesse Brown wrote:
Hello:
I'm currently dealing with an issue that I'm not sure how best to
approach. I have a very large (10G+) dataset that I'm trying to create
a histogram for. I don't seem to be able to use hist directly, as I
cannot create an R vector of size greater than 2.2G. I considered
condensing the data before loading it into R and just plotting
the frequencies as a barplot; unfortunately, barplot does not support
plotting the values according to a set of x-axis positions.
What I have is something similar to:
ys <- c(12,3,7,22,10)
xs <- c(1,30,35,39,60)
and I'd like the bars (ys) to appear at the positions described by xs. I
can get this to work on smaller sets by filling in zeros for the
missing ys over the entire range of xs, but in my case this would again
create a vector too large for R.
Is there another way to use the two vectors to create a simulated
frequency histogram? Is there a way to create a histogram object (as
returned by hist) from the condensed data so that plot would handle it
correctly?
Follow your own last suggestion. Take a small subset of your data, and
calculate
x <- hist(data, plot=FALSE)
str(x) will show you the structure of the object in x. Modify the
entries (breaks, counts, density, mids) to reflect your full dataset,
and then
plot(x)
will show it.
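
A minimal sketch of that approach, using the condensed xs/ys from the
question. The bin edges (0 to 60 in steps of 10) and the xname label are
assumptions chosen for illustration, and the object is built by hand
rather than by editing the result of hist() on a subset; the components
are the same ones str(x) shows.

```r
# Condensed data from the question: positions (xs) and their counts (ys)
xs <- c(1, 30, 35, 39, 60)
ys <- c(12, 3, 7, 22, 10)

# Bin edges: an assumption for this example; bins are (a, b],
# with the lowest edge included in the first bin
breaks <- seq(0, 60, by = 10)

# Assign each position to a bin, then sum the counts in each bin.
# split() keeps empty bins, and sum() of an empty vector is 0.
bin <- cut(xs, breaks = breaks, include.lowest = TRUE)
counts <- unname(vapply(split(ys, bin), sum, numeric(1)))

# Assemble a list with the components plot() expects from hist()
h <- list(
  breaks   = breaks,
  counts   = counts,
  density  = counts / (sum(counts) * diff(breaks)),
  mids     = breaks[-1] - diff(breaks) / 2,
  xname    = "condensed data",   # label only; hypothetical name
  equidist = TRUE                # all bins have the same width here
)
class(h) <- "histogram"

plot(h)
```

For the full dataset, the same counts vector could presumably be
accumulated one chunk at a time against a fixed set of breaks, so the
raw data never has to sit in a single R vector.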
Duncan Murdoch
______________________________________________
R-help@r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/r-help