On 04/11/2011 2:04 PM, Jesse Brown wrote:
Hello:

I'm dealing with an issue that I'm not sure how best to
approach. I've got a very large (10G+) dataset that I'm trying to create
a histogram for. I don't seem to be able to use hist directly, as I
cannot create an R vector of size greater than 2.2G. I considered
condensing the data before loading it into R and just plotting
the frequencies as a barplot; unfortunately, barplot does not support
plotting the values according to a set of x-axis positions.

What I have is something similar to:

ys <- c(12, 3, 7, 22, 10)
xs <- c(1, 30, 35, 39, 60)

and I'd like the bars (ys) to appear at the positions described by xs. I
can get this to work on smaller sets by filling in zeros for the
missing ys over the entire range of xs, but in my case this would again
create a vector too large for R.

Is there another way to use the two vectors to create a simulated
frequency histogram? Is there a way to create a histogram object (as
returned by hist) from the condensed data so that plot would handle it
correctly?

Follow your own last suggestion. Take a small subset of your data, and calculate

x <- hist(data, plot=FALSE)

str(x) will show you the structure of the object in x: a list of class "histogram" with components such as breaks, counts, density and mids. Modify the entries to reflect your full dataset, and then

plot(x)

will show it.
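
As a rough sketch of that idea, the object can also be assembled by hand from the condensed data in the question. The bin edges in breaks, the label in xname, and the choice of six equal-width bins are assumptions made for illustration only; the component names are the ones str(x) reports for an object returned by hist.

```r
# Condensed data from the question: the value xs[i] occurs ys[i] times.
xs <- c(1, 30, 35, 39, 60)
ys <- c(12, 3, 7, 22, 10)

# Assumed bin edges (illustrative); they must cover the full range of xs.
breaks <- seq(0, 60, by = 10)
nbins <- length(breaks) - 1

# Assign each distinct value to a bin, using right-closed intervals (a, b]
# with the lowest edge included, which matches hist()'s defaults.
bin <- cut(xs, breaks = breaks, include.lowest = TRUE, right = TRUE,
           labels = FALSE)

# Sum the frequencies that fall into each bin; empty bins stay at zero.
counts <- numeric(nbins)
agg <- tapply(ys, bin, sum)
counts[as.integer(names(agg))] <- as.vector(agg)

# Build a list with the same components that hist() returns.
h <- list(
  breaks   = breaks,
  counts   = counts,
  density  = counts / (sum(counts) * diff(breaks)),
  mids     = (breaks[-1] + breaks[-length(breaks)]) / 2,
  xname    = "condensed data",   # hypothetical label used in the plot title
  equidist = TRUE                # all bins here have the same width
)
class(h) <- "histogram"

plot(h)
```

For a dataset too large to load at once, the counts vector could presumably be accumulated chunk by chunk instead, for example by adding up hist(chunk, breaks = breaks, plot = FALSE)$counts over pieces of the file, as long as the same breaks are used for every chunk and they span all of the data.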

Duncan Murdoch
