Try the logspline package:

    library(logspline)

    # Simulate raw data, then bin it to mimic histogram-only input
    x1 <- rgamma(1000, 3)
    br <- c(0,1,2,4,6,8,12,15)
    h1 <- cut( x1, br, include.lowest=TRUE )

    # One row per observation: lower and upper edge of its bin
    int1 <- embed(br,2)[ as.integer(h1), 2:1 ]

    # Fit to the raw data (ls1) and to the binned intervals only (ls2)
    ls1 <- oldlogspline(x1, lbound=0)
    ls2 <- oldlogspline( interval=int1, lbound=0 )

    # Simulate new data from the interval-based fit
    x2 <- roldlogspline( 1000, ls2 )

    # Compare histograms, then the true density with both fits
    par(mfrow=c(3,1))
    hist(x1, xlim=c(0,15))
    hist(x2, xlim=c(0,15))

    xx <- seq(0,15, length.out=250)
    plot(xx, dgamma(xx,3), type='l')
    lines(xx, doldlogspline(xx,ls1), col='blue')
    lines(xx, doldlogspline(xx,ls2), col='green')

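Since you have bin counts rather than raw data, the interval matrix would have to be built from the counts directly instead of from cut(). A minimal sketch of that step, assuming made-up bin edges and counts (br_inc, counts and int_counts are hypothetical names and values, not your actual income table):

```r
# Hypothetical bin edges (in thousands) and counts per bin; replace with
# the real table. These numbers are invented for illustration only.
br_inc <- c(0, 10, 25, 50, 75, 100, 150)
counts <- c(120, 260, 310, 180, 90, 40)

# One row per observation: (lower edge, upper edge) of the bin it fell in.
lower <- rep(head(br_inc, -1), times = counts)
upper <- rep(tail(br_inc, -1), times = counts)
int_counts <- cbind(lower, upper)

# Sanity check: the number of rows per bin reproduces the counts.
print(table(int_counts[, "lower"]))

# The matrix could then be used like int1 in the example above
# (requires the logspline package, so it is left commented out here):
# fit <- oldlogspline(interval = int_counts, lbound = 0)
# sim <- roldlogspline(sum(counts), fit)
```

The resulting matrix has the same two-column (lower, upper) layout as int1, so it should be usable as the interval= argument in the same way. An open-ended top bin would need some finite upper edge chosen for it, which is a judgment call this sketch does not address.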
Hope this helps,

--
Gregory (Greg) L. Snow Ph.D.
Statistical Data Center
Intermountain Healthcare
[EMAIL PROTECTED]
(801) 408-8111

> -----Original Message-----
> From: [EMAIL PROTECTED]
> [mailto:[EMAIL PROTECTED] On Behalf Of tom sgouros
> Sent: Thursday, October 11, 2007 5:30 AM
> To: r-help@r-project.org
> Subject: Re: [R] simulated data using empirical distribution
>
> Hello all:
>
> Many thanks to the people who have responded to my question,
> on and off-list. My problem isn't completely solved, though,
> and perhaps you can help again.
>
> The problem, again, is that I have what is essentially a
> histogram, but not the underlying data, and I want to
> simulate data that would have created that histogram. That
> is, I have counts for the number of data points in a dozen
> bins. The bins are not of uniform size. (It's income data,
> reported as incomes from 0-10k, 10k-25k, 25k-50k, and so on.)
>
> The best suggestion I had yesterday was to simulate the data
> with uniform distributions in each bin, and an exponential
> one on the rightmost bin, and I did that and superficially it
> looks good.
>
> Unfortunately, now that I am trying to calibrate the model, I
> have discovered a high bias. The way the bins are chosen, I
> would expect that 9 out of 12 bins have a downward slope,
> meaning that approximating them with a square top gives me
> more along the high border of the bin, and I currently
> suspect that this is at least part of the bias.
>
> Is there a way to ask for a not-quite uniform distribution of
> random data? I imagine a density function with a linear, but
> not flat, top. I admit that the standard selection of
> distributions in R is more than I am familiar with, but I
> can't find one that does what I think I need.
>
> Any advice (R advice or statistics advice) is welcome.
>
> Thanks again,
>
> -tom
>
> --
> ------------------------
> tomfool at as220 dot org
> http://sgouros.com
> http://whatcheer.net
>
> ______________________________________________
> R-help@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide
> http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.