David,

Thanks a lot for the specific suggestions; they are very helpful. My question 1 is fully answered now.

I guess I was not clear enough about question 2. I would like to generate a random sample using the estimated probability density (the result of question 1) as the reference distribution. Say I get a matrix of the estimated density at some grid points using MASS::kde2d. How can I use that result as a reference distribution to sample data from? I know this is trivial for parametric distributions like the bivariate normal, but what about such a nonparametric bivariate reference distribution? Are there any particular procedures or functions I can use?

The reason I don’t want to use plain resampling is that my raw data do not have a big sample size. To generate a dataset bigger than the raw data I would have to sample with replacement (without replacement I cannot draw more points than I have), and that produces lots of duplicate data points. The scatter plot of the sampled data doesn’t look good this way.

Heyi
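
One possible way to do this, offered as a rough sketch rather than a tested recipe: treat the kde2d grid as a discrete distribution over cells, draw cells with probability proportional to the estimated density, and jitter each draw uniformly within its cell so that no two points coincide. An alternative that avoids the grid is the smoothed bootstrap: resample the raw points with replacement and add Gaussian noise whose standard deviation matches the kernel. The data below are simulated stand-ins, the function names sample_from_grid and sample_smoothed are made up for illustration, and the h / 4 scaling assumes kde2d's Gaussian kernel, where the bandwidth h is four times the kernel's standard deviation.

```r
library(MASS)

set.seed(1)
# Hypothetical stand-in for the small raw bivariate dataset
x <- rnorm(50)
y <- 0.5 * x + rnorm(50, sd = 0.5)

# Density estimate on a grid: a list with x, y (grid coordinates) and z (matrix)
dens <- kde2d(x, y, n = 100)

# Approach 1 (approximate): sample grid cells, then jitter inside each cell
sample_from_grid <- function(dens, n) {
  dx <- diff(dens$x[1:2])  # grid spacing in x
  dy <- diff(dens$y[1:2])  # grid spacing in y
  # Draw cell indices with probability proportional to the estimated density
  idx <- sample(length(dens$z), size = n, replace = TRUE,
                prob = as.vector(dens$z))
  # Convert linear indices to (row, column); rows index x, columns index y
  ij <- arrayInd(idx, dim(dens$z))
  data.frame(
    x = dens$x[ij[, 1]] + runif(n, -dx / 2, dx / 2),
    y = dens$y[ij[, 2]] + runif(n, -dy / 2, dy / 2)
  )
}

# Approach 2 (smoothed bootstrap): resample raw points and add kernel noise
sample_smoothed <- function(x, y, n,
                            h = c(bandwidth.nrd(x), bandwidth.nrd(y))) {
  i <- sample(length(x), size = n, replace = TRUE)
  data.frame(
    x = x[i] + rnorm(n, sd = h[1] / 4),  # assumes kde2d's h / 4 convention
    y = y[i] + rnorm(n, sd = h[2] / 4)
  )
}

sim_grid   <- sample_from_grid(dens, 5000)
sim_smooth <- sample_smoothed(x, y, 5000)
```

The grid version is only as fine as the grid (and is truncated at the kde2d limits), so a larger n in kde2d gives a closer approximation. The smoothed bootstrap does not need the grid at all, and because every draw gets continuous noise, duplicates should essentially never occur in either version.
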

--- On Fri, 3/23/12, David Winsemius <[email protected]> wrote:

> From: David Winsemius <[email protected]>
> Subject: Re: [R] Nonparametric bivariate distribution estimation and sampling
> To: "heyi xiao" <[email protected]>
> Cc: "Sarah Goslee" <[email protected]>, [email protected]
> Date: Friday, March 23, 2012, 2:20 PM
>
> On Mar 23, 2012, at 1:53 PM, heyi xiao wrote:
>
> > Sarah,
> > Thanks for the response. I actually have several years of working
> > experience with R and statistics, although I may not be as good as
> > you. That’s why I am here ;) I dug deeper into the R documentation
> > and previous R-help posts, and couldn’t find anything particularly
> > helpful. Again, I want to do two things: (1) estimate the probability
> > density of this bivariate distribution using some nonparametric
> > method (kernel, spline etc);
>
> ?MASS::kde2d
> ?KernSmooth::bkde2D
> ?ade4::s.kde2d
> help(package=locfit)
>
> > (2) sample a big dataset from this bivariate distribution for a
> > simulation study.
>
> What is wrong with `sample`?
>
> # to get sample of size n without replacement
> set.seed(42)
> dfrm[ sample(1:NROW(dfrm), n) , ]
>
> --David.
>
> > If my questions are not clear enough, show me how I can improve, or
> > which part is not clear enough. If you have any particular
> > suggestions/comments, you are more than welcome.
> > Thanks!
> > Heyi
> >
> > --- On Fri, 3/23/12, Sarah Goslee <[email protected]> wrote:
> >
> >> From: Sarah Goslee <[email protected]>
> >> Subject: Re: [R] Nonparametric bivariate distribution estimation and sampling
> >> To: "heyi xiao" <[email protected]>
> >> Cc: [email protected]
> >> Date: Friday, March 23, 2012, 12:26 PM
> >>
> >> R can do all of that and more.
> >>
> >> But you'll need to put some work in reading about how to use R,
> >> about the statistical methods involved, and about how to use them
> >> to best effect. You might want, for instance, generalized additive
> >> models. Or not. If your question isn't more fully-formed than this,
> >> your best bet is almost certainly to talk to a local statistician,
> >> spend some time working with R, and then come back to the list with
> >> specific questions.
> >>
> >> Sarah
> >>
> >> On Fri, Mar 23, 2012 at 12:17 PM, heyi xiao <[email protected]> wrote:
> >>> Dear all,
> >>> I have a bivariate dataset from a preliminary study. I want to do
> >>> two things: (1) estimate the probability density of this bivariate
> >>> distribution using some nonparametric method (kernel, spline etc);
> >>> (2) sample a big dataset from this bivariate distribution for a
> >>> simulation study.
> >>> Is there any good method or package I can use in R for my work? I
> >>> don’t want parametric models like the bivariate normal distribution
> >>> etc, as I would like to accurately model my data. I don’t want to
> >>> use the bootstrapping approach, i.e. sampling with replacement, as
> >>> this will generate lots of duplicate data points. Any thoughts or
> >>> input will be highly appreciated!
> >>> Heyi
> >>
> >> --Sarah Goslee
> >> http://www.functionaldiversity.org
> >
> > ______________________________________________
> > [email protected] mailing list
> > https://stat.ethz.ch/mailman/listinfo/r-help
> > PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> > and provide commented, minimal, self-contained, reproducible code.
>
> David Winsemius, MD
> West Hartford, CT

