I think this should work:

rgmm <- function(n, gmm) {
  # Pick a mixture component (row of gmm) for each point, with
  # probability given by that component's weight.
  M <- sample(seq_len(nrow(gmm)), n, replace = TRUE, prob = gmm$weight)
  mu <- gmm$mean[M]
  sigma <- gmm$sd[M]
  # Scale and shift standard normal draws by each point's component.
  return(sigma * rnorm(n) + mu)
}

hist(rgmm(10000, gmm), breaks = 500)

Sampling from seq_len(nrow(gmm)) rather than a fixed range means the function works for any number of rows in the data frame.

On Dec 19, 4:14 pm, "Bill McNeill (UW)" <bill...@u.washington.edu> wrote:
> I am trying to generate a set of data points from a Gaussian mixture
> model. My mixture model is represented by a data frame that looks
> like this:
>
> > gmm
>   weight mean  sd
> 1    0.3    0 1.0
> 2    0.2   -2 0.5
> 3    0.4    4 0.7
> 4    0.1    5 0.3
>
> I have written the following function that generates the appropriate data:
>
> gmm_data <- function(n, gmm) {
>   c(rnorm(n*gmm[1,]$weight, gmm[1,]$mean, gmm[1,]$sd),
>     rnorm(n*gmm[2,]$weight, gmm[2,]$mean, gmm[2,]$sd),
>     rnorm(n*gmm[3,]$weight, gmm[3,]$mean, gmm[3,]$sd),
>     rnorm(n*gmm[4,]$weight, gmm[4,]$mean, gmm[4,]$sd))
> }
>
> However, the fact that my mixture has four components is hard-coded
> into this function. A better implementation of gmm_data() would
> generate data points for an arbitrary number of mixture components
> (i.e. an arbitrary number of rows in the data frame).
>
> How do I do this? I'm sure it's simple, but I can't figure it out.
>
> Thanks.
> --
> Bill McNeill
> http://staff.washington.edu/billmcn/index.shtml
>
> ______________________________________________
> r-h...@r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
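If you would rather keep the fixed number of points per component that gmm_data() in the question produces, instead of sampling the component for each point, here is a minimal sketch of that approach for any number of rows. The name gmm_data_any is made up for illustration, and rounding n * weight to whole counts is my assumption; for weights that do not divide n evenly, the counts may not add up to exactly n.

```r
# Example mixture taken from the question.
gmm <- data.frame(weight = c(0.3, 0.2, 0.4, 0.1),
                  mean   = c(0, -2, 4, 5),
                  sd     = c(1.0, 0.5, 0.7, 0.3))

# Hypothetical helper: draws a fixed share of the n points from each
# component, one rnorm() call per row of gmm.
gmm_data_any <- function(n, gmm) {
  # Assumption: round each component's share to a whole number of points.
  counts <- round(n * gmm$weight)
  draws <- Map(function(k, m, s) rnorm(k, mean = m, sd = s),
               counts, gmm$mean, gmm$sd)
  unlist(draws, use.names = FALSE)
}

x <- gmm_data_any(10000, gmm)
length(x)
```

The difference from the sampling version is that the component counts here are fixed rather than random, so the result is not quite an independent sample from the mixture, and the points come out grouped by component unless you shuffle them with sample(x).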