[R] How do I generate one vector for every row of a data frame?

andrew andrewjohnroyal at gmail.com
Fri Dec 19 06:52:32 CET 2008


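I think this should work. It picks a mixture component (a row of gmm)
for each point at random, using the weights as probabilities, so it
works for any number of rows:

rgmm <- function(n, gmm) {
	# Draw a component (row) index for each point, weighted by gmm$weight
	M <- sample(seq_len(nrow(gmm)), n, replace = TRUE, prob = gmm$weight)

	# rnorm() is vectorized over mean and sd, so each point uses its own component
	rnorm(n, mean = gmm$mean[M], sd = gmm$sd[M])
}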

hist(rgmm(10000, gmm), breaks = 500)
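
If the number of points per component should stay fixed at n*weight, as
in the function quoted below, something along these lines might work
instead. This is only a sketch: the name gmm_data_rows and the rounding
of the counts are my own assumptions, and the gmm data frame is retyped
from the question. Map() makes one rnorm() call per row and unlist()
joins the resulting vectors.

```r
# Example mixture, retyped from the question
gmm <- data.frame(weight = c(0.3, 0.2, 0.4, 0.1),
                  mean   = c(0, -2, 4, 5),
                  sd     = c(1.0, 0.5, 0.7, 0.3))

# Hypothetical helper: one vector per row, with the counts fixed in advance
gmm_data_rows <- function(n, gmm) {
	# Assumption: round n * weight to a whole number of points per component
	counts <- round(n * gmm$weight)
	# Map() steps through the three vectors in parallel, one rnorm() call per row
	parts <- Map(rnorm, counts, gmm$mean, gmm$sd)
	# Combine the per-row vectors into a single numeric vector
	unlist(parts)
}

x <- gmm_data_rows(10000, gmm)
hist(x, breaks = 500)
```

Because of the rounding, the total can differ slightly from n for some
combinations of n and weights, which the random-sampling version above
avoids.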


On Dec 19, 4:14 pm, "Bill McNeill (UW)" <bill... at u.washington.edu>
wrote:
> I am trying to generate a set of data points from a Gaussian mixture
> model.  My mixture model is represented by a data frame that looks
> like this:
>
> > gmm
>
>   weight mean  sd
> 1    0.3    0 1.0
> 2    0.2   -2 0.5
> 3    0.4    4 0.7
> 4    0.1    5 0.3
>
> I have written the following function that generates the appropriate data:
>
> gmm_data <- function(n, gmm) {
>         c(rnorm(n*gmm[1,]$weight, gmm[1,]$mean, gmm[1,]$sd),
>                 rnorm(n*gmm[2,]$weight, gmm[2,]$mean, gmm[2,]$sd),
>                 rnorm(n*gmm[3,]$weight, gmm[3,]$mean, gmm[3,]$sd),
>                 rnorm(n*gmm[4,]$weight, gmm[4,]$mean, gmm[4,]$sd))
>
> }
>
> However, the fact that my mixture has four components is hard-coded
> into this function.  A better implementation of gmm_data() would
> generate data points for an arbitrary number of mixture components
> (i.e. an arbitrary number of rows in the data frame).
>
> How do I do this?  I'm sure it's simple, but I can't figure it out.
>
> Thanks.
> --
> Bill McNeill
> http://staff.washington.edu/billmcn/index.shtml


