[R] Generating input population for microsimulation
Jan van der Laan
rhelp at eoos.dds.nl
Wed Dec 14 12:18:36 CET 2011
Emma,
If, as you say, every unit has the same structure, you can simply
repeat one unit the required number of times. For example,
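unit_size <- 10   # persons per unit
n_units <- 10     # number of units
# unit 1 repeated unit_size times, then unit 2, and so on
unit_id <- rep(1:n_units, each=unit_size)
# person number within each unit (1..unit_size), repeated for every unit
pid <- rep(1:unit_size, n_units)
# the first two persons of each unit are marked as senior
senior <- ifelse(pid <= 2, 1, 0)
pop <- data.frame(unit_id, pid, senior)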
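Note that pid above restarts at 1 in every unit. If the person ID should instead be unique across the whole population, one possible variation is sketched below. It assumes the seniors are the last two persons of each unit, as in the layout shown in the question; the names n_senior and pos are introduced here only for illustration.

```r
unit_size <- 10
n_units <- 10
n_senior <- 2  # assumed: two seniors per unit, as described in the question

unit_id <- rep(seq_len(n_units), each = unit_size)
# position of each person within their unit (1..unit_size)
pos <- rep(seq_len(unit_size), times = n_units)
# person ID that is unique across the whole population
pid <- seq_len(n_units * unit_size)
# mark the last n_senior positions of every unit as senior
senior <- as.integer(pos > unit_size - n_senior)

pop <- data.frame(unit_id, pid, senior)
head(pop, 10)
```

Keeping the within-unit position separate from the population-wide ID means the senior rule does not depend on how the IDs are numbered.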
If you want more flexibility in generating the units, I would first
generate the units (without the persons) and then generate the persons
for each unit. In the example below I use the plyr package; you could
probably also use lapply/sapply, or simply a loop over the units.
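library(plyr)
# Build the persons of one unit; 'unit' is a one-row data.frame
# with columns id and size
generate_unit <- function(unit) {
  pid <- seq_len(unit$size)
  senior <- rep(0, unit$size)
  # pick two persons at random to be the seniors
  senior[sample(unit$size, 2)] <- 1
  return(data.frame(unit_id=unit$id, pid=pid, senior=senior))
}
# One row per unit (n_units and unit_size as defined above)
units <- data.frame(id=1:n_units, size=unit_size)
# Apply generate_unit to each unit and stack the results
ddply(units, .(id), generate_unit)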
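For those who prefer not to depend on plyr, the lapply route mentioned above could look roughly like the sketch below. This is only one way to do it: the function name generate_unit_base and the n_senior argument are made up for this example, and set.seed is there only to make the random draw repeatable.

```r
set.seed(1)  # only so that the random choice of seniors is reproducible

# Hypothetical helper: build the persons of a single unit
generate_unit_base <- function(id, size, n_senior = 2) {
  senior <- rep(0, size)
  # choose n_senior distinct positions at random
  senior[sample(size, n_senior)] <- 1
  data.frame(unit_id = id, pid = seq_len(size), senior = senior)
}

units <- data.frame(id = 1:10, size = 10)

# One data.frame per unit, then stack them row-wise
pop_list <- lapply(seq_len(nrow(units)), function(i) {
  generate_unit_base(units$id[i], units$size[i])
})
pop <- do.call(rbind, pop_list)
```

Because the size is read per row of units, the same code should also work when the units have different sizes, as long as each unit has at least n_senior persons.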
HTH,
Jan
Emma Thomas <thomas_ek at yahoo.com> wrote:
> Hi all,
>
> I've been struggling with some code and was wondering if you all could help.
>
> I am trying to generate a theoretical population of P people who are
> housed within X different units. Each unit follows the same
> structure- 10 people per unit, 8 of whom are junior and two of whom
> are senior. I'd like to create a unit ID and a unique identifier for
> each person (person ID, PID) in the population so that I have a
> matrix that looks like:
>
> unit_id pid senior
> [1,] 1 1 0
> [2,] 1 2 0
> [3,] 1 3 0
> [4,] 1 4 0
> [5,] 1 5 0
> [6,] 1 6 0
> [7,] 1 7 0
> [8,] 1 8 0
> [9,] 1 9 1
> [10,] 1 10 1
> ...
>
> I came up with the following code, but am having some trouble
> getting it to populate my matrix the way I'd like.
>
> world <- function(units, pop_size, unit_size){
>   pid <- rep(0,pop_size) #person ID
>   senior <- rep(0,pop_size) #senior in charge
>   unit_id <- rep(0,pop_size) #unit ID
>
>   for (i in 1:pop_size){
>     for (f in 1:units) {
>       senior[i] = sample(c(1,1,0,0,0,0,0,0,0,0), 1, replace = FALSE)
>       pid[i] = sample(c(1:10), 1, replace = FALSE)
>       unit_id[i] <- f
>     }}
>   data <- cbind(unit_id, pid, senior)
>
>   return(data)
> }
>
> world(units = 10,pop_size = 100, unit_size = 10) #call the function
>
>
>
> The output looks like:
> unit_id pid senior
> [1,] 10 7 0
> [2,] 10 4 0
> [3,] 10 10 0
> [4,] 10 9 1
> [5,] 10 10 0
> [6,] 10 1 1
> ...
>
> but what I really want is to generate 10 different units with two
> seniors per unit, and with each person in the population having a
> unique identifier.
>
> I thought a nested for loop was one way to go about creating my data
> set of people and families, but obviously I'm doing something (or
> many things) wrong. Any suggestions on how to fix this? I had been
> focusing on creating a person and assigning them to a unit, but
> perhaps I should create the units and then populate the units with
> people?
>
> Thanks so much in advance.
>
> Emma
>