[R] Generating input population for microsimulation
Emma Thomas
thomas_ek at yahoo.com
Wed Dec 14 18:23:44 CET 2011
Dear Jan,
Thanks for your reply.
The first solution works well for my needs for now, but I have a question about the second. If I run your code and then call the function:
generate_unit(10)
I get the following error:
Error in unit$size : $ operator is invalid for atomic vectors
Did you experience the same thing?
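A likely explanation, offered as a sketch and not as the author's own diagnosis: generate_unit reads unit$size and unit$id, so it expects one row of the units data frame, not a bare number. Calling it with 10 passes an atomic vector, on which $ is not defined. The function below restates the one from the quoted message; one_unit is a hypothetical name introduced only for illustration.

```r
# Restatement of generate_unit from the quoted message
generate_unit <- function(unit) {
  pid <- seq_len(unit$size)
  senior <- rep(0, unit$size)
  senior[sample(unit$size, 2)] <- 1
  data.frame(unit_id = unit$id, pid = pid, senior = senior)
}

# A bare number has no 'size' component, so unit$size raises the error
err <- tryCatch(generate_unit(10), error = function(e) conditionMessage(e))
print(err)

# Passing a one-row data frame with 'id' and 'size' columns works
one_unit <- data.frame(id = 1, size = 10)  # hypothetical example unit
result <- generate_unit(one_unit)
print(result)
```

This is also what ddply does internally in the quoted code: it hands the function one piece of the units data frame at a time.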
In any case, I will definitely take a look at the plyr package, which I'm sure will be useful in the future.
Thanks again!
Emma
----- Original Message -----
From: Jan van der Laan <rhelp at eoos.dds.nl>
To: "r-help at r-project.org" <r-help at r-project.org>
Cc: Emma Thomas <thomas_ek at yahoo.com>
Sent: Wednesday, December 14, 2011 6:18 AM
Subject: Re: [R] Generating input population for microsimulation
Emma,
If, as you say, each unit is the same, you can just repeat the units to obtain the required number of units. For example:
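unit_size <- 10   # people per unit
n_units <- 10     # number of units
# Unit ID: each unit number repeated once per member
unit_id <- rep(1:n_units, each=unit_size)
# Person ID within the unit, cycling 1..unit_size for every unit
pid <- rep(1:unit_size, n_units)
# Mark the first two people in each unit as senior
senior <- ifelse(pid <= 2, 1, 0)
pop <- data.frame(unit_id, pid, senior)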
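In this construction pid restarts at 1 in every unit, so it identifies a person only within a unit. If a population-wide unique identifier is also needed, one possible addition is sketched below. The column name person_id is an assumption made for illustration, and the seniors are placed last in each unit here to match the example matrix in the original question.

```r
unit_size <- 10
n_units <- 10

unit_id <- rep(seq_len(n_units), each = unit_size)
pid <- rep(seq_len(unit_size), times = n_units)

# Last two people in each unit are senior, as in the example matrix
senior <- ifelse(pid > unit_size - 2, 1, 0)

pop <- data.frame(unit_id, pid, senior)

# Hypothetical extra column: a running number that is unique across all units
pop$person_id <- seq_len(nrow(pop))

head(pop, 12)
```

The pair (unit_id, pid) is already unique, so person_id is only a convenience when a single key is preferred.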
If you want more flexibility in generating the units, I would first generate the units (without the persons) and then generate the persons for each unit. In the example below I use the plyr package; you could probably also use lapply/sapply, or simply a loop over the units.
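library(plyr)
# Build the persons of one unit; 'unit' is a one-row data frame with columns id and size
generate_unit <- function(unit) {
  pid <- seq_len(unit$size)
  senior <- rep(0, unit$size)
  # Pick two people at random to be the seniors
  senior[sample(unit$size, 2)] <- 1
  return(data.frame(unit_id=unit$id, pid=pid, senior=senior))
}
# One row per unit (n_units and unit_size as defined above)
units <- data.frame(id=1:n_units, size=unit_size)
# Apply generate_unit to each unit and stack the results
ddply(units, .(id), generate_unit)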
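The lapply alternative mentioned above could look roughly like the following base R sketch. It is one possible arrangement, not necessarily what was intended: the names pop_list and pop are introduced here for illustration, and set.seed is included only to make the random choice of seniors repeatable.

```r
set.seed(1)  # only so the sketch gives the same seniors on every run

units <- data.frame(id = 1:10, size = 10)

# Same idea as above: 'unit' is a one-row data frame with columns id and size
generate_unit <- function(unit) {
  pid <- seq_len(unit$size)
  senior <- rep(0, unit$size)
  senior[sample(unit$size, 2)] <- 1
  data.frame(unit_id = unit$id, pid = pid, senior = senior)
}

# Generate the persons for each row of 'units', then stack the pieces
pop_list <- lapply(seq_len(nrow(units)), function(i) generate_unit(units[i, ]))
pop <- do.call(rbind, pop_list)

# Check: every unit should contain exactly two seniors
table(pop$unit_id, pop$senior)
```

Because each unit is generated from its own row, the size column could differ between units without changing the function.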
HTH,
Jan
Emma Thomas <thomas_ek at yahoo.com> wrote:
> Hi all,
>
> I've been struggling with some code and was wondering if you all could help.
>
> I am trying to generate a theoretical population of P people who are housed within X different units. Each unit follows the same structure: 10 people per unit, 8 of whom are junior and two of whom are senior. I'd like to create a unit ID and a unique identifier for each person (person ID, PID) in the population so that I have a matrix that looks like:
>
>       unit_id pid senior
>  [1,]       1   1      0
>  [2,]       1   2      0
>  [3,]       1   3      0
>  [4,]       1   4      0
>  [5,]       1   5      0
>  [6,]       1   6      0
>  [7,]       1   7      0
>  [8,]       1   8      0
>  [9,]       1   9      1
> [10,]       1  10      1
> ...
>
> I came up with the following code, but am having some trouble getting it to populate my matrix the way I'd like.
>
> world <- function(units, pop_size, unit_size){
>   pid <- rep(0,pop_size) #person ID
>   senior <- rep(0,pop_size) #senior in charge
>   unit_id <- rep(0,pop_size) #unit ID
>
>   for (i in 1:pop_size){
>     for (f in 1:units) {
>       senior[i] = sample(c(1,1,0,0,0,0,0,0,0,0), 1, replace = FALSE)
>       pid[i] = sample(c(1:10), 1, replace = FALSE)
>       unit_id[i] <- f
>     }
>   }
>   data <- cbind(unit_id, pid, senior)
>
>   return(data)
> }
>
> world(units = 10,pop_size = 100, unit_size = 10) #call the function
>
>
>
> The output looks like:
>      unit_id pid senior
> [1,]      10   7      0
> [2,]      10   4      0
> [3,]      10  10      0
> [4,]      10   9      1
> [5,]      10  10      0
> [6,]      10   1      1
> ...
>
> but what I really want to generate is 10 different units with two seniors per unit, and with each person in the population having a unique identifier.
>
> I thought a nested for loop was one way to go about creating my data set of people and families, but obviously I'm doing something (or many things) wrong. Any suggestions on how to fix this? I had been focusing on creating a person and assigning them to a unit, but perhaps I should create the units and then populate the units with people?
>
> Thanks so much in advance.
>
> Emma
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.