[R] Help vectorizing data generation for IRT graded response model

Mon Nov 19 17:40:51 CET 2012

Hello, 

I have code that will generate data for a 5 category IRT graded response
model.  However, the code could be improved through vectorizing.  Here is
the code below:

###Inputs
N <- 100 #Number of people taking test
n <- 10 #Number of items
nCat <- 5  #number of categories
###Generate Item parameters for 10 items
a.1 <- rlnorm(n, .25, .5)
b.1 <- matrix(0, n, (nCat - 1))
theta <- rnorm(N, mean = 0, sd =1)
###Generate threhold parameters for b.1 GRM with 10 items there are 4
thresholds for 5 categories
###Using this method is for b dist as N(0,1) since b.1[,1] mean is -.6
###and b.1[,2] to b.1[,4] are all .2
b.1[, 1] <- rnorm(n, -.6, 1)
for(j in 1:n) {
    b.1[j, 2] <- b.1[j,1] + runif(1, .5, .9)
    b.1[j, 3] <- b.1[j,2] + runif(1, .5, .9)
    b.1[j, 4] <- b.1[j,3] + runif(1, .5, .9)
}

###This code simulates participants taking a test and generates the 5
category item responses
    p <- array(0,c(N,n,nCat))
    pstar <- array(1,c(N,n,nCat))
    u <- matrix(0,N,n)
    for (i in 1:N) {
        for (j in 1:n) {
            #Draw a random number to determine categories
            r <- runif(1, 0, 1)
            for (k in 2:nCat) {
                pstar[i, j, k] <- 1 / (1 + exp(-a.1[j] * (theta[i] - b.1[j,
(k-1)])))
                p[i,j,(k-1)] <- pstar[i, j, (k-1)] - pstar[i, j, k]
            }
            p[i, j, nCat] <- pstar[i, j, 5] #probability of last category or
higher is that category

            if (r <= p[i, j, 1]) {
                u[i,j] <- 1
            } else
                if (r <= p[i,j,1] + p[i,j,2]) {
                    u[i,j] <- 2   
                } else
                    if (r <= p[i,j,1] + p[i,j,2] + p[i,j,3]) {
                        u[i,j] <- 3
                    } else
                        if (r <= p[i,j,1] + p[i,j,2] + p[i,j,3] + p[i,j,4])
{
                            u[i,j] <-4
                        } else
                            if (r <= 1) {
                                u[i,j] <-5
                            }
        }
    }

Obviously, that is really long and hairy.  I am wondering what would be the
best way to write this more compactly.  In conjunction with a colleague, I
have a solution for a binary IRT model with response categories 0,1.

Here is that code:
N <- 100
n <- 10
a.1 <- rlnorm(n, .25, .5)
b.1 <- rnorm(n, 0, 1)
theta <- rnorm(N, 0, 1)

###Function to generate 2PL data
dichEq <- function(a.1, b.1, theta) {
    1/(1 + exp(-a.1 * (theta -b.1)))
}
probVal <- mapply(FUN = function(x,y) dichEq(x,y,theta), a.1, b.1)
u <- apply(probVal, 2, function(x) rbinom(length(x), 1, x))

Can someone provide some guidance on how to generalize this to the ordered
category case?

Thanks, 

Jared

--
View this message in context: http://r.789695.n4.nabble.com/Help-vectorizing-data-generation-for-IRT-graded-response-model-tp4650062.html
Sent from the R help mailing list archive at Nabble.com.