[R-sig-ME] IRT: discrimination parameters from LMER?

Mon Nov 8 18:39:22 CET 2010

I can't speak to the error below, perhaps the package author/maintainer can do so. I'm not sure I follow this last thread exactly. But, here is what you should be seeing, if not I too think there is an issue.

1) When you use lmer to fit the rasch model, you get b-parameters for the item (difficulties) and a population distribution that is N(0, s^2)
2) When you use the rasch function in ltm and the discrimination is not set at 1, then the items have a common slope. This common slope should be equivalent to s (the standard deviation from lmer). The parameter estimates will differ because they are identified differently. I think if you want to compare the difficulties from ltm to lmer then you need b/s where b is the difficulty of the item.

Dimitris, does rasch use D = 1 or 1.7 when the discrimination is not 1?

> -----Original Message-----
> From: Paul Johnson [mailto:pauljohn32 at gmail.com]
> Sent: Sunday, November 07, 2010 1:00 AM
> To: Doran, Harold
> Cc: R-SIG-Mixed-Models at r-project.org
> Subject: Re: [R-sig-ME] IRT: discrimination parameters from LMER?
> 
> Dear Harold.
> 
> Thanks for the example.  I've not heard of MiscPsycho before, that's
> interesting.
> 
> Your test case focuses on difficulty parameters, and that's not where
> I'm seeing trouble.  I agree with you difficulty estimates match the
> fixed effects from lmer.
> 
> The discrimination estimate from ltm compared against lmer's random
> effect estimate is my concern.  If the discrimination parameter is not
> 1.0, then I wonder if your argument holds up.  In that one simulation
> I reported last time, I saw pretty big difference.
> 
> I'm adapting your simulation to do a lot of data sets.
> 
> But I can't amaze you with my result now because of this weird kernel
> error I've never seen before, so I think I need to close everything
> down. But I'll get back to you.
> 
> 
> >   fm1 <- rasch(itemDat[, -21], IRT.param=F)
> OMP: Error #15: Initializing libguide.so, but found libguide.so
> already initialized.
> OMP: Hint: This may cause performance degradation and correctness
> issues. Set environment variable KMP_DUPLICATE_LIB_OK=TRUE to ignore
> this problem and force the program to continue anyway. Please note
> that the use of KMP_DUPLICATE_LIB_OK is unsupported and using it may
> cause undefined behavior. For more information, please contact
> Intel(R) Premier Support.
> 
> 
> Weird, huh?
> 
> pj
> 
> 
> > I don't think this is true necessarily. I've done a small experiment all
> reproducible code is at bottom of email. I simulate some rasch data and then
> estimate item parameters using ltm, lmer, jml (in MiscPsycho), and mml (a
> function I wrote that is below using the traditional mml methods). To make a
> long story short, I compute the bias under each model.
> >
> >> ### ltm bias
> >> mean((coef(fm1)[,1] - mean(coef(fm1)[,1])) - trueVals)
> > [1] -0.173386111948639
> >>
> >> ### lmer bias
> >> mean((fixef(fm2) - mean(fixef(fm2))) - trueVals)
> > [1] -0.173386111948639
> >>
> >> ### jml bias
> >> mean(coef(fm1) - trueVals)
> > [1] 0.39007242168395
> >>
> >> ### mml bias
> >> mean((coef(fm4)[1:20] - mean(coef(fm4)[1:20])) - trueVals)
> > [1] -0.173386111948639
> >
> > It seems to me that jml (as expected) performs the worst, but all other
> functions are comparable and seem to recover parameter estimates rather well.
> If you run the mml function below, it is slow. But, notice that the standard
> deviation in mml is exactly the same as what lmer gives. That is because a
> multilevel rasch model with the parameterization I have used (items as fixed
> respondents as random) is exactly the same as mml. But, lmer's capacity can go
> much, much further. Had I not constrained the ltm function to have a
> discrimination of 1, the common discrimination parameter would also be the sd
> given by mml and lmer.
> >
> >> Do you care to weigh in on the question of whether lmer could ever be
> >> made to fit a 2PL or 3PL model?  I mean, could I write a family for
> >> lmer that would do that, or is the calculation required too grossly
> >> different?
> >
> > Yes, I'm sure it can. The issue is who will write the code. Doug Bates is
> extremely generous with his help and his openness to others making
> contributions to lme4. But, (as he might do so on this thread), he will
> forewarn you that the underlying code in lmer is very complex and doesn't
> resemble methods commonly used in other packages. So, because Doug is so
> amazingly smart, and I am just a mortal, I have decided to steer away from
> contributing to lmer other than being a "toy tester" for Doug.
> >
> > I am not a fan of the 3PL for many reasons. First, it is a model that cannot
> be identified without putting a very slim prior on the guessing parameter. In
> the end, because the prior is so slim, guess what your parameter estimates are
> close to? Yes, the prior. So, I just have never been made comfortable with
> that.
> >
> >> If 2PL or 3PL is really a whole different structure, I don't mind
> >> using "ltm."  But defeats the ideal that you proposed in that 2007 JSS
> >> article (which I really like!), that we might be better off fitting
> >> specialized models with general purpose software.
> >
> > I think ltm is a really great package. It may have some capacity to do what
> you want, I don't fully know its capacities. Perhaps Dimitris can offer some
> thought.
> >
> >> Here's a similar example.  Sometimes I have wanted to fit ordinal
> >> dependent variables, and have learned that those models do not fall
> >> into the family of GLM and a tool different from lmer is necessary to
> >> work on them.   I wish it weren't so, but have accepted reality.
> >>
> >> pj
> >
> > This is very true. I have been doing some work on polytomous models, but am
> working very slowly on functions to estimate them. There are packages in R
> that can currently do this. I wonder if MCMCglmm can do it; I don't know.
> >
> >
> > library(ltm)
> > library(lmer)
> > library(MiscPsycho)
> >
> > Nitems <- 20
> > Nperson <- 500
> >
> > set.seed(12345)
> > dat <- simRasch(Nperson, Nitems)
> > itemDat <- dat$data
> >
> > ### Fit data using rasch in LTM
> > fm1 <- rasch(itemDat, constraint = cbind(ncol(itemDat) + 1, 1))
> >
> > ### Fit using lmer
> > itemDat$id <- 1:nrow(itemDat)
> > ### reshape into long format
> > testScores <- reshape(itemDat, idvar='id',
> varying=list(names(itemDat)[1:Nitems]), v.names=c('answer'),
> timevar='question', direction='long')
> >
> > # Treat items as fixed but students as random
> > fm2 <- lmer(answer ~ factor(question) - 1 + (1|id), testScores, family =
> binomial(link='logit'))
> >
> >
> > ### Fit using JML
> > fm3 <- jml(dat$data, bias=TRUE)
> >
> > ### Fit using mml function below
> > fm4 <- mml(dat$data, Q =10, startVal = coef(fm1)[,1])
> >
> > ### Compute bias
> > trueVals <- dat$gen
> >
> > ### ltm bias
> > mean((coef(fm1)[,1] - mean(coef(fm1)[,1])) - trueVals)
> >
> > ### lmer bias
> > mean((fixef(fm2) - mean(fixef(fm2))) - trueVals)
> >
> > ### jml bias
> > mean(coef(fm1) - trueVals)
> >
> > ### mml bias
> > mean((coef(fm4)[1:20] - mean(coef(fm4)[1:20])) - trueVals)
> >
> >
> >
> > mml <- function(data, Q, startVal = NULL, ...){
> >        if(!is.null(startVal) && length(startVal) != ncol(data) ){
> >                stop("Length of argument startVal not equal to the number of
> parameters estimated")
> >        }
> >        if(!is.null(startVal)){
> >                startVal <- startVal
> >                } else {
> >                p <- colMeans(data)
> >                startVal <- as.vector(log((1 - p)/p))
> >        }
> >        qq <- gauss.quad.prob(Q, dist = 'normal', mu = 0, sigma=1)
> >        rr1 <- matrix(0, nrow = Q, ncol = nrow(data))
> >        data <- as.matrix(data)
> >        L <- nrow(data)
> >        C <- ncol(data)
> >        u <- 0
> >        fn <- function(b){
> >        s <- b[C+1]
> >        b <- b[1:C]
> >                for(j in 1:Q){
> >                        for(i in 1:L){
> >                                rr1[j,i] <- exp(sum(dbinom(data[i,], 1,
> plogis(qq$nodes[j]-b), log = TRUE))) *
> >                                ((1/(s*sqrt(2*pi)))  * exp(-((qq$nodes[j]-
> u)^2/(2*s^2))))/dnorm(qq$nodes[j]) * qq$weights[j]
> >                        }
> >                }
> >        -sum(log(colSums(rr1)))
> >        }
> >        opt <- optim(c(startVal,1), fn, method = "BFGS", hessian = TRUE)
> >        list("coefficients" = opt$par, "LogLik" = -opt$value, "Std.Error" =
> sqrt(diag(solve(opt$hessian))))
> > }
> >
> 
> 
> 
> --
> Paul E. Johnson
> Professor, Political Science
> 1541 Lilac Lane, Room 504
> University of Kansas