[R-sig-ME] correlation of two tests treating items of both tests as random effects?

Jake Westfall jake987722 at hotmail.com
Thu Nov 5 23:57:32 CET 2015


Hi Jon,

It's an interesting problem. I just put up a GitHub gist where I write a function to simulate data with the structure you describe, then fit what I think are the appropriate models (one estimating and one ignoring the random item variance) with lmer() and do the model comparisons:

https://gist.github.com/jake-westfall/3b9b4aee0c980a279acb

I ran the simulation 1000 times with the true correlation at 0.5, and 1000 times with the true correlation at 0. The true values are recovered quite well, which makes me think the approach is reasonable. However, at least for the parameter values I tried, adding the random item variance made essentially no difference to the estimates/tests of the subject-level correlation in test scores. There's a little difference, but it's hardly worth mentioning.
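For anyone who doesn't want to click through, the data-generating setup is roughly this (simplified, with made-up parameter names; the gist has the code I actually ran):

library(MASS)  # for mvrnorm()

# n_sub subjects each answer m items on each of two tests.
# rho is the true subject-level correlation between the tests.
sim_data <- function(n_sub = 100, m = 20, rho = 0.5,
                     sd_sub = 1, sd_item = 0.5, sd_err = 1) {
  # correlated subject-level true scores for the two tests
  Sigma <- sd_sub^2 * matrix(c(1, rho, rho, 1), nrow = 2)
  sub_eff <- mvrnorm(n_sub, mu = c(0, 0), Sigma = Sigma)
  # item effects: items 1..m are test 1, items (m+1)..(2*m) are test 2
  # (one item SD for both tests here, for brevity)
  item_eff <- rnorm(2 * m, sd = sd_item)
  d <- expand.grid(subject = factor(1:n_sub), item = factor(1:(2 * m)))
  d$test <- ifelse(as.integer(d$item) <= m, 1, 2)
  d$t1 <- as.numeric(d$test == 1)  # dummy: row is a test 1 item
  d$t2 <- as.numeric(d$test == 2)  # dummy: row is a test 2 item
  d$y <- sub_eff[cbind(as.integer(d$subject), d$test)] +
    item_eff[as.integer(d$item)] + rnorm(nrow(d), sd = sd_err)
  d
}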

Basically I set up the data frame so that, if there are m items on each test, each subject has 2*m rows, m for each test. The model then consists of two dummy variables indicating the test (with no intercept/constant term), and these dummies vary randomly across subjects and items. You'll see in the model syntax that it's a bit hackish, but it seems to work.

Two other things to note about the approach I used: (a) the items from both tests are counted as a single random factor, although their variances are allowed to differ between the tests; (b) the residual variance is constrained to be equal for observations from both tests, which is just an lmer() limitation. In my sim I set parameter values as if the two tests were two different IQ tests, so this is fine, but it might be problematic for your actual data if the two tests are really different. You may need to scale the items/observations before fitting the model.
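Concretely, the model looks roughly like this (simplified; the gist has the exact call):

library(lme4)

d <- sim_data()

# No fixed intercept: t1 and t2 give each test its own mean.
# (0 + t1 + t2 | subject) estimates test-specific subject variances
# plus the subject-level correlation between the two tests -- the
# estimate of interest. The two item terms give a single pooled item
# factor with a separate variance for each test (each item loads on
# only one dummy, so no item-level correlation is estimated).
fit <- lmer(y ~ 0 + t1 + t2 +
              (0 + t1 + t2 | subject) +
              (0 + t1 | item) + (0 + t2 | item),
            data = d)
summary(fit)

The comparison model that ignores item variance just drops the two item terms.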

Happy to hear comments from anyone else who read this far.

Jake

> Date: Thu, 5 Nov 2015 10:26:40 -0500
> From: baron at psych.upenn.edu
> To: r-sig-mixed-models at r-project.org
> Subject: [R-sig-ME] correlation of two tests treating items of both tests as random effects?
> 
> I thought I had a solution to this problem, but I don't. The problem
> is very simple to state. It is to find whether one test correlates
> with another, when each test has several items sampled from a larger
> population of potential items.
> 
> I give two psychological tests to a group of subjects. Each test can
> be seen as a sample of items from a population. Vocabulary tests and
> arithmetic problems are examples.[1] Usually researchers just get a
> total score on each test and look at the Pearson correlation. And
> usually this is fine because the correlation is high enough that its
> existence is not in doubt, and the magnitude of the correlation is of
> primary interest.
> 
> But sometimes some theoretical question hinges on whether the tests
> correlate at all. They could correlate spuriously because of the
> particular sample of items used in each test. So one way to handle
> this is to think of items as random effects.
> 
> It is easy to do this with lmer() when ONE of the two tests is treated
> as a random effect. Each observation is the subject's score on one
> item of that test (test 1), and the summary score of the other test
> (test 2) is the predictor. The model has crossed random effects for
> subjects and test 1 items. The number of rows in the data frame is
> (number of subjects) times (number of items in test 1).
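> 
> In lmer() terms, with made-up column names ('y' is the response to
> one test 1 item, 'test2' is the total score on test 2), that model
> is just:
> 
>   library(lme4)
>   fit <- lmer(y ~ test2 + (1 | subject) + (1 | item1), data = d)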
> 
> I thought it might be possible to extend this idea by making each row
> consist of a subject's score on one item of test 1 and her score on
> one item of test 2. The total number of rows would be (number of
> subjects) times (number of items in test 1) times (number of items in
> test 2). And I would include crossed random effects for subjects, test
> 1 items, and test 2 items. But then what? Do I just predict one test
> from the other, as before? (The direction may matter, but that is the
> least of my worries.)
> 
> I'm stuck. And this may be a blind alley.
> 
> Jon
> 
> Note:
> 
> [1] Not all psychological tests are like this. Some are designed to
> represent a balance between different items so that only the test as a
> whole, not each item, measures the trait of interest correctly.
> 
> -- 
> Jonathan Baron, Professor of Psychology, University of Pennsylvania
> Home page: http://www.sas.upenn.edu/~baron
> 
> _______________________________________________
> R-sig-mixed-models at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models