[R-sig-ME] Analyzing similarity scores between subjects

Jon Baron baron at upenn.edu
Wed Aug 8 21:59:00 CEST 2018

This is a tough problem! And I'm not sure I can solve it without the
data (and I am not willing to go that far), or ever. But here are some
thoughts. If I had these data, I would not automatically think of
using a multi-level model, but, the more I think about it, the more
sense it makes. (And I would first look at something really simple to
see if my hypothesis has a chance of being correct.)

First, 4 time points may not be enough to treat time as a random
effect. It might (or might not) make sense to treat time as a fixed
effect and look at its interaction with type. It may be that time 
segment does not matter at all. But if there is an interaction you
need to worry about coding the variables so that you can still
interpret the main effect of type.
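
As a rough illustration of the coding issue (a sketch only, on a made-up
balanced design; the names d, segment_f and segment_c are my own and not
from the original data), one option is to leave type treatment-coded with
N-N as the reference level and give segment sum-to-zero contrasts (or
center it, if it is numeric). With the interaction in the model, the type
coefficients are then differences from N-N averaged over segments, rather
than differences at the first segment only.

```r
# Hypothetical toy design: 3 comparison types x 4 segments
d <- expand.grid(type = c("NN", "NY", "YY"), segment = 1:4,
                 stringsAsFactors = FALSE)

# Treatment coding for type, with NN as the reference level
d$type <- factor(d$type, levels = c("NN", "NY", "YY"))

# Segment as a factor with sum-to-zero contrasts
d$segment_f <- factor(d$segment)
contrasts(d$segment_f) <- contr.sum(4)

# Alternative if segment is treated as a number: center it
d$segment_c <- d$segment - mean(d$segment)

# Design matrix for type * segment_f; the segment main-effect columns
# sum to zero across this balanced design
X <- model.matrix(~ type * segment_f, data = d)
seg_cols <- grep("^segment_f", colnames(X))
colSums(X[, seg_cols, drop = FALSE])
```

With the default treatment coding on both factors, the type coefficients
would instead describe the type differences in segment 1 only.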

Second, it seems to me that you need random-effect terms for both
subjects in each pair. And you should use only unique pairs, so that
you do not double-count (as you realize).
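
For what it is worth, here is one way to pull the unique pairs out of a
symmetric similarity matrix in base R. This is a sketch on simulated
numbers; the matrix m, the state vector and the column names are
assumptions for illustration. In practice the same step would be repeated
for each of the four matrices, with the states observed in that segment.

```r
set.seed(1)
n <- 5
ids <- paste0("s", seq_len(n))
state <- c("N", "Y", "N", "N", "Y")   # made-up states for one segment

# Simulated symmetric similarity matrix, self-comparisons excluded
m <- matrix(runif(n * n), n, n, dimnames = list(ids, ids))
m <- (m + t(m)) / 2
diag(m) <- NA

# Keep only the upper triangle: each pair (i, j) with i < j appears once
idx <- which(upper.tri(m), arr.ind = TRUE)
st_i <- state[idx[, "row"]]
st_j <- state[idx[, "col"]]

pairs <- data.frame(
  sub_i = ids[idx[, "row"]],
  sub_j = ids[idx[, "col"]],
  dist  = m[idx],
  # Order-independent label, so "Y-N" and "N-Y" are the same type
  type  = factor(paste(pmin(st_i, st_j), pmax(st_i, st_j), sep = "-"),
                 levels = c("N-N", "N-Y", "Y-Y")),
  stringsAsFactors = FALSE
)
nrow(pairs)   # n * (n - 1) / 2 rows
```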

Thus, the model I would think of would be something like:

lmer(dist_ij ~ type_ij*segment + (1|sub_i) + (1|sub_j))

I'm not sure about which random slopes to include, if any, but with
all of them it would be something like:

lmer(dist_ij ~ type_ij*segment + (1+type_ij*segment|sub_i) + (1+type_ij*segment|sub_j))

Maybe you don't need the 1 in the last grouping term.

I'm just using the "ij" notation to indicate that you have a matrix or
data frame with one row for each unique pair in each segment.
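
To make that layout concrete, here is a small simulated example with one
row per unique pair per segment, fitted with the random-intercept version
of the model. Everything in it is assumed for illustration: the number of
subjects, the state assignment, the effect sizes, and the choice to treat
segment as a factor. It needs the lme4 package. Note that lmer treats
sub_i and sub_j as two separate grouping factors, so a subject who appears
in both columns gets a separate random effect in each.

```r
library(lme4)  # assumed to be installed

set.seed(42)
n_sub <- 12
ids <- sprintf("s%02d", seq_len(n_sub))
subj_eff <- setNames(rnorm(n_sub, sd = 0.5), ids)  # simulated subject effects

one_segment <- function(seg) {
  # Made-up states: 4 subjects in Y and 8 in N, shifting with the segment
  state <- setNames(ifelse((seq_len(n_sub) + seg) %% 3 == 0, "Y", "N"), ids)
  pr <- t(combn(ids, 2))                 # unique pairs only
  s1 <- unname(state[pr[, 1]])
  s2 <- unname(state[pr[, 2]])
  data.frame(sub_i = pr[, 1], sub_j = pr[, 2], segment = seg,
             type = paste(pmin(s1, s2), pmax(s1, s2), sep = "-"),
             stringsAsFactors = FALSE)
}
pairs_long <- do.call(rbind, lapply(1:4, one_segment))
pairs_long$type <- factor(pairs_long$type, levels = c("N-N", "N-Y", "Y-Y"))
pairs_long$segment_f <- factor(pairs_long$segment)

# Simulated similarity: N-N pairs slightly more similar, plus both
# subjects' effects and noise
pairs_long$dist <- 1 + 0.3 * (pairs_long$type == "N-N") +
  unname(subj_eff[pairs_long$sub_i]) + unname(subj_eff[pairs_long$sub_j]) +
  rnorm(nrow(pairs_long), sd = 0.3)

fit <- lmer(dist ~ type * segment_f + (1 | sub_i) + (1 | sub_j),
            data = pairs_long)
summary(fit)
```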

I'm not sure whether "segment" should be a number or a factor.


On 08/08/18 13:22, Han Zhang wrote:
>Hi all,
>I have a modeling problem involving similarity scores between subjects.
>During 4 time points in my experiment, I sampled eye movements of my
>subjects. At each time point, each subject was in one of two states,
>Y or N. I have no control over the state; it is purely observational.
>My data produces 4 similarity matrices - for each sampling, every subject
>was compared to every other subject on some similarity measure of eye
>movements (self-comparisons excluded). Each matrix contains three types of
>comparison: N-N, N-Y, and Y-Y. My hypothesis is that the eye movements of
>those in state N were more similar to each other, compared to N-Y, or Y-Y.
>So N-N > N-Y or Y-Y.
>I came up with a model like this:
>lmer(dist ~ type + (1|sub_i) + (1|sub_i:type) + (1|segment) +
>(1|segment:type) + (1|sub_i:segment) + (1|sub_i:segment:type), data),
>where dist is the similarity score, type is a 3-level factor (n-n, n-y,
>y-y), sub_i is subject ID, segment is sample ID. I was
>trying to build a model with a "maximal" random structure.
>Have I correctly specified my model? I have two concerns:
>(1) because any given data point in the matrix belongs to two subjects, i
>and j, should I include random effects for both subject i and subject j?
>(2) Because each matrix is symmetrical, I am duplicating my data in the
>above model. Should I use only the unique pairwise comparisons and do
>something like this:
>lmer(dist ~ type + (1|segment) + (1|segment:type), half_data, REML=F)
>Han Zhang
>Graduate Student
>Combined Program in Education and Psychology
>University of Michigan, Ann Arbor
>Email: hanzh using umich.edu
>Phone: 1-734-680-6031

Jonathan Baron, Professor of Psychology, University of Pennsylvania
Home page: http://www.sas.upenn.edu/~baron
Editor: Judgment and Decision Making (http://journal.sjdm.org)
