[R-sig-ME] difference combined (interaction) and separate models

Sun Nov 27 10:02:17 CET 2022

Dear list,

With the British Household Panel Data from 1996-2004, I'm using a model to
estimate the probabilities of transition from voting non-labour in a given
year to voting labour in the next year (entry), and vice versa from labour
to non-labour (exit). For demonstration purposes, only cases with complete
data for this period are used.

There are two groups of cases, (A) those who did NOT vote labour in the
previous year, and (B) those who DID vote labour in the previous year. Both
groups have nearly 11,000 cases, with 8% and 94% voting labour in the
current year, respectively.

For each group separately, I ran a logistic model with glmer, the
dependent variable being whether or not a person votes labour in the
current year. There are two predictors: a dummy for race (nonwhite vs.
white) and "feeling of economic deprivation", with values 0 through 4,
where higher means "more deprivation". A random intercept was included for
person id "pid". The commands used are:

mA <- glmer(labourvote ~ nonwhite + deprivation + (1 | pid),
            data = groupA, family = binomial, nAGQ = 20)

mB <- glmer(labourvote ~ nonwhite + deprivation + (1 | pid),
            data = groupB, family = binomial, nAGQ = 20)

The first model estimates "entry into labour"; the second model estimates
"stay in labour", so for "exit from labour" the signs of its fixed-effect
estimates should be reversed.
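
The sign reversal can be checked with a minimal sketch. It uses plain glm
without the random person effect and simulated data, so the numbers are
made up and only the variable names mirror those above; m_stay and m_exit
are hypothetical names introduced for illustration.

```r
# Simulated data: illustrative values only, not the BHPS data.
set.seed(42)
n <- 2000
deprivation <- sample(0:4, n, replace = TRUE)
nonwhite <- rbinom(n, 1, 0.1)
labourvote <- rbinom(n, 1, plogis(2 + 0.5 * nonwhite - 0.2 * deprivation))

# "Stay in labour": model the probability of voting labour again.
m_stay <- glm(labourvote ~ nonwhite + deprivation, family = binomial)

# "Exit from labour": model the complementary outcome 1 - labourvote.
m_exit <- glm(I(1 - labourvote) ~ nonwhite + deprivation, family = binomial)

# All coefficients, including the intercept, have the opposite sign.
print(round(cbind(stay = coef(m_stay), exit = coef(m_exit)), 4))
```

Recoding the outcome only flips the signs; the standard errors and
p-values are unaffected.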

AGQ was used because the estimates differ clearly between the default
Laplace approximation and the AGQ method: for group A, "deprivation" was
not significant (p=0.55) with Laplace, but significant (p=0.03) with AGQ.
I also used mixed_model from the package GLMMadaptive, and the results are
close to those of glmer, both with nAGQ=20.

Next, I estimated a single model for the combined data of both groups,
"AplusB", and hoped to find the same results as for the separate groups
above by interacting the predictors with the previous year's vote. This was
done to show the advantage of such an interaction model, which makes it
possible to test the difference between the predictor effects in the two
groups. For now, however, I removed the intercept from both the fixed and
the random part, to show that one obtains the same results as for the
separate groups. I used mixed_model from GLMMadaptive because it supports
AGQ for this model, which glmer does not. I created two dummy variables,
"previouslab" and "previousnonlab", indicating whether the previous year's
vote was lab(our) or nonlab(our). With no random person effect and using
"glm", the results of the interaction model were indeed identical to those
of the two groups apart. Also, if I use lmer with a random person effect,
as if it were a linear model, the "apart" results are almost equal to the
"combined" results. However, with a random person effect added to the
logistic model, the results are not identical. The following command was
used for the interaction model:

fm <- mixed_model(labourvote ~ -1 + previouslab + previousnonlab +
                         deprivation:previouslab +
                         deprivation:previousnonlab +
                         nonwhite:previouslab +
                         nonwhite:previousnonlab,
                  random = ~ -1 + previouslab + previousnonlab | pid,
                  nAGQ = 20,
                  data = AplusB, family = binomial)
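
For reference, the check without a random person effect can be sketched
as follows. This is only an illustration on simulated data (the
coefficients and sample size are made up, and the objects d, fitA, fitB,
fitAB and fitDiff are hypothetical names), using plain glm rather than
mixed_model. The last fit shows the alternative parameterization in which
the interaction terms directly give the between-group difference in the
predictor effects.

```r
# Simulated data: illustrative values only, not the BHPS data.
set.seed(1)
n <- 4000
d <- data.frame(
  previouslab = rbinom(n, 1, 0.5),
  nonwhite    = rbinom(n, 1, 0.1),
  deprivation = sample(0:4, n, replace = TRUE)
)
d$previousnonlab <- 1 - d$previouslab
eta <- with(d,
  previouslab    * ( 2.5 - 0.05 * deprivation + 0.5 * nonwhite) +
  previousnonlab * (-2.5 + 0.15 * deprivation + 1.0 * nonwhite))
d$labourvote <- rbinom(n, 1, plogis(eta))

# Separate fits: group A (previous vote non-labour), group B (labour).
fitA <- glm(labourvote ~ nonwhite + deprivation, family = binomial,
            data = subset(d, previouslab == 0))
fitB <- glm(labourvote ~ nonwhite + deprivation, family = binomial,
            data = subset(d, previouslab == 1))

# Combined fit without intercept: one set of coefficients per group.
fitAB <- glm(labourvote ~ -1 + previouslab + previousnonlab +
               previouslab:deprivation + previousnonlab:deprivation +
               previouslab:nonwhite + previousnonlab:nonwhite,
             family = binomial, data = d)

# Side-by-side comparison of "apart" and "combined" estimates.
namesA <- c("previousnonlab", "previousnonlab:nonwhite",
            "previousnonlab:deprivation")
namesB <- c("previouslab", "previouslab:nonwhite",
            "previouslab:deprivation")
print(round(cbind(apartA = coef(fitA), combinedA = coef(fitAB)[namesA],
                  apartB = coef(fitB), combinedB = coef(fitAB)[namesB]), 4))

# Alternative parameterization: the previouslab:... rows estimate the
# difference (group B minus group A) in each predictor effect.
fitDiff <- glm(labourvote ~ previouslab * (nonwhite + deprivation),
               family = binomial, data = d)
print(round(summary(fitDiff)$coefficients, 4))
```

Without a random effect the likelihood factorizes over the two groups,
which is why the "apart" and "combined" columns agree in this sketch.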

Below are the mixed_model results for the two groups apart and for the
combined data. The results are not "shockingly" different but also not what
you would call "close". My question is: why? I tried nAGQ=40, but the
estimates are very similar to those with nAGQ=20, and thus still differ
from the "apart" estimates.

*previousnonlab group apart:*
              StdDev
(Intercept) 3.110278

Fixed effects:
            Estimate Std.Err  z-value   p-value
(Intercept)  -3.8789  0.1414 -27.4310   < 1e-04
nonwhite      2.3097  0.7747   2.9816 0.0028678
deprivation   0.1531  0.0673   2.2754 0.0228835

*previouslab group apart:*
              StdDev
(Intercept) 2.898665

Fixed effects:
            Estimate Std.Err z-value  p-value
(Intercept)   4.0641  0.1657 24.5267  < 1e-04
nonwhite      1.0066  0.6002  1.6772 0.093511
deprivation  -0.0057  0.0676 -0.0851 0.932175

*Both groups simultaneously*

Random effects covariance matrix:
                StdDev    Corr
previouslab     2.8452
previousnonlab  2.7811 -0.7098

Fixed effects:
                           Estimate Std.Err  z-value   p-value
previousnonlab              -4.3185  0.1573 -27.4614   < 1e-04
previousnonlab:nonwhite      1.7022  0.6803   2.5021 0.0123447
previousnonlab:deprivation   0.1761  0.0631   2.7906 0.0052605
previouslab                  4.7951  0.2046  23.4310   < 1e-04
previouslab:nonwhite         0.7012  0.5795   1.2100 0.2262629
previouslab:deprivation     -0.0267  0.0656  -0.4070 0.6840386

Thanks for any explanation of where these differences may come from!
