Dear list members,
I have some questions regarding random effects in linear mixed models.
My data come from a large survey (N = 82,000) conducted between January
and November 2004 to measure individuals' political preferences. I want to
explore the effect of TV political ads in shaping those preferences,
especially among citizens who consider themselves independent in terms of
political predispositions. Most TV political ads were concentrated in the
states where both candidates (Bush & Kerry) believed they had a chance to
win (battleground states).
Dependent variable: Bush's favorability, measured at the individual level
(ranges from 0 to 10)
Key independent variable: Bush.ads (measured at the individual level, based
on the number of ads aired in the respondent's state during the week
preceding the day of the interview)
These are the effects that I want to test:
1. Were Bush's ads effective in increasing his favorability?
2. Were Bush's ads effective in increasing his favorability among
independents?
3. Did the effect of Bush's ads change before and after the Democratic
convention?
4. In which states and on what weeks did Bush's ads have a significant
impact on his favorability (in general and among independents)?
5. Did the number of ads aired matter in increasing Bush's favorability?
6. How should I incorporate Kerry's campaign into the analysis?
This is the model I am running to test those effects:
library(lme4)
# Fixed effects plus state-level random intercept and random slopes
campaign.model <- lmer(bush.fav ~ Independent + Bush.ads +
                       Independent:Bush.ads +
                       (1 + Independent + Bush.ads +
                        Independent:Bush.ads | State))
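To make clear which quantities I mean by "fixed coefficients" and "random effects by state", here is a minimal sketch on simulated data. Everything in it is an assumption for illustration: the data frame `sim`, the simulated values, and the reduced random-effects structure (random intercept and Bush.ads slope only) are not my real data or my final model, and the simulated outcome is not restricted to the 0 to 10 range.

```r
library(lme4)

set.seed(1)
n.states <- 20
n <- 2000

# Simulated stand-in for the survey; all values are made up
state.id <- sample(seq_len(n.states), n, replace = TRUE)
sim <- data.frame(
  State       = factor(state.id),
  Independent = rbinom(n, 1, 0.3),   # 1 = self-described independent
  Bush.ads    = rbinom(n, 1, 0.4)    # 1 = ads aired in the previous week
)

state.int   <- rnorm(n.states, 0, 0.5)   # state-specific intercept shifts
state.slope <- rnorm(n.states, 0, 0.3)   # state-specific shifts in the ad effect

sim$bush.fav <- 5 + 0.4 * sim$Bush.ads - 0.5 * sim$Independent +
  0.2 * sim$Independent * sim$Bush.ads +
  state.int[state.id] + state.slope[state.id] * sim$Bush.ads +
  rnorm(n, 0, 2)

# Reduced version of the model: random intercept and Bush.ads slope by state
fit <- lmer(bush.fav ~ Independent + Bush.ads + Independent:Bush.ads +
              (1 + Bush.ads | State), data = sim)

fix.est    <- fixef(fit)        # population-level coefficients
state.dev  <- ranef(fit)$State  # per-state deviations from the fixed effects
state.coef <- coef(fit)$State   # fixed effect plus deviation, one row per state

fix.est
head(state.coef)
```

In this sketch `fixef()` gives the coefficients I would use for questions 1 and 2, and `ranef()` or `coef()` give the state-level quantities I would inspect for question 4.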
These are my strategies to answer the questions above and my doubts about them:
1. Answered based on the fixed coefficient of Bush.ads
2. Answered based on the fixed coefficient of the
interaction Independent:Bush.ads
3. Answered by splitting the dataset into before- and after-convention
subsets, running the same model on each, and comparing the coefficient for
Bush.ads across the two
4. Answered by running separate models for each week and retrieving the
random effects by state to find substantive and statistically significant
results
5. I'm not sure about this one. The initial analysis uses a dummy variable
for Bush's ads. In its original metric, however, Bush.ads is never negative
and ranges from 0 to 1187. Zero has a substantive meaning in terms of the
campaigns' strategies, which is why I'm not sure about using a logged
measure. At the same time, I have read on this list that it is not
advisable to include categorical predictors in both the fixed and the
random part, to avoid overfitting problems. Should I model the effect of
Bush.ads (binary) as a function of the number of Bush's ads (original
scale)? What are the alternatives?
6. Not sure again. Bush.ads and Kerry.ads are strongly correlated (0.9),
but this varies across states and weeks. I'm not sure whether it is
preferable to include Kerry's variables and their interactions in the
same model (potential overfitting problems again) or to run separate
analyses for each campaign.
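To make the coding choices in point 5 concrete, here is a minimal sketch of the two versions of the ads variable I am weighing. The counts in `ads.count` are hypothetical (only the limits 0 and 1187 come from my data), and `log1p` is just one candidate transformation that I am considering, not something I have settled on.

```r
# Hypothetical weekly ad counts; 0 and 1187 are the limits of the real range
ads.count <- c(0, 0, 12, 150, 1187)

# Binary coding: any ads aired in the previous week vs. none
ads.any <- as.integer(ads.count > 0)

# log(1 + x) stays defined at zero and maps zero to zero
ads.log1p <- log1p(ads.count)

data.frame(ads.count, ads.any, ads.log1p)
```

With this coding a state-week without ads stays at exactly zero on both scales, so the "no ads" condition keeps its substantive meaning.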
I'd really appreciate any feedback and guidance that anyone can provide.
Thanks,
David Llanos