[R-sig-ME] mixed model with non-continuous numeric response
Reinhold Kliegl
reinhold.kliegl at gmail.com
Mon Dec 22 17:25:04 CET 2008
The VR paragraphs I was referring to are on page 199. Anyway, if one
is willing to make the assumption of linear spacing, then responses 1,
2, 3, 4 can surely also be interpreted as count data; sort of the
number of latent pieces of evidence you need to move up one response
category; subtract 1 if you want "0" as part of the scale.
Then, indeed, the distribution of responses matters. If the
distribution looks roughly "normal" (e.g., if categories 2 and 3 are
more frequent than 1 and 4), it probably does not matter whether you
use the Gaussian or the Poisson family. If it is bimodal, I would
definitely prefer the latter. (Of course, it does matter if you have a
substantive theory.)
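
A minimal sketch of the comparison, assuming lme4 is installed. The data
are simulated and every name here (d, resp, count, subj_eff, item_eff) is
hypothetical, not taken from the original survey. It tabulates the
responses, then fits the same crossed random effects for Subject and Item
once with a Gaussian and once with a Poisson family:

```r
library(lme4)  # assumed to be installed; provides lmer() and glmer()

set.seed(1)
# Hypothetical design: 20 subjects fully crossed with 15 items
d <- expand.grid(Subject = factor(1:20), Item = factor(1:15))
subj_eff <- rnorm(20, sd = 0.5)[as.integer(d$Subject)]
item_eff <- rnorm(15, sd = 0.5)[as.integer(d$Item)]

# Simulated ratings: a latent score, rounded and clipped to 1..4
latent  <- 2.5 + subj_eff + item_eff + rnorm(nrow(d), sd = 0.8)
d$resp  <- pmin(4, pmax(1, round(latent)))
d$count <- d$resp - 1  # subtract 1 so that 0 is part of the scale

# Look at the response distribution before choosing a family
print(table(d$resp))

# Gaussian family: linear mixed model on the raw 1..4 ratings
fit_gauss <- lmer(resp ~ 1 + (1 | Subject) + (1 | Item), data = d)

# Poisson family: the shifted ratings treated as counts
fit_pois <- glmer(count ~ 1 + (1 | Subject) + (1 | Item),
                  data = d, family = poisson)

# Intercepts on the rating scale; the Poisson one is back-transformed
# from the log link (at random effects of zero) and shifted back by 1
print(fixef(fit_gauss))
print(exp(fixef(fit_pois)) + 1)
```

With real data one would put the predictors of interest in place of the
intercept-only formula and compare the conclusions from the two fits.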
Reinhold Kliegl
On Mon, Dec 22, 2008 at 4:06 PM, Jonathan Baron <baron at psych.upenn.edu> wrote:
> On 12/22/08 15:04, Reinhold Kliegl wrote:
>> See Venables and Ripley (2002, p.200) for an example modeling
>> three-levels of satisfaction (low, medium, high) as a surrogate
>> Poisson model. They also provide the technical justification. The
>> alternative is to fit it as a multinomial model; I am not sure how, if at
>> all, this can be done with glmer in its current implementation.
>
> Johnson (the original poster) said that the responses can be thought
> of as equally spaced points, i.e., linear with the underlying variable
> of interest. I think that this is often a reasonable assumption, so
> another alternative is to do what he said. Psychologists -- perhaps
> because we have read Dawes, R. M., & Corrigan, B. (1974). Linear
> models in decision making. Psychological Bulletin, 81, 97–106 -- are
> often willing to assume that linear models are good fits even when
> they are technically wrong.
>
> (I also couldn't find VR's rationale for the surrogate Poisson model,
> but I'm not questioning that possibility.)
>
> The question is how serious the violation of the assumed error
> distribution is when we have only 4 categories. When I do this -
> which I admit is usually when I'm using lm() and not lmer() - I look
> at the error distributions (from the default plot()) and do an eyeball
> test. If the result is barely "significant" at the outset, I worry.
>
> Jon
>
>> Reinhold Kliegl
>>
>> On Mon, Dec 22, 2008 at 1:41 PM, Daniel Ezra Johnson
>> <danielezrajohnson at gmail.com> wrote:
>> > I don't think this is count data, is it???
>> >
>> > On Mon, Dec 22, 2008 at 12:40 PM, Reinhold Kliegl
>> > <reinhold.kliegl at gmail.com> wrote:
>> >> ( ..., family="poisson") is the most used option for count data
>> >>
>> >> Reinhold Kliegl
>> >>
>> >> On Mon, Dec 22, 2008 at 12:54 PM, Daniel Ezra Johnson
>> >> <danielezrajohnson at gmail.com> wrote:
>> >>> Dear all,
>> >>>
>> >>> I have survey results where the response is 1, 2, 3, or 4. These can
>> >>> be thought of as equally-spaced points on a scale, I don't have a
>> >>> problem with that. (They're actually more like "not at all", "some",
>> >>> "mostly", "totally"; the subject is judging a stimulus.)
>> >>>
>> >>> I want to model crossed random effects for Subject and Item. Am I way
>> >>> off base in modeling this data with a lmer(family="gaussian") model? I
>> >>> know it's not perfect, but is it really bad? If so, what could I do
>> >>> instead? (The error certainly wouldn't be binomial, right?)
>> >>>
>> >>> Thanks,
>> >>> Daniel
>> >>>
>> >>> _______________________________________________
>> >>> R-sig-mixed-models at r-project.org mailing list
>> >>> https://stat.ethz.ch/mailman/listinfo/r-sig-mixed-models
>> >>>
>> >>
>> >
>>
>
> --
> Jonathan Baron, Professor of Psychology, University of Pennsylvania
> Home page: http://www.sas.upenn.edu/~baron
> Editor: Judgment and Decision Making (http://journal.sjdm.org)
>