[R-sig-ME] Modeling attacks and victories
pauljohn32 at gmail.com
Thu Apr 20 23:42:15 CEST 2017
Thanks very much for the advice. I truly appreciate it.
I am going to steer this one back to a simpler question: predict the
number of attacks, and put aside the success issue until after we
understand the basics. I think we can manage the random effects in
the count model, maybe with zero inflation.
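
As a rough sketch of that first step (not the student's actual data: the toy rows, the country/year labels, and the function names are all made up for illustration), event-level rows could be collapsed to country/year counts while keeping the zero-attack country/years, which never appear in event data on their own. The second function writes out the probability mass function of a zero-inflated Poisson, one possible form of "a count model with zero inflation"; random intercepts are left out here. Python is used only to make the arithmetic concrete.

```python
import math
from collections import Counter
from itertools import product

# Hypothetical toy event data: one row per attack, as (country, year).
events = [
    ("A", 2001), ("A", 2001), ("A", 2003),
    ("B", 2002),
]

# The full panel has to be listed explicitly, because country/years
# with no attacks are simply absent from event-level data.
countries = ["A", "B", "C"]
years = [2001, 2002, 2003]


def count_attacks(events, countries, years):
    """Return {(country, year): number of attacks}, zeros included."""
    tally = Counter(events)
    return {(c, y): tally.get((c, y), 0) for c, y in product(countries, years)}


def zip_pmf(k, lam, pi):
    """P(Y = k) under a zero-inflated Poisson.

    With probability pi the cell is a structural zero; otherwise the
    count is Poisson with mean lam.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return (pi if k == 0 else 0.0) + (1.0 - pi) * poisson


counts = count_attacks(events, countries, years)
# Share of country/years with no attacks at all.
zero_share = sum(1 for n in counts.values() if n == 0) / len(counts)
```

If the observed share of zeros is far above what a plain Poisson with the same mean would give, that is one informal hint that the zero-inflation part may be worth including.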
On Thu, Apr 20, 2017 at 2:34 AM, Thierry Onkelinx
<thierry.onkelinx at inbo.be> wrote:
> Dear Paul,
> I'd focus on two different points first:
> a) what does the student want to model: the probability of success? the
> number of events? the number of successful events? something else?
> b) what are the statistical skills of the student. That will determine the
> appropriate level of statistical machismo.
> Best regards,
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
> Kliniekstraat 25
> 1070 Anderlecht
> To call in the statistician after the experiment is done may be no more than
> asking him to perform a post-mortem examination: he may be able to say what
> the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
> 2017-04-19 17:39 GMT+02:00 Paul Johnson <pauljohn32 at gmail.com>:
>> Could I ask for pointers on how to guide a student in my multilevel
>> modeling course?
>> The outcome data is terrorist attack events, with one row per event
>> (events are listed by country and year). The data also indicates if
>> each attack is a "success" (I have no idea how that's measured; if it
>> matters I can find out).
>> The student says that, in his field, what they would do is aggregate
>> events at the country/year level to create a "proportion of successful
>> attacks" variable. If a country has no events, then it is scored as a
>> 0. Then they'd run random intercept models using country as case
>> identifier, possibly with other country level predictors that vary
>> across time.
>> I think we can do better than that. The number of events within
>> countries varies widely: some have 0 or 1 attack, while in some years
>> there are 30 or more. Measuring the proportions is, obviously,
>> sensitive to the number in the denominator. Some countries are scored
>> on a scale 0, .5, 1, while others are scored as 0, 0.03, 0.06, and so
>> forth. Another obvious problem is the presence of 0's.
>> My first idea was to make this a binomial glm and predict successes as
>> a proportion of attacks. That's a problem because there are lots of 0
>> attack country/years, but also because I'm
>> It looks to me like we need to explore this as a two part model, where
>> part 1 predicts (attacks > 0) and part 2 is binomial among the
>> countries and places where attacks > 0. I'm not finding discussion of
>> this particular example while searching (I probably don't know the
>> magic words). However, we need to insert the country-level intercept
>> in both models, and perhaps the country effect needs to be correlated
>> between the two models.
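
The two-part idea in the paragraph above can be sketched as a log-likelihood contribution for a single country/year. This is only an illustration under stated assumptions: the function and parameter names (a0, b0, u, v) are hypothetical, both parts use a logit link, the number of attacks is conditioned on rather than modeled, and the country effects u and v are passed in as fixed numbers. In a full model they would be random effects, possibly drawn from a bivariate normal so that they can be correlated across the two parts.

```python
import math


def inv_logit(x):
    """Map a linear predictor to a probability."""
    return 1.0 / (1.0 + math.exp(-x))


def two_part_loglik(attacks, successes, a0, b0, u=0.0, v=0.0):
    """Log-likelihood of one country/year under a two-part model.

    Part 1: P(attacks > 0) = inv_logit(a0 + u).
    Part 2: successes | attacks ~ Binomial(attacks, inv_logit(b0 + v)),
            used only when attacks > 0.
    u and v are the (hypothetical) country-level intercept shifts.
    """
    p_any = inv_logit(a0 + u)
    if attacks == 0:
        # Only part 1 contributes: no attacks were observed.
        return math.log(1.0 - p_any)
    q = inv_logit(b0 + v)
    binom = (math.comb(attacks, successes)
             * q ** successes
             * (1.0 - q) ** (attacks - successes))
    return math.log(p_any) + math.log(binom)


# Example: a country/year with 2 attacks, 1 of them successful,
# with all parameters set to zero (so both probabilities are 0.5).
example = two_part_loglik(attacks=2, successes=1, a0=0.0, b0=0.0)
```

Summing this quantity over country/years, and integrating over the distribution of (u, v), would give the marginal likelihood that a mixed-model fitting routine maximizes.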
>> Paul E. Johnson http://pj.freefaculty.org
>> Director, Center for Research Methods and Data Analysis
>> To write to me directly, please address me at pauljohn at ku.edu.
Paul E. Johnson http://pj.freefaculty.org
Director, Center for Research Methods and Data Analysis http://crmda.ku.edu
To write to me directly, please address me at pauljohn at ku.edu.