[R-sig-ME] Modeling attacks and victories

Thu Apr 20 23:42:15 CEST 2017

Thanks very much for the advice. I truly appreciate it.

I am going to steer this one back to a simpler question: predict the
number of attacks, and put aside the success issue until after we
understand the basics.  I think we can manage the random effects in
the count model, maybe with zero inflation.
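
To make the count setup concrete, here is a minimal sketch in Python. The
data are made up, and the names country_year_counts, zip_pmf and zip_loglik
are hypothetical, not taken from any package. It builds the country/year
counts with the empty country/years kept as 0, and writes out a
zero-inflated Poisson likelihood with a single rate and a single
zero-inflation probability. It leaves out the country random intercepts,
which would have to come from mixed-model software.

```python
import math
from collections import Counter
from itertools import product


def country_year_counts(events, countries, years):
    """Count attacks per (country, year); country-years with no events get 0."""
    seen = Counter((e["country"], e["year"]) for e in events)
    return {(c, y): seen.get((c, y), 0) for c, y in product(countries, years)}


def zip_pmf(k, lam, pi):
    """Zero-inflated Poisson pmf.

    With probability pi the count is a structural zero; otherwise it is
    drawn from a Poisson distribution with mean lam.
    """
    poisson = math.exp(-lam) * lam ** k / math.factorial(k)
    return pi * (k == 0) + (1.0 - pi) * poisson


def zip_loglik(counts, lam, pi):
    """Log-likelihood of observed counts under one shared (lam, pi)."""
    return sum(math.log(zip_pmf(k, lam, pi)) for k in counts)


# Hypothetical toy data: one row per attack event.
events = [
    {"country": "A", "year": 2001},
    {"country": "A", "year": 2001},
    {"country": "B", "year": 2002},
]
counts = country_year_counts(events, ["A", "B", "C"], [2001, 2002])

# Compare a plain Poisson (pi = 0) with a zero-inflated fit on the same counts.
ll_poisson = zip_loglik(counts.values(), lam=0.5, pi=0.0)
ll_zip = zip_loglik(counts.values(), lam=1.0, pi=0.4)
```

The full grid of countries and years has to be supplied explicitly, because
a country/year with no attacks never shows up as a row in the event data.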

pj

On Thu, Apr 20, 2017 at 2:34 AM, Thierry Onkelinx
<thierry.onkelinx at inbo.be> wrote:
> Dear Paul,
>
> I'd focus on two different points first:
> a) what does the student want to model: the probability of success? the
> number of events? the number of successful events? something else?
> b) what are the statistical skills of the student? That will determine the
> appropriate level of statistical machismo.
>
> Best regards,
>
> Thierry
>
>
> ir. Thierry Onkelinx
> Instituut voor natuur- en bosonderzoek / Research Institute for Nature and
> Forest
> team Biometrie & Kwaliteitszorg / team Biometrics & Quality Assurance
> Kliniekstraat 25
> 1070 Anderlecht
> Belgium
>
> To call in the statistician after the experiment is done may be no more than
> asking him to perform a post-mortem examination: he may be able to say what
> the experiment died of. ~ Sir Ronald Aylmer Fisher
> The plural of anecdote is not data. ~ Roger Brinner
> The combination of some data and an aching desire for an answer does not
> ensure that a reasonable answer can be extracted from a given body of data.
> ~ John Tukey
>
> 2017-04-19 17:39 GMT+02:00 Paul Johnson <pauljohn32 at gmail.com>:
>>
>> Could I ask for pointers on how to guide a student in my multilevel
>> modeling course?
>>
>> The outcome data is terrorist attack events, with one row per event
>> (events are listed by country and year). The data also indicates if
>> each attack is a "success" (I have no idea how that's measured, if it
>> matters I can find out).
>>
>> The student says that, in his field, what they would do is aggregate
>> events at the country/year level to create a "proportion of successful
>> attacks" variable. If a country has no events, then it is scored as a
>> 0.  Then they'd run random intercept models using country as case
>> identifier, possibly with other country level predictors that vary
>> across time.
>>
>> I think we can do better than that. The number of events within
>> countries varies widely; some have 0 or 1 attack, while in some years
>> there are 30 or more.  Measuring the proportions is, obviously,
>> sensitive to the number in the denominator.  Some countries are scored
>> on a scale 0, .5, 1, while others are scored as 0, 0.03, 0.06, and so
>> forth.  Another obvious problem is the presence of 0's.
>>
>> My first idea was to make this a binomial glm and predict successes as
>> a proportion of attacks.  That's a problem because there are lots of 0
>> attack country/years, but also because I'm
>>
>> It looks to me like we need to explore this as a two part model, where
>> part 1 predicts (attacks > 0) and part 2 is binomial among the
>> countries and places where attacks > 0. I'm not finding discussion of
>> this particular example while searching (I probably don't know the
>> magic words).  However, we need to insert the country-level intercept
>> in both models, and perhaps the country effect needs to be correlated
>> between the two models.
>>
>> pj
>>
>> --
>> Paul E. Johnson   http://pj.freefaculty.org
>> Director, Center for Research Methods and Data Analysis
>> http://crmda.ku.edu
>>
>> To write to me directly, please address me at pauljohn at ku.edu.
>>

-- 
Paul E. Johnson   http://pj.freefaculty.org
Director, Center for Research Methods and Data Analysis http://crmda.ku.edu

To write to me directly, please address me at pauljohn at ku.edu.