[R-sig-eco] Pred function - miss understanding?

Chris Mcowen chrismcowen at gmail.com
Thu Aug 26 22:38:20 CEST 2010


Hi Ben and list

It predicted 756 out of 1115 correctly, with another 150 very close. So I guess that is not too bad (68%), but should it be higher than that, given that the species I am testing it on were the ones used to build the model?
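
For reference, 756/1115 is about 0.68. Accuracy measured on the same species that built the model tends to be optimistic compared with accuracy on species the model has not seen. A rough sketch of that comparison, using simulated data (the data frame `d`, the split sizes, the effect sizes, and the helper `acc` are all invented for illustration and are not from the real `traits` data):

```r
# Reported in-sample hit rate from the message above
reported <- 756 / 1115

# Simulated stand-in data; nothing here comes from the real `traits` data
set.seed(4)
n <- 600
d <- data.frame(
  HAB = factor(sample(c("forest", "grass", "wet"), n, replace = TRUE)),
  BS  = factor(sample(c("a", "b"), n, replace = TRUE))
)
eta <- -0.5 + 1.0 * (d$HAB == "wet") + 0.8 * (d$BS == "b")
d$THREAT <- rbinom(n, size = 1, prob = plogis(eta))

# Hold out 200 species that play no part in fitting
train_idx <- sample(n, size = 400)
train <- d[train_idx, ]
test  <- d[-train_idx, ]

fit <- glm(THREAT ~ HAB * BS, data = train, family = binomial)

# Proportion of species classified correctly with a 0.5 cutoff
acc <- function(model, data) {
  p <- predict(model, newdata = data, type = "response")
  mean((p > 0.5) == (data$THREAT == 1))
}

acc_train <- acc(fit, train)  # same species that built the model
acc_test  <- acc(fit, test)   # species the model has not seen
cat("reported:", round(reported, 2),
    " train:", round(acc_train, 3),
    " test:", round(acc_test, 3), "\n")
```

The held-out figure is usually the more honest guide to how well the model would do on a species whose "real" status is unknown.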

One interesting point is that it often put the cutoff between groups in the right place but with the labels reversed, for example:

SPECIES     1   2   3   4   5   6   7   8   9   10
REAL        Y   Y   Y   Y   Y   N   N   N   N   N
PREDICTED   N   N   N   N   N   Y   Y   Y   Y   Y

This happened very often.

I have different levels of threat, so maybe that will allow predictions at a finer scale. Would the same method I have used below work if I ran a GLM with gaussian rather than binomial and then used the predict function?
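
A minimal sketch of what that might look like, assuming the threat levels can be coded as equally spaced numbers (here 0 to 4, a hypothetical coding). The data are simulated, and `d`, `THREAT_LEVEL`, and the effect sizes are invented for illustration:

```r
# Simulated stand-in data; the 0-4 threat coding is an assumption
set.seed(3)
n <- 300
d <- data.frame(
  HAB = factor(sample(c("forest", "grass", "wet"), n, replace = TRUE)),
  BS  = factor(sample(c("a", "b"), n, replace = TRUE))
)
latent <- 1.5 + 1.0 * (d$HAB == "wet") + 0.7 * (d$BS == "b") + rnorm(n)
d$THREAT_LEVEL <- pmin(pmax(round(latent), 0), 4)

fit_gauss <- glm(THREAT_LEVEL ~ HAB * BS, data = d, family = gaussian)

# With a gaussian family, predict() returns an expected score,
# not a probability of being threatened
score_hat <- predict(fit_gauss, type = "response")

# Map scores back onto levels: round, then clamp to the valid range
level_hat <- pmin(pmax(round(score_hat), 0), 4)

exact    <- mean(level_hat == d$THREAT_LEVEL)
within_1 <- mean(abs(level_hat - d$THREAT_LEVEL) <= 1)
cat("exact:", round(exact, 3),
    " within one level:", round(within_1, 3), "\n")
```

The predict step is the same as before, but the output has to be read as a position on the threat scale. Treating ordered categories as numbers assumes the gaps between adjacent levels are equal, which may or may not be reasonable for threat categories.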

Thanks for your help,

Chris


On 26 Aug 2010, at 20:37, Ben Bolker wrote:

On 10-08-26 12:39 PM, Chris Mcowen wrote:
> Dear List,
> 
> I am trying to predict the extinction risk of a species based on its
> life history. I will detail my method below and would welcome comments
> as to why the results are not as I expected.
> 
> 
> First I fit my model:
> 
>> model1 <- glm(THREAT ~ HAB*BS + FR + WO + SEA + PD, data=traits, family="binomial")
> 
>            Where THREAT is TRUE (1) / FALSE (0).
> 
>            Where BS, FR etc are factors with multiple levels.
> 
> 
> I then predicted the probability of a species being threatened or not using
> 
>> print(predict(model1, type = "response"))
> 
> Example output:
> 
>          1          2          3          4          5          6          7
> 0.44659200 0.65221495 0.71357243 0.71357243 0.71357243 0.71357243 0.71357243
>          8          9         10         11         12         13         14
> 0.71357243 0.65221495 0.65221495 0.65221495 0.65221495 0.65221495 0.65221495
> 
> I interpret this as species 1 having a 45% chance (probability) of being
> threatened, and so on.
> 
> I then wanted to see how this relates to the "true" threat level, so I
> looked at species 1 and it was classed as threatened, which disagrees
> with the predict results, although marginally. In fact most of the
> predict results do not agree with the "real" threat level: some species
> have a probability of 0.17, which to me says they are non-threatened, but
> in the "real" data they are classed as threatened.
> 
> This is important because if these are not matching, at least most of the
> time, then how can I confidently predict the response of a species when
> I don't know its "real" response?

 How bad is the mismatch?  With a probability of 0.44 you don't have
much information either way, so it is not surprising if the species is
listed as threatened *or* not threatened.  It's hard to say without more
detail: if it's really true that making predictions by rounding up or
down (i.e. pred prob > 0.5 -> 1) gives you more misses than hits, then
something sounds screwy.  You shouldn't do worse than 50% correct
guessing at random ...
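
The rounding check described above can be sketched as follows. This is only an illustration on simulated data: `traits_sim`, its factor levels, and the effect sizes are invented, and only three of the original predictors are mimicked.

```r
# Simulated stand-in for the real data; values are invented for illustration
set.seed(1)
n <- 500
traits_sim <- data.frame(
  HAB = factor(sample(c("forest", "grass", "wet"), n, replace = TRUE)),
  BS  = factor(sample(c("a", "b"), n, replace = TRUE)),
  FR  = factor(sample(c("x", "y", "z"), n, replace = TRUE))
)
eta <- -0.5 + 1.0 * (traits_sim$HAB == "wet") + 0.8 * (traits_sim$BS == "b")
traits_sim$THREAT <- rbinom(n, size = 1, prob = plogis(eta))

fit <- glm(THREAT ~ HAB * BS + FR, data = traits_sim, family = binomial)

p_hat <- predict(fit, type = "response")  # fitted probabilities
pred  <- as.integer(p_hat > 0.5)          # round up or down at 0.5

# Hits and misses, cross-tabulated against the real status
conf_tab <- table(real = traits_sim$THREAT, predicted = pred)
accuracy <- mean(pred == traits_sim$THREAT)

# Baseline: always guess the more common class
baseline <- max(mean(traits_sim$THREAT), 1 - mean(traits_sim$THREAT))

print(conf_tab)
cat("accuracy:", round(accuracy, 3),
    " baseline:", round(baseline, 3), "\n")
```

Comparing the hit rate with the always-guess-the-majority baseline gives a sense of how much the predictors add beyond the overall proportion of threatened species.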

  (I note that many of the entries are giving identical probabilities;
these points presumably have identical sets of predictors.)
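
A small sketch of why that happens when every predictor is a factor: each distinct combination of levels has a single fitted probability. The data are simulated, and `d` and `n_distinct` are illustrative names, not part of the original analysis.

```r
# Simulated stand-in data with purely categorical predictors
set.seed(2)
n <- 200
d <- data.frame(
  HAB = factor(sample(c("forest", "grass"), n, replace = TRUE)),
  BS  = factor(sample(c("a", "b", "c"), n, replace = TRUE))
)
d$THREAT <- rbinom(n, size = 1, prob = 0.5)

fit <- glm(THREAT ~ HAB * BS, data = d, family = binomial)
d$p_hat <- predict(fit, type = "response")

# Count distinct fitted probabilities within each predictor combination
# (rounded to guard against floating-point noise)
combo <- interaction(d$HAB, d$BS, drop = TRUE)
n_distinct <- tapply(round(d$p_hat, 10), combo,
                     function(p) length(unique(p)))
print(n_distinct)
```

Every combination should report exactly one distinct value, so the number of different probabilities can never exceed the number of predictor combinations present in the data.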
