[R] nontabular logistic regression
Gavin Simpson
gavin.simpson at ucl.ac.uk
Fri Oct 13 18:23:54 CEST 2006
On Fri, 2006-10-13 at 09:28 -0500, Jeffrey Stratford wrote:
> Hi. I'm attempting to fit a logistic/binomial model so I can determine
> the influence of landscape on the probability that a box gets used by a
> bird. I've looked at a few sources (MASS text, Dalgaard, Fox and
> google) and the examples are almost always based on tabular predictor
> variables. My data, however are not. I'm not sure if that is the
> source of the problems or not because the one example that includes a
> continuous predictor looks to be coded exactly the same way. Looking at
> the output, I get estimates for each case when I should get a single
> estimate for purbank. Any suggestions?
>
> Many thanks,
>
> Jeff
Hi Jeff,
Using the snippet of data you provided (copied into a text file and
read in with read.table), the model fitted fine for me:
box.use <- read.table("~/tmp/tmp.txt", header = TRUE)
box.use
str(box.use)
'data.frame':   8 obs. of  10 variables:
 $ box        : int  1 2 3 4 5 6 7 8
 $ use        : int  1 1 1 1 0 1 1 0
 $ purbank    : num  0.00381 0.04429 0.04459 0.06072 0.60810 ...
 $ purban2    : num  0.0268 0.1611 0.0604 0.2081 0.6980 ...
 $ purban1    : num  0.069 0.172 0.000 0.069 0.690 ...
 $ pgrassk    : num  0.3282 0.1534 0.1628 0.0194 0.0317 ...
 $ pgrass2    : num  0.685 0.383 0.557 0.000 0.128 ...
 $ pgrass1    : num  0.759 0.655 0.759 0.000 0.241 ...
 $ grassdist  : num  0 0 0 323 30 ...
 $ grasspatchk: num  3.730 1.023 0.961 0.228 0.263 ...
Now I don't like attach, and you just don't need it, so I deviate a
little here: I modify box.use$use directly and use the data argument
of glm. Also, the snippet you posted has no missing values, so I can't
tell whether the response or a predictor is missing in the full data
set, or whether your na.omit is needed at all - I don't use it below.
box.use$use <- factor(box.use$use, levels=0:1)
levels(box.use$use) <- c("unused", "used")
box.use
str(box.use)
glm1 <- glm(use ~ purbank, data = box.use, family = binomial())
summary(glm1)
Call:
glm(formula = use ~ purbank, family = binomial(), data = box.use)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-1.61450  -0.03098   0.31935   0.45888   1.39194

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept)    3.223      2.225   1.448    0.147
purbank       -6.129      4.773  -1.284    0.199

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 8.9974  on 7  degrees of freedom
Residual deviance: 6.5741  on 6  degrees of freedom
AIC: 10.574

Number of Fisher Scoring iterations: 5
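I suspect something got messed up when you read in the data and R
thought purbank was a factor or character. The coefficient names in
your output (purbank0, purbank0.001173365, ...) are one per distinct
value, which is what glm produces for a factor predictor. Always check
your data after reading it in; str() is your friend here, as printed
representations are not always what they seem.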
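A minimal sketch of that check and one way to repair it, assuming the
problem really is purbank being stored as a factor (I can't confirm
that from the post). It uses the first three columns of the posted
snippet; the objects bad, fixed and glm_fixed are names made up for
this illustration, and the conversion to a factor is done by hand
purely to mimic a misread column:

```r
# First three columns of the data snippet from the original post
box.use <- data.frame(
  box = 1:8,
  use = factor(c(1, 1, 1, 1, 0, 1, 1, 0), levels = 0:1,
               labels = c("unused", "used")),
  purbank = c(0.003813435, 0.04429451, 0.04458785, 0.06072162,
              0.6080962, 0.6060428, 0.3807568, 0.3649164)
)

# Hypothetical: mimic a misread column by turning purbank into a factor
bad <- box.use
bad$purbank <- factor(bad$purbank)

str(bad)                  # purbank is reported as a Factor, not num
is.numeric(bad$purbank)   # FALSE

# A factor predictor expands to one dummy column per level after the first,
# hence one coefficient per distinct value
ncol(model.matrix(~ purbank, data = bad))      # 8
ncol(model.matrix(~ purbank, data = box.use))  # 2

# Convert back via character; as.numeric() directly on a factor would
# return the internal level codes instead of the values
fixed <- bad
fixed$purbank <- as.numeric(as.character(fixed$purbank))

# With a numeric predictor there is a single slope for purbank
glm_fixed <- glm(use ~ purbank, data = fixed, family = binomial())
coef(glm_fixed)
```

If the column turns out to be a factor in the real data, it is worth
looking at levels() of that column before converting, since a stray
non-numeric entry in the file is a common reason for the misreading
and will become NA after conversion.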
HTH
G
>
>
> THE DATA: (200 boxes total, used [0 if unoccupied, 1 occupied], the rest
> are landscape variables).
>
> box use purbank purban2 purban1 pgrassk pgrass2 pgrass1 grassdist grasspatchk
> 1 1 0.003813435 0.02684564 0.06896552 0.3282487 0.6845638 0.7586207 0 3.73
> 2 1 0.04429451 0.1610738 0.1724138 0.1534174 0.3825503 0.6551724 0 1.023261
> 3 1 0.04458785 0.06040268 0 0.1628043 0.557047 0.7586207 0 0.9605769
> 4 1 0.06072162 0.2080537 0.06896552 0.01936052 0 0 323.1099 0.2284615
> 5 0 0.6080962 0.6979866 0.6896552 0.03168084 0.1275168 0.2413793 30 0.2627027
> 6 1 0.6060428 0.6107383 0.3448276 0.04077442 0.2885906 0.4482759 30 0.2978571
> 7 1 0.3807568 0.4362416 0.6896552 0.06864183 0.03355705 0 94.86833 0.468
> 8 0 0.3649164 0.3154362 0.4137931 0.06277501 0.1275168 0 120 0.4585714
>
> THE CODE:
>
> box.use<- read.csv("c:\\eabl\\2004\\use_logistic2.csv", header=TRUE)
> attach(box.use)
> box.use <- na.omit(box.use)
> use <- factor(use, levels=0:1)
> levels(use) <- c("unused", "used")
> glm1 <- glm(use ~ purbank, binomial)
>
> THE OUTPUT:
>
> Coefficients:
>                      Estimate Std. Error   z value Pr(>|z|)
> (Intercept)        -4.544e-16  1.414e+00 -3.21e-16    1.000
> purbank0            2.157e+01  2.923e+04     0.001    0.999
> purbank0.001173365  2.157e+01  2.067e+04     0.001    0.999
> purbank0.001466706  2.157e+01  2.923e+04     0.001    0.999
> purbank0.001760047  6.429e-16  2.000e+00  3.21e-16    1.000
> purbank0.002346729  2.157e+01  2.923e+04     0.001    0.999
> purbank0.003813435  2.157e+01  2.923e+04     0.001    0.999
> purbank0.004106776  2.157e+01  2.067e+04     0.001    0.999
> purbank0.004693458  2.157e+01  2.067e+04     0.001    0.999
>
>
> ****************************************
> Jeffrey A. Stratford, Ph.D.
> Postdoctoral Associate
> 331 Funchess Hall
> Department of Biological Sciences
> Auburn University
> Auburn, AL 36849
> 334-329-9198
> FAX 334-844-9234
> http://www.auburn.edu/~stratja
>
--
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%
Gavin Simpson [t] +44 (0)20 7679 0522
ECRC & ENSIS, UCL Geography, [f] +44 (0)20 7679 0565
Pearson Building, [e] gavin.simpsonATNOSPAMucl.ac.uk
Gower Street, London [w] http://www.ucl.ac.uk/~ucfagls/cv/
London, UK. WC1E 6BT. [w] http://www.ucl.ac.uk/~ucfagls/
%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%~%