[R] warning with glm.predict, wrong number of data rows
Charles Berry
ccberry at ucsd.edu
Thu May 3 18:19:00 CEST 2012
carol white <wht_crl <at> yahoo.com> writes:
>
> Hi,
> I split a data set into two partitions (80 and 42), use the first as the
> training set in glm and the second as testing set in glm predict. But when
> I call glm.predict, I get the warning message:
>
> Warning message:
> 'newdata' had 42 rows but variable(s) found have 80 rows
> ---------------------
[snip]
The warning correctly diagnoses the problem.
The posting guide asks for a 'reproducible example', but you did not give us one.
There is one below.
Note what happens when predict() tries to reconstruct the variable 'x[1:4]'
as dictated by the formula.
How many elements can 'x[1:4]' have when newdata has (say) nrowsNew rows?
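
To make the answer concrete, here is a rough sketch of how a term such as
'x[1:4]' behaves when it is evaluated inside a data frame, which is
approximately what predict() does with newdata. The helper name
'term_length' is hypothetical and exists only for this illustration.

```r
# Evaluate the formula term 'x[1:4]' inside a data frame, roughly the way
# predict() evaluates it against newdata. 'term_length' is a hypothetical
# helper for this illustration, not part of R.
term_length <- function(newdata) {
  length(eval(quote(x[1:4]), newdata))
}

term_length(data.frame(x = 1:2))   # 4: elements 3 and 4 are NA
term_length(data.frame(x = 1:5))   # 4: the fifth row is never used
term_length(data.frame(x = 1:42))  # 4
```

Whatever the number of rows in newdata, the term always has four elements,
which is why predict() warns about the mismatch.
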
Instead of indexing inside the formula, refer to the variables by name and
use the subset argument of glm() to select a subset of observations.
> y <- sample(factor(1:2),80,repl=T)
> y <- sample(factor(1:2),5,repl=T)
> x <- 1:4
> fit <- glm( y[1:4] ~ x[1:4], family = binomial)
> fit
Call:  glm(formula = y[1:4] ~ x[1:4], family = binomial)

Coefficients:
(Intercept)       x[1:4]
 -1.110e-16    0.000e+00

Degrees of Freedom: 3 Total (i.e. Null);  2 Residual
Null Deviance:     5.545
Residual Deviance: 5.545    AIC: 9.545
> predict(fit,newdata=data.frame(x=1:2))
            1             2             3             4
-1.110223e-16 -1.110223e-16            NA            NA
Warning message:
'newdata' had 2 rows but variable(s) found have 4 rows
> predict(fit,newdata=data.frame(x=1:5))
            1             2             3             4
-1.110223e-16 -1.110223e-16 -1.110223e-16 -1.110223e-16
Warning message:
'newdata' had 5 rows but variable(s) found have 4 rows
>
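
For comparison, a minimal sketch of the subset approach, assuming all 122
observations sit in one data frame. The data frame 'dat', the simulated
values, and the index vectors 'train' and 'test' are made up for
illustration and are not from the original post.

```r
# Hypothetical data: 122 rows, split 80 / 42 as in the original question.
set.seed(1)
dat <- data.frame(x = rnorm(122))
dat$y <- factor(rbinom(122, 1, plogis(0.5 * dat$x)))

train <- 1:80    # rows used for fitting
test  <- 81:122  # rows held out for prediction

# Name the columns in the formula and choose the rows with 'subset',
# rather than writing y[train] ~ x[train].
fit <- glm(y ~ x, family = binomial, data = dat, subset = train)

# predict() looks up 'x' in newdata, so there is one prediction per new row.
pred <- predict(fit, newdata = dat[test, ], type = "response")
length(pred)  # 42
```

Because the formula mentions only 'x' and 'y', the variable that predict()
rebuilds from newdata has as many elements as newdata has rows, and the
warning should not appear.
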
HTH,
Chuck
[rest deleted]