[R] problem with 'predict'
djmuser at gmail.com
Mon Jul 11 19:22:18 CEST 2011
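The data frame you submit as newdata = to predict() has to have the
same variables as the right hand side of the model formula. For
example, if the model has covariates x1, x2, x3, then the data frame
you create as the newdata has to consist of columns named x1, x2, x3.
In your call the data frame has a single column named newdata, while
the model was fit on a variable named data1. predict() does not find
data1 in newdata, takes the original 306 values from your workspace
instead, and issues the warning you saw.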
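
As a rough illustration of the naming rule, one way to see which
column names a fitted model expects in newdata is to pull them from
its terms. The data below are simulated and the names d, y, fit,
needed and nd are made up for this sketch; only the x1, x2, x3
example comes from the text above.

```r
# Simulated data; d, y, fit, needed and nd are hypothetical names.
set.seed(1)
d <- data.frame(x1 = rnorm(50), x2 = rnorm(50), x3 = rnorm(50))
d$y <- rbinom(50, 1, 0.5)

fit <- glm(y ~ x1 + x2 + x3, data = d, family = binomial)

# Variables on the right hand side of the model formula
needed <- all.vars(delete.response(terms(fit)))
needed

# A newdata frame whose columns carry exactly those names
nd <- data.frame(x1 = c(-1, 0, 1), x2 = 0, x3 = 0)
setdiff(needed, names(nd))   # character(0) when nothing is missing

p <- predict(fit, newdata = nd, type = "response")
length(p)                    # one prediction per row of nd
```

If setdiff() returns any names, predict() will look for those
variables outside newdata, which is the situation that produces the
"rows" warning.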
Another point is that you should combine all the variables into a
data frame if you intend to use the predict() method, something like
mdata <- data.frame(flags, data1)
fit1 <- glm(flags ~ ., data = mdata, family = binomial)
The data frame passed as newdata then has to use the same variable
names as the predictor columns of mdata. If data1 is a single vector,
that means one column named data1.
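
A minimal sketch of the whole sequence, assuming data1 is a single
numeric vector. The values below are simulated stand-ins, not the
poster's data, and the group sizes (150 and 156) are chosen only so
that there are 306 observations as in the question.

```r
# Simulated stand-ins for the poster's objects (assumed, not real data)
set.seed(42)
data1 <- c(rnorm(150, mean = 180, sd = 40),   # "patients"
           rnorm(156, mean = 120, sd = 40))   # "controls"
flags <- c(rep(1, 150), rep(0, 156))

# Fit from a data frame so predict() can match columns by name
mdata <- data.frame(flags, data1)
fit1 <- glm(flags ~ ., data = mdata, family = binomial)

# The column must be called data1, the name the model was fit with
new.data <- data.frame(data1 = seq(0, 300, 10))
new.p <- predict(fit1, newdata = new.data, type = "response")

length(new.p)                # 31, one value per row of new.data
cbind(new.data, p = new.p)   # predicted probability at each grid value
```

With the column named data1, the predictions follow the grid in
new.data instead of the 306 original observations, and the warning
goes away.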
On Mon, Jul 11, 2011 at 8:51 AM, Meesters, Christian <meesters at aesku.com> wrote:
> I would like to tabulate the likelihood for an affection. For this, I retrieve indices of affected people and controls for my data set and proceed as follows:
> flags <- c(rep(1, length(patient_indices)), rep(0, length(control_indices)))
> # dataset is a data.frame and param the parameter to be analysed:
> data1 <- dataset[,param][c(patient_indices, control_indices)]
> fit1 <- glm(flags ~ data1, family = binomial)
> new.data <- seq(0, 300, 10)
> new.p <- predict(fit1, data.frame(newdata = new.data), type = "response")
> Which then gives values that do not depend on new.data, and a warning which reads
> "Warning message:
> 'newdata' had 31 rows but variable(s) found have 306 rows"
> In a similar script new.p were data ranging from 1 to 31 with the cumulative likelihood associated with them. Now new.p looks a bit like random numbers assigned to a list ranging from 1 to 306. (306 is the number of datapoints in data1.) Unfortunately I am unable to spot the difference of the two scripts.
> I would appreciate any pointer on my mistake (and hope that my problem was understandable).