[R] problem with 'predict'

Mon Jul 11 17:51:53 CEST 2011

Hi,

I would like to tabulate the likelihood of an affection (that is, of being a patient rather than a control) as a function of one parameter. To do this, I take the indices of the affected people and of the controls in my data set and proceed as follows:

# 1 = patient, 0 = control
flags    <- c(rep(1, length(patient_indices)), rep(0, length(control_indices)))
# dataset is a data.frame and param the parameter to be analysed:
data1    <- dataset[, param][c(patient_indices, control_indices)]
fit1     <- glm(flags ~ data1, family = binomial)
new.data <- seq(0, 300, 10)
new.p    <- predict(fit1, data.frame(newdata = new.data), type = "response")

This then gives values that do not depend on new.data, together with a warning that reads:
"Warning message:
'newdata' had 31 rows but variable(s) found have 306 rows"

In a similar script, new.p contained 31 values, indexed from 1 to 31, with the cumulative likelihood associated with each element of new.data. Now new.p looks like more or less random numbers in a vector indexed from 1 to 306 (306 is the number of data points in data1). Unfortunately, I am unable to spot the difference between the two scripts.
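
A minimal sketch of what may be going on, using simulated data rather than my real data set (the group sizes, means and seed below are made up for illustration). predict() looks up the variables named in the model formula, here data1, among the columns of newdata. A data frame whose only column is called newdata has no data1 column, so the original data1 vector with its 306 values is used instead:

```r
set.seed(1)

# Simulated stand-ins for the real data (hypothetical values, 306 points in total)
data1 <- c(rnorm(150, mean = 180, sd = 40),   # "patients"
           rnorm(156, mean = 120, sd = 40))   # "controls"
flags <- c(rep(1, 150), rep(0, 156))

fit1     <- glm(flags ~ data1, family = binomial)
new.data <- seq(0, 300, 10)                   # 31 values

# Column is named 'newdata': no 'data1' column is found in the data frame,
# so the original data1 is used (the warning is suppressed here for brevity)
p.wrong <- suppressWarnings(
  predict(fit1, data.frame(newdata = new.data), type = "response")
)
length(p.wrong)

# Column is named 'data1', matching the variable in the formula
p.right <- predict(fit1, data.frame(data1 = new.data), type = "response")
length(p.right)
```

If this is indeed the cause, the first call returns one value per original data point and the second one value per element of new.data.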

I would appreciate any pointer to my mistake (and hope that my problem is understandable).

TIA
Christian