[R] problem with 'predict'

Mon Jul 11 19:50:28 CEST 2011

On Jul 11, 2011, at 11:51 AM, Meesters, Christian wrote:

> Hi,
>
> I would like to tabulate the likelihood for an affection. For this,  
> I retrieve indices of affected people and controls for my data set  
> and proceed as follows:
>
> flags <- c(rep(1, length(patient_indices)), rep(0, length(control_indices)))
> # dataset is a data.frame and param the parameter to be analysed:
> data1  <- dataset[,param][c(patient_indices, control_indices)]
> fit1 <- glm(flags ~ data1, family = binomial)
> new.data    <- seq(0, 300, 10)
> new.p   <- predict(fit1, data.frame(newdata = new.data), type = "response")

That should (probably) have been the call below. The column names in the
data frame passed as newdata need to match the names of the RHS variables
in the model formula exactly:

new.p   <- predict(fit1, newdata = data.frame(data1 = new.data), type = "response")

(Obviously untested.)
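
A minimal sketch of why the column name matters, using simulated data as a
stand-in for the original data set (the values, the seed, and the name bad.p
are assumptions made for illustration only). When the data frame has no
column called data1, predict() falls back to the data1 vector in the
workspace, which is where the 306 rows come from:

```r
# Simulated stand-in for the poster's data (hypothetical values).
set.seed(1)
data1 <- runif(306, 0, 300)
flags <- rbinom(306, 1, plogis((data1 - 150) / 40))

fit1     <- glm(flags ~ data1, family = binomial)
new.data <- seq(0, 300, 10)   # 31 values

# Column is named 'newdata', so 'data1' is not found in the data frame.
# predict() then uses the 306-element 'data1' from the workspace and warns;
# the warning is suppressed here only to keep the example quiet.
bad.p <- suppressWarnings(
  predict(fit1, data.frame(newdata = new.data), type = "response")
)
length(bad.p)

# Column is named 'data1', matching the RHS of the formula, so the
# predictions are made at the 31 new values.
new.p <- predict(fit1, newdata = data.frame(data1 = new.data), type = "response")
length(new.p)
```

With the matching column name, new.p has one fitted probability per element
of new.data.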
>
> Which then gives values that do not depend on new.data, and a warning  
> which reads
> "Warning message:
> 'newdata' had 31 rows but variable(s) found have 306 rows"
>
> In a similar script new.p were data ranging from 1 to 31 with the  
> cumulative likelihood associated with them. Now new.p looks a bit  
> like random numbers assigned to a list ranging from 1 to 306. (306  
> is the number of datapoints in data1.) Unfortunately I am unable to  
> spot the difference of the two scripts.
>
> I would appreciate any pointer on my mistake (and hope that my  
> problem was understandable).
>
> TIA
> Christian

David Winsemius, MD
West Hartford, CT