[R] diagnostic functions to assess fitted ols() model: Confidence is too narrow?!

Frank E Harrell Jr f.harrell at vanderbilt.edu
Sat Dec 17 14:03:40 CET 2005


Jan Verbesselt wrote:
> Dear all,
> 
> When fitting an "ols.model", the confidence interval at 95% doesn't cover
> the plotted data points because it is very narrow.
> 
> Does this mean that the model is 'overfitted' or is there a specific amount
> of serial correlation in the residuals?
> 
> Which R functions can be used to evaluate (diagnostics) major model
> assumptions (residuals, independence, variance) when fitting ols models in
> the Design package?
> 
> Regards,
> Jan

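Confidence intervals for means are not supposed to cover the data 
points.  They describe the uncertainty in the estimated mean of Y at a 
given X, and their width shrinks to zero as the sample size goes to 
infinity.  Intervals requested with conf.type='individual' are for 
individual observations and should cover the majority of data points 
(roughly the nominal proportion, e.g. about 80% for conf.int=.80).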
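
The distinction can be checked with a small simulation.  What follows is 
only a sketch in base R: it uses lm() and predict() rather than ols() and 
plot() from Design, and the data, seed and variable names are made up for 
illustration.

```r
# Simulated data (hypothetical; not the poster's DATA)
set.seed(1)
n <- 200
x <- runif(n, 0, 10)
y <- 2 + 0.5 * x + rnorm(n, sd = 2)

fit <- lm(y ~ x)

# 95% interval for the mean of y at each observed x
mean_int <- predict(fit, interval = "confidence", level = 0.95)

# 95% interval for an individual observation at each observed x
# (predict.lm warns that these refer to future responses; suppressed here)
indiv_int <- suppressWarnings(
  predict(fit, interval = "prediction", level = 0.95)
)

# Fraction of the observed points falling inside each band
cover_mean  <- mean(y >= mean_int[, "lwr"]  & y <= mean_int[, "upr"])
cover_indiv <- mean(y >= indiv_int[, "lwr"] & y <= indiv_int[, "upr"])

cat("covered by interval for the mean:  ", cover_mean, "\n")
cat("covered by interval for individual:", cover_indiv, "\n")
```

The band for the mean should cover only a small fraction of the points, 
and that fraction falls further as n grows, while the band for individual 
observations should cover close to the nominal 95%.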

You can see the case study on ols in my book for examples of 
diagnostics.  See biostat.mc.vanderbilt.edu/rms
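
As a rough starting point only (this is not the case study from the book), 
the usual residual checks can be sketched in base R.  The example assumes 
simulated data and uses lm() with splines::ns(x, df = 2) as a stand-in 
that is only roughly analogous to rcs(X, 3); the knot placement differs.

```r
# Simulated data (hypothetical; stands in for the poster's X and Y)
set.seed(2)
n <- 150
x <- runif(n, 0, 10)
y <- sin(x / 2) + rnorm(n, sd = 0.3)

# Natural cubic spline with 2 df, standing in for rcs(X, 3)
fit <- lm(y ~ splines::ns(x, df = 2))

r <- resid(fit)   # residuals
f <- fitted(fit)  # fitted values

op <- par(mfrow = c(1, 3))

# Constant variance / remaining non-linearity: look for trends or funnels
plot(f, r, xlab = "Fitted value", ylab = "Residual")
abline(h = 0, lty = 2)

# Normality of residuals
qqnorm(r)
qqline(r)

# Serial correlation: only meaningful if rows are in time order
acf(r, main = "ACF of residuals")

par(op)

# Lag-1 autocorrelation of the residuals as a single number
lag1 <- acf(r, plot = FALSE)$acf[2]
cat("lag-1 autocorrelation of residuals:", lag1, "\n")
```

For data collected over time, the residuals should be in observation 
order before acf() is applied; the simulated rows here are independent, 
so no serial correlation is expected.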

Frank Harrell

> 
> # -->OLS regression
>     library(Design)
>     ols.1 <- ols(Y~rcs(X,3), data=DATA, x=TRUE, y=TRUE)
>     summary.lm(ols.1)  # --> non-linearity is significant
>     anova(ols.1)
>     
>     d <- datadist(Y,X)
>     options(datadist="d")  
>     plot(ols.1)
>     #plot(ols.1, conf.int=.80, conf.type=c('individual'))
>     points(X,Y)
>     scat1d(X, tfrac=.2)
> 
> With the 'individual' confidence type, the interval looks normal:
> #plot(ols.1, conf.int=.80, conf.type=c('individual'))
> 
> Workstation Windows XP
> // R version 2.2 //


-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University



