[R] rcs fits in design package

Frank E Harrell Jr f.harrell at vanderbilt.edu
Wed Sep 30 22:01:25 CEST 2009


Hayes, Rachel M wrote:
> Hi all,
> 
> I have a vector of proportions (post_op_prw) such that
> 
> > summary(amb$post_op_prw)
>    Min. 1st Qu.  Median    Mean 3rd Qu.    Max.    NA's 
>  0.0000  0.0000  0.0000  0.3985  0.9134  0.9962  1.0000 
> 
> > summary(cut2(amb$post_op_prw,0.0001))
> [0.0000,0.0001) [0.0001,0.9962]            NA's 
>            1904            1672               1
> 
> I want to use post_op_prw as a predictor variable in an OLS model.  I
> decided to fit it using a restricted cubic spline.  But, I'm seeing
> behavior I don't understand.  See below:
> 
> > rcspline.eval(amb$post_op_prw,nk = 3, knots.only = T)
> [1] 0.0000000 0.6147927 0.9092937 0.9667178
> Warning message:
> In rcspline.eval(amb$post_op_prw, nk = 3, knots.only = T) :
>   could not obtain 3 knots with default algorithm.
>  Used alternate algorithm to obtain 4 knots
> 
> > rcspline.eval(amb$post_op_prw,nk = 4, knots.only = T)
> [1] 0.0000000 0.8476793 0.9783558
> 
> > rcspline.eval(amb$post_op_prw,nk = 5, knots.only = T)
> [1] 0.0000000 0.9012711 0.9783558
> 
> Why are the 4 and 5 knot spline requests returning a spline with 3
> knots?  I get the best model results using rcs(amb$post_op_prw,3).   I'm
> kind of new to using splines.  Does the fact that observations are
> clustered at the ends make the spline fit questionable?  

Yes, or at least it makes the choice of knots questionable.  For that 
type of variable with many ties I tend to use a quadratic effect 
(pol(x,2) in Design or rms packages).
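
To illustrate why the knot choice breaks down here: the default knots are placed at fixed quantiles of the predictor (0.10, 0.50, 0.90 for 3 knots; 0.05, 0.35, 0.65, 0.95 for 4; 0.05, 0.275, 0.50, 0.725, 0.95 for 5). With roughly 53% of the values equal to zero, every quantile up to the median is 0, so several requested knots land on the same value. The sketch below is base R only and runs on simulated data that reuses just the counts from the question. default_knots is a hypothetical helper written for this illustration, not a function from Hmisc, and it assumes the quantile positions listed above.

```r
# Simulated stand-in for post_op_prw: 1904 zeros plus 1672 positive values.
# The real data are not available; only the two counts come from the post.
set.seed(1)
x <- c(rep(0, 1904), runif(1672, min = 0.0001, max = 0.9962))

# Hypothetical helper (not part of Hmisc): candidate knots at the
# assumed default quantiles, with tied values collapsed.
default_knots <- function(x, nk) {
  outer <- if (nk > 3) 0.05 else 0.10
  p <- seq(outer, 1 - outer, length.out = nk)
  unique(unname(quantile(x, p, na.rm = TRUE)))
}

# Number of distinct knots actually available for nk = 3, 4, 5
n_knots <- sapply(3:5, function(nk) length(default_knots(x, nk)))
names(n_knots) <- paste0("nk=", 3:5)
print(n_knots)
```

On data shaped like this, the requests for 4 and 5 knots each leave three distinct values, and the request for 3 knots leaves only two, which is presumably what triggers the "alternate algorithm" warning quoted below.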

Frank
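
A minimal sketch of the quadratic alternative mentioned above, again on simulated data. It uses base R lm() with an explicit squared term as a stand-in. In Design or rms the corresponding model would be written with pol(x, 2), for example something like ols(y ~ pol(x, 2)). The response y and its coefficients are made up purely for illustration.

```r
# Simulated predictor with the same heavy clustering at zero as in the post
set.seed(2)
x <- c(rep(0, 1904), runif(1672, min = 0.0001, max = 0.9962))

# Made-up response with a curved dependence on x (illustration only)
y <- 1 + 2 * x - 1.5 * x^2 + rnorm(length(x), sd = 0.5)

# Quadratic effect: linear plus squared term, no knots to choose
fit <- lm(y ~ x + I(x^2))
print(coef(fit))
```

The quadratic needs no knot locations, so the ties at zero do not affect how the curve is specified.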

> 
> Thanks,
> 
> Rachel Hayes


-- 
Frank E Harrell Jr   Professor and Chair           School of Medicine
                      Department of Biostatistics   Vanderbilt University