[R] Error in robdist.hbrfit(x) : x is probably collinear

varin sacha
Tue Mar 31 00:03:24 CEST 2020


Hi Bert,

Many thanks for your response. 

Yes, my toy example is not representative of the real data. You are right: without collinearity, my R code works.
As you say, there may be overfitting and there must be collinearity, although to me collinearity is more a nuisance than a real problem (except when it is perfect). What I mean is that under high but imperfect collinearity the estimators keep their usual properties (for least squares, unbiased and minimum variance among linear unbiased estimators); what happens is that their variance is inflated.
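
To make the variance-inflation point concrete, here is a minimal base-R sketch on simulated data (not the data from this thread). The helper vif_manual() is a hypothetical name introduced only for illustration; it is not part of hbrfit or any other package. It computes variance inflation factors by regressing each predictor on the others, and the last lines show how perfect collinearity appears as a rank-deficient design matrix.

```r
# Hypothetical helper: variance inflation factor of each column of X,
# computed as 1 / (1 - R^2) from regressing that column on the others.
vif_manual <- function(X) {
  X <- as.data.frame(X)
  sapply(names(X), function(v) {
    others <- setdiff(names(X), v)
    r2 <- summary(lm(reformulate(others, response = v), data = X))$r.squared
    1 / (1 - r2)
  })
}

set.seed(1)
n  <- 200
x1 <- rnorm(n)
x2 <- x1 + rnorm(n, sd = 0.1)  # nearly collinear with x1
x3 <- rnorm(n)                 # unrelated to x1 and x2
X  <- data.frame(x1, x2, x3)

# Large values for x1 and x2, a value near 1 for x3
vifs <- vif_manual(X)
print(round(vifs, 1))

# Perfect collinearity: x4 is an exact linear combination of x1 and x3,
# so the design matrix has fewer independent columns than columns.
X$x4   <- X$x1 + X$x3
design <- model.matrix(~ ., data = X)
rank_deficient <- qr(design)$rank < ncol(design)
print(rank_deficient)
```

High but imperfect collinearity only inflates the factors; the exact linear dependence is the case where a fit can no longer separate the coefficients at all.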

Anyway, I would like to know whether it is possible to overcome this collinearity problem when using this R code (hbrfit). That is my question, and I think it belongs on this list because it is about R programming rather than statistical concerns about overfitting and/or collinearity...
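
On the programming side, one possible workaround, if the error is raised only for some bootstrap resamples, is to catch it inside the statistic function and return NA for that replicate. The sketch below is a hedged illustration, not a fix for collinearity itself: safe_stat(), fit_fun and strict_lm() are hypothetical names, and strict_lm() is only a stand-in that mimics a fitter that stops on a rank-deficient design. It is not hbrfit, and the actual check inside robdist.hbrfit may differ.

```r
# Hypothetical bootstrap statistic: returns the MSE of the fit on the
# resampled rows, or NA if the fitting function signals an error.
safe_stat <- function(data, indices, formula, fit_fun) {
  d <- data[indices, , drop = FALSE]
  tryCatch({
    fit <- fit_fun(formula, data = d)
    y   <- model.response(model.frame(formula, data = d))
    mean((y - fit$fitted.values)^2)
  }, error = function(e) NA_real_)
}

# Stand-in fitter (assumption, NOT hbrfit): stops when the design matrix
# of the resample is rank deficient, otherwise fits by least squares.
strict_lm <- function(formula, data) {
  X <- model.matrix(formula, data)
  if (qr(X)$rank < ncol(X)) stop("x is probably collinear")
  lm(formula, data = data)
}

set.seed(42)
toy <- data.frame(y = rnorm(8), x = rnorm(8),
                  g = c(0, 0, 0, 0, 0, 0, 0, 1))

# Manual resampling loop; any resample that misses row 8 makes g constant,
# hence collinear with the intercept, and is recorded as NA.
stats <- replicate(200, safe_stat(toy, sample(nrow(toy), replace = TRUE),
                                  y ~ x + g, strict_lm))
n_failed <- sum(is.na(stats))
print(n_failed)
print(mean(stats, na.rm = TRUE))
```

The same function could be handed to boot() through its extra arguments, for example boot(data = newdata, statistic = safe_stat, R = 100, formula = crp ~ bmi+glucose+age+sex, fit_fun = hbrfit). Note that discarding failed replicates restricts the bootstrap to well-conditioned resamples, so the resulting interval should be read with that caveat and the number of failures reported.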

Best,

On Monday, March 30, 2020 at 22:00:39 UTC+2, Bert Gunter <bgunter.4567 using gmail.com> wrote:

I know nothing about the packages in question, but do you know what
"collinear" means and how/why it can mess up model fitting? If not,
that may be the problem. See also "overfitting."

Bert Gunter

"The trouble with having an open mind is that people keep coming along
and sticking things into it."
-- Opus (aka Berkeley Breathed in his "Bloom County" comic strip )

On Mon, Mar 30, 2020 at 12:10 PM varin sacha via R-help
<r-help using r-project.org> wrote:
>
> Hi,
>
> A Google search did not give me any hints :=(
> Maybe somebody can help me fix the error message I get: Error in robdist.hbrfit(x) : x is probably collinear
>
>
> # # # # # # # # # # # # # # # # # # # ## # # # # # # # # # # # # # #
> bmi=c(23,43,21,23,45,65,45,11,12,13,23,34,NA,NA,34,35,45,65,43,23,12,11,15,43,23,88,78,79,89,89,99,43,21,34,32,45,65,76,56,45,34,23,12,32)
> glucose=c(NA,12,23,11,12,21,23,21,23,43,23,12,NA,23,11,12,32,12,14,12,11,10,9,8,9,8,7,90,76,32,12,11,12,23,11,123,32,12,14,34,54,65,76,87)
> crp=c(123,212,154,342,123,111,121,765,453,123,213,211,NA,NA,32,123,213,145,143,123,132,143,165,176,181,123,87,567,342,123,143,132,143,234,345,32,123,132,143,345,321,543,231,123)
> age=c(67,45,34,56,87,NA,NA,23,18,65,45,87,65,33,23,65,43,23,43,12,132,56,76,87,98,78,76,56,78,54,34,56,76,99,12,45,34,65,76,87,98,97,64,53)
> sex=c(0,1,1,0,1,0,1,0,0,1,1,1,NA,NA,1,1,1,1,1,1,1,1,1,1,0,0,0,0,0,0,0,0,0,0,1,1,1,0,0,0,1,1,0,0)
>
> Dataset=data.frame( bmi,glucose,crp,age,sex)
> newdata=na.omit(Dataset)
>
> install.packages("boot", dependencies = TRUE)
> install.packages("remotes")
> remotes::install_github("kloke/hbrfit", force = TRUE)
> install.packages("quantreg", dependencies = TRUE)
>
> library(boot)
> library(hbrfit)
> library(quantreg)
>
>  # function to obtain MSE
>  MSE <- function(data, indices, formula) {
>    d <- data[indices, ] # allows boot to select sample
>    fit <- hbrfit(formula, data = d)
>    ypred <- fit$fitted.values
>    mean((d[["crp"]]-ypred)^2)
>  }
>
> # bootstrapping with 100 replications
> results <- boot(data = newdata, statistic = MSE, R = 100, formula = crp ~ bmi+glucose+age+sex)
>
> str(results)
> boot.ci(results, type="norm" )
> # # # # # # # # # # # # # # # # # # # ## # # # # # # # # # # # # # #