[R] generalized linear regression - function glm - number of

David Winsemius dwinsemius at comcast.net
Thu Nov 18 17:37:10 CET 2010


On Nov 18, 2010, at 11:00 AM, Christine SINOQUET wrote:

> Hello,
>
> Performing a linear regression through the function glm ("yi ~ X$V1  
> + X$V2 + X$V3 + X$V4 + X$V5 + X$V6 + X$V7 + X$V8 + X$V9 + X$V10"), I  
> then print the information about the coefficients:
>
> print(coefficients(summary(fit)))
>
> I note that the number of coefficients (7) is lower than the number  
> of predictors (10).
> In this case, I work on simulated data for which I forced yi to be a  
> linear function of the 10 predictors.
>

What code was used to make the simulation?

> intercept: 0.0180752965003802
> predictor 1: -0.0111046268531608
> predictor 2: -0.0185366138753851
> predictor 3: 0.107341157096227
> predictor 4: 0.00162924662836275
> predictor 5: 0.00162924629403743
> predictor 6: -0.0171999854554059
> predictor 7: -0.0171999856835917
> predictor 8: -0.057207682945982
> predictor 9: -0.0171999856239631
> predictor 10: 0.134643228957395
>
>
> "yi ~ X$V1 + X$V2 + X$V3 + X$V4 + X$V5 + X$V6 + X$V7 + X$V8 + X$V9 +  
> X$V10"
>               Estimate   Std. Error       t value Pr(>|t|)
> (Intercept)  0.018062134 5.624517e-17  3.211322e+14        0
> X$V1        -0.011104627 3.084989e-17 -3.599567e+14        0
> X$V2        -0.018536614 3.241635e-17 -5.718291e+14        0
> X$V3         0.107341157 4.884358e-17  2.197651e+15        0
> X$V4         0.003258493 3.286878e-17  9.913643e+13        0
> X$V6        -0.051599957 4.203840e-17 -1.227448e+15        0
> X$V8        -0.057207683 3.049835e-17 -1.875763e+15        0
> X$V10        0.134643229 3.849911e-17  3.497308e+15        0
>
>
> I am sure I have regressed on the right number of variables, since I  
> checked that the formula is correct:
> "yi ~ X$V1 + X$V2 + X$V3 + X$V4 + X$V5 + X$V6 + X$V7 + X$V8 + X$V9 +  
> X$V10"
>
> Could somebody explain to me
> 1) why there are mismatches between the "true" and the estimated  
> coefficients for predictors 4 and 6
> and

Your std errors are incredibly small (on the order of 1e-17, effectively  
zero from a numerical perspective), suggesting you have created a  
dataset with essentially no noise. The coefficients differ from what  
you expected because of the answer to the next question.
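
To illustrate the point about the standard errors, here is a minimal
sketch, not the original poster's simulation (which was not shown). The
seed, sample size, number of predictors and "true" coefficients are all
made up. It builds yi as an exact linear function of the predictors,
with no error term, and shows that glm then reports standard errors at
the level of floating-point round-off.

```r
set.seed(1)                          # arbitrary seed, for reproducibility only
n <- 100                             # assumed sample size
X <- as.data.frame(matrix(rnorm(n * 3), ncol = 3))   # columns V1, V2, V3
beta <- c(0.5, -0.2, 0.1)            # made-up "true" coefficients

# yi is an exact linear function of the predictors: no noise is added
yi <- as.vector(0.02 + as.matrix(X) %*% beta)

fit <- glm(yi ~ V1 + V2 + V3, data = X)   # default family is gaussian

# The standard errors are effectively zero
se <- coefficients(summary(fit))[, "Std. Error"]
print(se)
```

Adding a noise term such as rnorm(n, sd = 0.1) to yi should bring the
standard errors back to a sensible magnitude.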

> 2) why is no information printed for predictors 5, 7 and 9?

You most likely had each of those set up as a linear combination of  
the retained predictors. Collinear variables are dropped, and usually  
there is a warning, but since you have not given a console session I  
cannot be sure.
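
The posted numbers are consistent with this. The estimate for X$V4
(0.003258493) equals the sum of the "true" coefficients for predictors 4
and 5, and the estimate for X$V6 (-0.051599957) equals the sum of those
for predictors 6, 7 and 9. That is what one would expect if V5 were a
copy of V4, and V7 and V9 copies of V6, although without the simulation
code this is only an inference. A minimal sketch of the mechanism, with
made-up data and coefficients (the duplicated column below is an
assumption, not the poster's code):

```r
set.seed(2)                          # arbitrary seed, for reproducibility only
n <- 100                             # assumed sample size
X <- as.data.frame(matrix(rnorm(n * 3), ncol = 3))   # columns V1, V2, V3
X$V4 <- X$V3                         # V4 is an exact copy of V3 (assumed mechanism)

beta <- c(0.5, -0.2, 0.1, 0.1)       # made-up "true" coefficients for V1..V4
yi <- as.vector(0.02 + as.matrix(X) %*% beta + rnorm(n, sd = 0.01))

fit <- glm(yi ~ V1 + V2 + V3 + V4, data = X)

# coef() keeps the aliased term and reports it as NA
print(coef(fit))

# The summary table has no row for V4, and the estimate for V3 is
# close to 0.1 + 0.1 = 0.2, the sum of the two "true" coefficients
print(coefficients(summary(fit)))
```

Printing summary(fit) itself should also show a note that one
coefficient is not defined because of singularities.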

--

David Winsemius, MD
West Hartford, CT


