[R-sig-eco] Regression with few observations per factor level

V. Coudrain v_coudrain at voila.fr
Mon Oct 20 18:02:48 CEST 2014


Yes, but I fear the residuals behave badly as soon as the model gets a little more complex (e.g., with two covariates or an interaction). The scope for performing an ANCOVA is thus very limited. That is why I was thinking about a non-parametric model. But I do not want to artificially make my data say something they cannot support.
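
One option that keeps the covariate in the model is a permutation test for the treatment term. Below is a minimal sketch in R, assuming the Freedman-Lane scheme (permuting the residuals of the covariate-only model). The data are simulated stand-ins with the same layout as in the thread, and the names f_trt, n_perm and p_perm are made up for illustration. The test replaces the normality assumption with exchangeability of the residuals, so it does not cure unequal variances, and with four observations per level it cannot add power that the data do not contain.

```r
set.seed(1)  # arbitrary seed so the sketch is repeatable

# Simulated stand-in data (hypothetical): 4 treatments, 4 observations each
pg <- 4
my.df <- data.frame(
  var = c(rnorm(pg, mean = 3), rnorm(pg, mean = 1),
          rnorm(pg, mean = 11), rnorm(pg, mean = 30)),
  trt = factor(rep(c("trt1", "trt2", "trt3", "trt4"), each = pg)),
  cov = runif(pg * 4)
)

# F statistic for the treatment term, adjusted for the covariate
f_trt <- function(d) {
  reduced <- lm(var ~ cov, data = d)
  full    <- lm(var ~ trt + cov, data = d)
  anova(reduced, full)$F[2]
}

f_obs <- f_trt(my.df)

# Freedman-Lane: shuffle the residuals of the reduced (covariate-only)
# model, add them back to its fitted values, and recompute the statistic
reduced <- lm(var ~ cov, data = my.df)
n_perm  <- 999
f_perm  <- replicate(n_perm, {
  d <- my.df
  d$var <- fitted(reduced) + sample(resid(reduced))
  f_trt(d)
})

# Permutation p-value, counting the observed arrangement itself
p_perm <- (1 + sum(f_perm >= f_obs)) / (1 + n_perm)
p_perm
```

Here p_perm is the share of permuted F statistics that are at least as large as the observed one.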




> Message of 20/10/14 at 16:50
> From: "stephen sefick" 
> To: "Martin Weiser" 
> Cc: "V. Coudrain" , "r-sig-ecology" 
> Subject: Re: [R-sig-eco] Regression with few observations per factor level
> 
> Are you more or less performing an ANOVA/ANCOVA on your data? As pointed out earlier, all of the normal-theory regression assumptions apply. Assuming all of those are satisfied, if you have large confidence intervals and there are still significant differences between groups, I don't see why you couldn't correctly infer something about the treatments. Maybe I am missing something.
> Stephen 
> On Mon, Oct 20, 2014 at 8:43 AM, Martin Weiser  wrote:
> Hi,
> 
> coefficients and their p-values are reliable if your data are OK and you
> know enough about the process that generated them to choose an
> appropriate model. With 4 points per line, it may be really difficult to
> identify a bad fit or outliers.
> 
> For example: simple linear regression assumes that the residuals are
> drawn from a normal distribution whose variance is constant along the
> regression line. With 4 points, you can hardly check this, but if you
> know enough about the process that generated the data, you are safe.
> If you do not know, it is not easy to say anything about the nature of
> the process that generated the data.
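
How hard it is to detect unequal variances with so few points can be illustrated by simulation. The sketch below is only an illustration under assumed settings (four groups, one of which has three times the standard deviation of the others, Bartlett's test at the 5% level); detect_rate is a hypothetical helper, not something used in the thread.

```r
set.seed(7)  # arbitrary seed so the sketch is repeatable

# Share of simulated data sets in which Bartlett's test flags unequal
# variances, for a given number of observations per group (pg)
detect_rate <- function(pg, n_sim = 500) {
  g   <- factor(rep(1:4, each = pg))
  sds <- rep(c(1, 1, 1, 3), each = pg)  # assumed: one group has 3x the SD
  mean(replicate(n_sim, {
    y <- rnorm(4 * pg, sd = sds)
    bartlett.test(y, g)$p.value < 0.05
  }))
}

rates <- c(pg4 = detect_rate(4), pg30 = detect_rate(30))
rates
```

Comparing the two rates shows how much less often a real difference in variance is flagged when each group has only 4 observations.
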
> 
> If you know (or can assume) that there is a simple linear relationship,
> you can say: "the slope of this relationship is such and such", but if you
> want to estimate both the nature of the relationship ("A *linearly*
> depends on B") and its magnitude ("the slope of this relationship
> is ..."), p-values will not help you much.
> 
> Of course, I may be wrong - I am not a statistician, just a user.
> 
> Best,
> Martin W.
> 
> 
> V. Coudrain wrote on Mon 20 Oct 2014 at 13:37 +0200:
> > Thank you very much. If I get it right, the CIs get wider, my test has less power, and the probability of detecting a significant relationship decreases. What about the coefficients that are significant: are they reliable?
> >
> >
> >
> >
> > > Message of 20/10/14 at 11:30
> > > From: "Roman Luštrik"
> > > To: "V. Coudrain"
> > > Cc: "r-sig-ecology at r-project.org"
> > > Subject: Re: [R-sig-eco] Regression with few observations per factor level
> > >
> > > I think you can, but the confidence intervals will be rather large due to the small number of samples.
> > > Notice how the standard errors change when the sample size (per group) goes from 4 to 30.
> > > > pg <- 4 # pg = per group
> > > > my.df <- data.frame(var = c(rnorm(pg, mean = 3), rnorm(pg, mean = 1), rnorm(pg, mean = 11), rnorm(pg, mean = 30)),
> > > +                     trt = rep(c("trt1", "trt2", "trt3", "trt4"), each = pg),
> > > +                     cov = runif(pg*4)) # 4 groups
> > > > summary(lm(var ~ trt + cov, data = my.df))
> > >
> > > Call:
> > > lm(formula = var ~ trt + cov, data = my.df)
> > >
> > > Residuals:
> > >      Min       1Q   Median       3Q      Max
> > > -1.63861 -0.46080  0.03332  0.66380  1.27974
> > >
> > > Coefficients:
> > >             Estimate Std. Error t value Pr(>|t|)
> > > (Intercept)   1.2345     1.0218   1.208    0.252
> > > trttrt2      -0.7759     0.8667  -0.895    0.390
> > > trttrt3       7.8503     0.8308   9.449  1.3e-06 ***
> > > trttrt4      28.2685     0.9050  31.236  4.3e-12 ***
> > > cov           1.4027     1.1639   1.205    0.253
> > > ---
> > > Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> > >
> > > Residual standard error: 1.154 on 11 degrees of freedom
> > > Multiple R-squared:  0.9932, Adjusted R-squared:  0.9908
> > > F-statistic: 404.4 on 4 and 11 DF,  p-value: 7.467e-12
> > >
> > > > pg <- 30 # pg = per group
> > > > my.df <- data.frame(var = c(rnorm(pg, mean = 3), rnorm(pg, mean = 1), rnorm(pg, mean = 11), rnorm(pg, mean = 30)),
> > > +                     trt = rep(c("trt1", "trt2", "trt3", "trt4"), each = pg),
> > > +                     cov = runif(pg*4)) # 4 groups
> > > > summary(lm(var ~ trt + cov, data = my.df))
> > >
> > > Call:
> > > lm(formula = var ~ trt + cov, data = my.df)
> > >
> > > Residuals:
> > >     Min      1Q  Median      3Q     Max
> > > -2.5778 -0.6584 -0.0185  0.6423  3.2077
> > >
> > > Coefficients:
> > >             Estimate Std. Error t value Pr(>|t|)
> > > (Intercept)  2.76961    0.25232  10.977  < 2e-16 ***
> > > trttrt2     -1.75490    0.28546  -6.148 1.17e-08 ***
> > > trttrt3      8.40521    0.28251  29.752  < 2e-16 ***
> > > trttrt4     27.04095    0.28286  95.599  < 2e-16 ***
> > > cov          0.05129    0.32523   0.158    0.875
> > > ---
> > > Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1
> > >
> > > Residual standard error: 1.094 on 115 degrees of freedom
> > > Multiple R-squared:  0.9913, Adjusted R-squared:  0.991
> > > F-statistic:  3269 on 4 and 115 DF,  p-value: < 2.2e-16
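
The same comparison can be made for several group sizes at once. A small sketch, reusing the simulated design from the session above; se_for and the chosen sizes are made up for illustration, and because the data are random the numbers will differ from the output shown above.

```r
set.seed(42)  # arbitrary seed so the sketch is repeatable

# Standard errors of all coefficients for a given number of
# observations per group (pg), using the same simulated design
se_for <- function(pg) {
  d <- data.frame(
    var = c(rnorm(pg, mean = 3), rnorm(pg, mean = 1),
            rnorm(pg, mean = 11), rnorm(pg, mean = 30)),
    trt = rep(c("trt1", "trt2", "trt3", "trt4"), each = pg),
    cov = runif(pg * 4)
  )
  coef(summary(lm(var ~ trt + cov, data = d)))[, "Std. Error"]
}

sizes  <- c(4, 8, 15, 30)           # assumed group sizes to compare
se_tab <- sapply(sizes, se_for)     # one column per group size
colnames(se_tab) <- paste0("pg=", sizes)
round(se_tab, 3)
```

Each column of se_tab holds the standard errors for one group size, so reading across a row shows how the uncertainty of that coefficient changes with the number of observations.
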
> > > On Mon, Oct 20, 2014 at 10:53 AM, V. Coudrain  wrote:
> > > Hi, I would like to test the impact of a treatment on some variable using regression (e.g. lm(var ~ trt + cov)). However, I only have four observations per factor level. Is it still possible to apply a regression with such a small sample size? I think that it would be difficult to estimate the variance correctly. Do you think I should rather compute a non-parametric test such as Kruskal-Wallis? However, I need to include covariates in my models and I am not sure whether basic non-parametric tests are suitable for this. Thanks for any suggestion.
> > >
> > > _______________________________________________
> > > R-sig-ecology mailing list
> > > R-sig-ecology at r-project.org
> > > https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
> > >
> > >
> >
> > > --
> > > In God we trust, all others bring data.
> >
> 
> 
> 
> 
> 
> 

> -- 
> Stephen Sefick
> **************************************************
> Auburn University                                         
> Biological Sciences                                      
> 331 Funchess Hall                                       
> Auburn, Alabama                                        
> 36849                                                           
> **************************************************
> sas0025 at auburn.edu                                  
> http://www.auburn.edu/~sas0025                 
> **************************************************
> 
> Let's not spend our time and resources thinking about things that are so little or so large that all they really do for us is puff us up and make us feel like gods.  We are mammals, and have not exhausted the annoying little problems of being mammals.
> 
>                                 -K. Mullis
> 
> "A big computer, a complex algorithm and a long time does not equal science."
> 
>                               -Robert Gentleman
> 
> 



