[R-sig-phylo] comparative analysis using multiple regression of contrasts?

jclaude at univ-montp2.fr jclaude at univ-montp2.fr
Wed May 25 11:38:32 CEST 2011

Thanks Simon,

this is exactly the answer I was looking for.

for type II sums of squares you are right, however, when there are  
multiple factors pics and gls usually still provide different F values  
and p-values even if you set the marginality stuff in the analysis of  
variance. (they are not much different, but i still wonder why results  
are the same with one explanatory variable but different when you  
consider several).

  As I see it, contrast analyses through the origin are not a so usual  
regressions since no intercept is estimated: they however result in  
similar output when only one explanatory variable is included.  
Although I did not investigate type I and type II error rate when the  
response was continuous and the explanatory variable was dummy, I  
still guess that there are still something to do for modifying  
ancestral character state reconstruction in the contrast analysis for  
the dummy variable and computing their contrast: it is hard to believe  
that the brownian model will apply to that dummy variable because the  
expected variance of such a character can not be properly gaussian,  
but certainly more following something that has to see with the  
binomial law and logit family link. I wonder if someone has worked in  
this direction for contrasts...if not, this is probably something  
interesting i would try to investigate.

all the best


Simon Blomberg <s.blomberg1 at uq.edu.au> a écrit :

> On 24/05/11 01:38, Julien Claude wrote:
>> Dear all,
>> I have one factor and several covariates and I would like to know which
>> of these explanatory variables are more likely to explain a response
>> variable in a comparative analysis. The factor is a dummy variable (0-1).
>> At first, my strategy would be to use contrasts of all variables and
>> then producing a type II anova on contrasts, I am however not sure that
>> I am right in doing so with the dummy variable.
>> data(bird.orders)
>> Y<-rnorm(23)
>> X1<-rbinom(23,1,0.5)
>> X2<-rnorm(23)
>> pY<-pic(Y, bird.orders)
>> pX1<-pic(X1, bird.orders)
>> pX2<-pic(X2, bird.orders)
>> library(car)
>> Anova(lm(pY~pX1*pX2-1))
>> Can we directly use the contrast for the dummy variable?, or should we
>> transform this contrast in specifying some special stuff behind the
>> expected ancestral values (The dummy variable probably does not follow a
>> brownian motion model...). The solution may be here but I have no clear
>> idea about how to transform it.
> Yes, you can directly use the contrasts of the dummy variable. There  
> is no problem with non-Brownian explanatory variables in pic  
> regression. The usual regression model assumes all the covariance is  
> in the response variable only. The calculation of contrasts for the  
> explanatory variable is a necessary step to getting the correct  
> parameter estimates. No further meaning should be attached.  
> (Although perhaps I hold an extreme view on this.)
>> I wonder also how to adopt this strategy by using gls.
>> DF.bird<- data.frame(X1,X2, Y)
>> bm.bird<- corBrownian(phy = bird.orders)
>> m1<- gls(Y~X1*X2, correlation = bm.bird, data = DF.bird)
>> m2<- gls(Y~X2*X1, correlation = bm.bird, data = DF.bird)
>> anova(m1)
>> How to get the mean squares  and variance estimate in pgls ? With a one
>> factor analysis F-values are exactly the same, but when the number of
>> covariates is greater than 1, F values  can differ in some extent, I
>> wonder why.
> that's because by default, the analysis gives you "sequential" sums  
> of squares tests (Type I for the SAS people). To get type II  
> (marginal) tests, which are not affected by the order of fitting, use
> anova(m1, type="marginal")
>>  In the gls stuff, I understand that incorporating the
>> expected correlation of the response should follow a BM, but this is
>> certainly not true for all the predictors)?
> It doesn't matter for the predictor variables. The covariances of  
> the predictor variables is not used in the calculation of the gls  
> regression. If you want to take the covariances of the predictors  
> into account, you need to use an errors-in-variables model. Or you  
> could calculate non-central correlations ("correlation through the  
> origin") instead of regression.
> Simon.
>> Let me know if this strategy makes sense or if it is flawed....Thanks
>> for your input
>> Julien
>> -------------------
>> Julien CLAUDE
>> Institut des Sciences de l'Evolution, 34095 Montpellier cedex 5
>> Université de Montpellier 2
>> _______________________________________________
>> R-sig-phylo mailing list
>> R-sig-phylo at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo
> -- 
> Simon Blomberg, BSc (Hons), PhD, MAppStat.
> Lecturer and Consultant Statistician
> School of Biological Sciences
> The University of Queensland
> St. Lucia Queensland 4072
> Australia
> T: +61 7 3365 2506
> email: S.Blomberg1_at_uq.edu.au
> http://www.uq.edu.au/~uqsblomb/
> Policies:
> 1.  I will NOT analyse your data for you.
> 2.  Your deadline is your problem
> Statistics is the grammar of science - Karl Pearson.
> _______________________________________________
> R-sig-phylo mailing list
> R-sig-phylo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-phylo

More information about the R-sig-phylo mailing list