[R-sig-phylo] Multiple regressions with continuous and categorical data
Simon Blomberg
s.blomberg1 at uq.edu.au
Wed Apr 9 01:40:30 CEST 2008
On Tue, 2008-04-08 at 15:20 -0700, Joe Felsenstein wrote:
> On Mon, Apr 07, 2008 at 12:38:59PM +1000, Simon Blomberg wrote:
> > We have very good reasons for suspecting this. Brownian motion
> > (actually, the Wiener process) is almost certainly a poor model for the
> > evolution of any trait you care to name. It assumes a linear
> > relationship between variance and time, there are no bounds to
> > evolution, and although the process is continuous it is nowhere
> > differentiable, making the calculation of instantaneous rates of
> > evolution impossible.
>
> Genetic drift has much the same effect. Brownian motion is a poor
> model, and so is Ornstein-Uhlenbeck, but just as democracy is the
> worst method of organizing a society "except for all the others",
> so these two models are all we've really got that is tractable.
> Critics will be admitted to the event, but only if they carry with
> them another tractable model.
This is a line of research that I am pursuing. :-)
Simon.
>
> > > Anyone can verify that IC and PGLS are the same thing. You can
> > > translate the matrix equations to equations to look like
> > > Felsenstein's, you just write out the terms of the matrix as branch
> > > lengths and multiply them out. They will be the same.
> >
> > Then please do this! The problem is that Felsenstein presented an
> > algorithm, not a set of equations relating the response and predictor
> > variables. I'm sure you could do what you say, but as it involves the
> > Cholesky decomposition of the vcv matrix, which is itself a complicated
> > linear transformation of the branch lengths, I don't think it is
> > trivial. But if it is, I would like to see the proof!
>
> The contrasts can be thought of as matrices of this form: if you take
> the contrast between species 1 and 2, the transform is the matrix:
>
>     1    -1     0   0  ...  0
>     a  (1-a)    0   0  ...  0
>     0     0     1   0  ...  0
>     0     0     0   1  ...  0
>    ...
>     0     0     0   0  ...  1
>
> (so that there is a little 2x2 box and the rest is the identity matrix).
> Of course I made it look easy by having it be species 1 and 2 but the
> principle is the same.
>
> The result is that the vector of observations now has item 1 independent
> of the others, with a variance v_1+v_2 (I am using the LaTeX notation which
> indicates subscript by underscore). The remaining variables 2, 3, ..., n
> have a tree with one fewer tip. You continue this. If you call the
> contrast matrices (like the above) C_1, C_2, ... C_{n-1}
> then the net transform is
> C_{n-1} C_{n-2} ... C_2 C_1
> and this leaves us with a diagonal covariance matrix. So getting the
> transform that diagonalizes the covariance matrix is easy. These tree
> covariance matrices can be diagonalized in O(n^2) operations instead of
> the usual O(n^3). And yes, the contrast matrices are a nonlinear function
> of the branch lengths, which is most easily defined recursively.
>
> In effect, the contrasts are just a convenient computational scheme for
> diagonalizing the covariance matrix.
>
> Joe (old enough to remember how to delete lines from a reply)
> ----
> Joe Felsenstein joe at gs.washington.edu
> Department of Genome Sciences and Department of Biology,
> University of Washington, Box 355065, Seattle, WA 98195-5065 USA
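
The construction quoted above can be checked numerically. What follows
is a minimal sketch in plain Python, not code from either correspondent.
The four-tip tree (((1:1,2:3):2,3:4):1,4:5), its branch lengths, and the
helper names (contrast_matrix, steps, length) are assumptions made up
for illustration. The weight a = v_2/(v_1+v_2) is the usual choice that
makes the weighted average uncorrelated with the contrast; the quoted
message leaves a unspecified. Exact fractions are used so that the
off-diagonal zeros are exact rather than approximate.

```python
from fractions import Fraction


def identity(n):
    return [[Fraction(int(r == c)) for c in range(n)] for r in range(n)]


def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]


def transpose(A):
    return [list(col) for col in zip(*A)]


def contrast_matrix(n, i, j, v_i, v_j):
    """Identity matrix except for a 2x2 box in rows/columns i and j.

    Row i becomes the contrast x_i - x_j.
    Row j becomes the weighted average a*x_i + (1-a)*x_j,
    with a = v_j / (v_i + v_j) (assumed weighting, see lead-in).
    """
    a = v_j / (v_i + v_j)
    C = identity(n)
    C[i][j] = Fraction(-1)          # row i: 1 at i, -1 at j
    C[j][i], C[j][j] = a, 1 - a     # row j: a at i, (1-a) at j
    return C


# Brownian-motion covariance for the hypothetical tree
# (((1:1,2:3):2,3:4):1,4:5): each entry is the shared path length
# from the root.
V = [[Fraction(x) for x in row] for row in (
    (4, 3, 1, 0),
    (3, 6, 1, 0),
    (1, 1, 5, 0),
    (0, 0, 0, 5),
)]

# Current branch length above each item; updated as tips are pruned.
length = [Fraction(1), Fraction(3), Fraction(4), Fraction(5)]

# (i, j, stem): contrast items i and j, then store their ancestor in
# slot j with the stem branch below it (0 at the root).
steps = [(0, 1, Fraction(2)), (1, 2, Fraction(1)), (2, 3, Fraction(0))]

T = identity(4)
for i, j, stem in steps:
    v_i, v_j = length[i], length[j]
    T = matmul(contrast_matrix(4, i, j, v_i, v_j), T)   # C_k ... C_1
    # The ancestor's branch is lengthened by v_i*v_j/(v_i+v_j).
    length[j] = stem + v_i * v_j / (v_i + v_j)

D = matmul(matmul(T, V), transpose(T))   # covariance after the transform
for row in D:
    print([str(x) for x in row])
```

For these branch lengths the transformed covariance D comes out diagonal,
with entries 4, 27/4, 206/27 and 355/206: the first three are the
variances of the successive contrasts (the sums of the two branch lengths
being contrasted at each step), and the last is the variance of the
weighted average left at the root.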
--
Simon Blomberg, BSc (Hons), PhD, MAppStat.
Lecturer and Consultant Statistician
Faculty of Biological and Chemical Sciences
The University of Queensland
St. Lucia Queensland 4072
Australia
Room 320 Goddard Building (8)
T: +61 7 3365 2506
http://www.uq.edu.au/~uqsblomb
email: S.Blomberg1_at_uq.edu.au
Policies:
1. I will NOT analyse your data for you.
2. Your deadline is your problem.
The combination of some data and an aching desire for
an answer does not ensure that a reasonable answer can
be extracted from a given body of data. - John Tukey.