[R-sig-eco] evaluating multiple responses using chi square

Steve Brewer jbrewer at olemiss.edu
Mon Jun 15 13:21:25 CEST 2015


David,

Thanks for this. Very helpful. Yes, I was hoping to deal with the lack of
independence in a separate step, by taking into account correlations
among a few select species, particularly the more common ones.

I'll be interested to see what your group comes up with.

Best,
Steve

J. Stephen Brewer 
Professor 
Department of Biology
PO Box 1848
 University of Mississippi
University, Mississippi 38677-1848
 Brewer web page - http://home.olemiss.edu/~jbrewer/
FAX - 662-915-5144
Phone - 662-202-5877




On 6/13/15 6:37 AM, "David Warton" <david.warton at unsw.edu.au> wrote:

>Hi Steve,
>Yes you are right in what you say, and it looks like you have identified
>the problem already - you can sum chi-square random variables to obtain a
>chi-square variable whose df is the sum of the df's of component
>variables, but only if they are mutually independent.
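
A small simulation can illustrate the independence condition. The sketch
below is only an illustration under assumptions that are not from this
thread: 10 species, each contributing a squared standard normal (a 1-df
chi-square under the null), and a single shared component that induces an
equal correlation rho between every pair of species. The function name
rejection_rate is made up for this example; 18.307 is the tabulated upper
5% point of a chi-square with 10 df.

```python
import math
import random

def rejection_rate(k, rho, crit, n_sim=20000, seed=1):
    """Null rejection rate when the sum of k chi-square(1) statistics is
    compared with a fixed critical value (assumed equicorrelation rho)."""
    rng = random.Random(seed)
    a, b = math.sqrt(rho), math.sqrt(1.0 - rho)
    hits = 0
    for _ in range(n_sim):
        shared = rng.gauss(0.0, 1.0)  # component common to all species
        total = 0.0
        for _ in range(k):
            # Each z is marginally N(0, 1), so z*z is chi-square with 1 df.
            z = a * shared + b * rng.gauss(0.0, 1.0)
            total += z * z
        if total > crit:
            hits += 1
    return hits / n_sim

CRIT_10DF = 18.307  # upper 5% point of chi-square with 10 df
independent = rejection_rate(10, 0.0, CRIT_10DF)  # independent species
correlated = rejection_rate(10, 0.5, CRIT_10DF)   # correlated species
print(independent, correlated)
```

With rho = 0 the rate should sit near the nominal 5%. With positive
correlation the sum keeps the same mean but has a larger variance, so the
chi-square reference distribution rejects too often.
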
>
>Community datasets, with abundances or presence-absences from multiple
>taxa collected at the same place, are commonly referred to as
>multivariate precisely because the multiple responses are typically
>dependent, and hence statistics calculated for separate response
>variables are also dependent.  You can still get somewhere with the
>theory though - the sum of dependent chi-squares could be re-expressed as
>a weighted sum of chi-squares - but the weightings couldn't be estimated
>reliably unless you had lots of information in the data from replicate
>observations, which we tend not to have.  This is the reason we went for
>resampling.  (The main alternative, which we are currently looking at, is
>covariance modelling as a strategy to estimate and account for
>correlation in a parsimonious way.)
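
The resampling idea can be sketched as a plain permutation test. This is
a minimal illustration, not the resampling scheme that manyglm itself
uses: it assumes two treatment groups, uses a squared Welch-type t
statistic per species, and shuffles the treatment labels across sites.
Whole site rows are kept intact, so the correlation among species is
carried into the null distribution. The function names and the toy data
are made up for the example.

```python
import random
from statistics import mean, variance

def sum_of_stats(rows, labels):
    """Sum over species of a squared Welch-type t statistic (two groups)."""
    total = 0.0
    for j in range(len(rows[0])):
        g0 = [r[j] for r, lab in zip(rows, labels) if lab == 0]
        g1 = [r[j] for r, lab in zip(rows, labels) if lab == 1]
        se2 = variance(g0) / len(g0) + variance(g1) / len(g1)
        if se2 > 0:  # skip species with no variation in either group
            total += (mean(g1) - mean(g0)) ** 2 / se2
    return total

def permutation_p(rows, labels, n_perm=999, seed=1):
    """P-value for the summed statistic, permuting site labels only."""
    rng = random.Random(seed)
    observed = sum_of_stats(rows, labels)
    shuffled = list(labels)
    count = 0
    for _ in range(n_perm):
        rng.shuffle(shuffled)  # rows stay intact; only labels move
        if sum_of_stats(rows, shuffled) >= observed:
            count += 1
    return (count + 1) / (n_perm + 1)

# Toy data (invented): 8 sites (rows) by 3 species (columns).
rows = [[5, 0, 2], [6, 1, 3], [4, 0, 2], [7, 1, 1],
        [1, 3, 2], [0, 4, 3], [2, 5, 1], [1, 3, 2]]
labels = [0, 0, 0, 0, 1, 1, 1, 1]
p_value = permutation_p(rows, labels)
print(p_value)
```
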
>
>All the best
>David
>
> 
>David Warton
>Professor and Australian Research Council Future Fellow
>School of Mathematics and Statistics and the Evolution & Ecology Research
>Centre
>The University of New South Wales NSW 2052 AUSTRALIA
>phone (61)(2) 9385-7031
>fax (61)(2) 9385-7123
> 
>http://www.eco-stats.unsw.edu.au/ecostats15.html
>
>
>
>
>----------------------------------------------------------------------
>
>Date: Wed, 10 Jun 2015 09:57:37 -0500
>From: Steve Brewer <jbrewer at olemiss.edu>
>To: <r-sig-ecology at r-project.org>
>Subject: [R-sig-eco] evaluating multiple responses using chi square
>Message-ID: <D19DBA91.384E6%jbrewer at olemiss.edu>
>
>Dear Listserv community,
>
>I realize that this is more a statistical theory question than an R
>application question, but I hope those familiar with the theory underlying
>manyglm and manylm could help me.
>
>In evaluating the overall response of a community to a treatment, I'm
>aware that one class of approaches involves doing univariate analyses for
>each species (e.g., ANOVA, t-test, chi square, logistic modeling, etc)
>and then "summing" the results across all species and evaluating
>statistical significance with a randomization procedure.
>
>My question is: has anyone considered using a chi square test instead of
>randomization to obtain the significance value? The p value for any test
>statistic (F, t) can be converted to a chi square value with 1 df.
>Because chi square values are additive (assuming independence), it makes
>sense to me that you could simply add up the chi square values for all
>species and evaluate the significance of the resulting sum assuming a df
>equal to the number of tests (species). Presumably, one could use
>different tests for different species, depending on whichever is most
>appropriate (e.g., ANOVA for common species that differ in abundance
>between treatments or chi square or a logistic model for species that
>differ in terms of frequency of occurrence between treatments). If one
>were confident that the univariate assumptions held for each species'
>test, other than the assumption of independence of responses among
>species, I'm wondering what, if anything, is wrong with such an approach
>for obtaining a significance value. Perhaps something similar is being
>done when the log likelihood value is calculated?
>If so, what are the similarities or differences?
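
For concreteness, the proposed calculation could look like the sketch
below. It assumes two-sided p-values strictly greater than 0, converts
each one to a 1-df chi-square value as the squared standard normal
quantile, and refers the sum to a chi-square with df equal to the number
of species. The final p-value is only valid if the species' tests are
mutually independent. The function names and the example p-values are
made up for illustration.

```python
import math
from statistics import NormalDist

def p_to_chi2_1df(p):
    """Two-sided p-value -> chi-square value with 1 df (squared z quantile)."""
    z = NormalDist().inv_cdf(1.0 - p / 2.0)
    return z * z

def chi2_sf(x, df):
    """Upper-tail probability of a chi-square variable, integer df >= 1."""
    if x <= 0:
        return 1.0
    half = x / 2.0
    # Closed forms for df = 1 and df = 2, then step up two df at a time.
    if df % 2 == 1:
        sf, k = math.erfc(math.sqrt(half)), 1
    else:
        sf, k = math.exp(-half), 2
    while k < df:
        sf += half ** (k / 2.0) * math.exp(-half) / math.gamma(k / 2.0 + 1.0)
        k += 2
    return sf

def combined_p(p_values):
    """Sum the 1-df chi-square values; refer the sum to chi-square(n tests)."""
    total = sum(p_to_chi2_1df(p) for p in p_values)
    return total, chi2_sf(total, len(p_values))

# Invented per-species p-values, for illustration only.
total, p_overall = combined_p([0.04, 0.20, 0.50, 0.01])
print(total, p_overall)
```
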
>
>Thanks, and I apologize if this question is too basic or has already been
>answered.
>
>Steve
>
>_______________________________________________
>R-sig-ecology mailing list
>R-sig-ecology at r-project.org
>https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>
>------------------------------
>
>End of R-sig-ecology Digest, Vol 87, Issue 6