Hi,

Modeling the spatial correlation of PD, FD _and_ the trends in PD, FD plus the effect of  environmental and anthropogenic correlate in a formal likelihood-based geostatistical analysis, might be too ambitious for something to be done with data at the global scale in a few months. This is not only because of learning about likelihood-based geostatistics (a daunting task in itself) but because the data and models you intend to build might be too much for the numerical algorithms that will have to handle maximization of the corresponding multivariate likelihood. You didn't give info about the sample size (i.e. number of stations with measurements of PD, FD, plus latitude-longitude, plus environmental and anthropogenic forcings) but because you speak of ecoregions on a global scale, I imagine the sample size(s) is(are) large. You might be better off sticking to the spatial regression framework using gls.

Regarding your original question, 

(1) using likfit() (geoR package, continuous response) or likfit.glsm() (geoRglm package, binary or counts reponse) you _can_ both use latitude and longitude (actually the transformation to Euclidian space, easting and northing) to model spatial correlations _and_ to model the gradients, i.e. you can use the two location predictors to model spatial correlation and to model a structural relation written in the formula object,

_however_,

(2) since I have advised not to go that way I've checked the documentation about generalized least squares in package nlme and I think you can include your two location predictors in the corExp function _and_ use them as predictors in the formula object when calling gls(). In fact you can see in the example at the bottom of the gls() help page that authors have put time in the correlation function (corAR1) and in the formula object.

HTH

Rubén


Rubén H. Roa-Ureta, Ph. D.
AZTI Tecnalia, Txatxarramendi Ugartea z/g,
Sukarrieta, Bizkaia, SPAIN


-----Original Message-----
From: Peter Francis [mailto:peterfrancis@me.com]
Sent: Sat 4/23/2011 7:35 PM
To: Rubén Roa
Cc: Andy Rominger; r-sig-ecology@r-project.org
Subject: Re: [R-sig-eco] Detecting spatial gradient whilst accounting for spatial autocorrelation
 
Ruben and Andy,

Thanks very much for this and the time taken - i do appreciate it.

I am relatively new to geostats, actually i have only been doing this for 2 months as part of a chapter of my thesis that kind of appeared - this was never part of the plan but seemed interesting -  I am more comfortable with phylogenetics.

I am presenting this at the Evolution conference in Oklahoma in a few months and am worried i am entering a world i know very little about and therefore i may make stupid mistakes. 

My data is presence / absence of species in ecoregions on a global scale and i have calculated the functional and phylogenetic quantity of each metric for each community, i have environmental and anthropogenic data for each ecoregion from worldclim and ESRI I have two main questions i wish to answer : what are the environmental and anthropogenic correlates of the observed distribution of PD and FD and is their spatial element to their distribution. So i have a question; would it be greatly beneficial to the validity and accuracy of the results to dedicate time learning the terminology ( neighbour hoods, krigs,  phi etc)  and literature ( i am busy at present but am willing to) or will using a gls method i have done be appropriate and give reliable and statistically sound results?

Thanks again for both of your help, i am sure your methods are far better than mine and i am grateful for your time and help

Peter


On 23 Apr 2011, at 09:01, Rubén Roa wrote:

Peter, Andy,

The geoR package has the function likfit() that will fit a likelihood-based geostatistical model that can include a trend model as well (the spatial gradient you refer to, if I understand correctly) and can handle the spatial correlation by specifying a general Matern function or particular cases such as gaussian and exponential. The geoR package deals with continuous response variables; I think your diversity response measures are continuous. The package goeRglm is the extension to binary responses (binomial) and to counts (Poisson). Both are well documented.

HTH

Ruben

-----------------------------------------
Rubén H. Roa-Ureta, Ph. D.
AZTI Tecnalia, Txatxarramendi Ugartea z/g,
Sukarrieta, Bizkaia, SPAIN


-----Original Message-----
From: r-sig-ecology-bounces@r-project.org on behalf of Andy Rominger
Sent: Fri 4/22/2011 10:37 PM
To: Peter Francis
Cc: r-sig-ecology@r-project.org
Subject: Re: [R-sig-eco] Detecting spatial gradient whilst accounting for spatial autocorrelation

Hi Peter,

First a question: are your plant communities sampled at unique "points" (say
areas on the order of 50ha) or are they compilations from entire ecoregions?

If the first (unique "points") then you log-transform your diversities you
could probably fit a classical geo statistical model with lat and lon as
covariates, as well as include a variance-covariance matrix with non-zero
off-diagonal entries (i.e. autocorrelation).  You can do this two ways (that
I know of): (1) build a naive model with least squares, extract the
residuals from that and fit a co-variogram (directly related to the
variance-covariance) to those residuals, then re-calculate the coefficients
with GLS. You can do this with package gstat using the variogram and
fit.variogram functions (NOTE: you'll be subjectively fitting a variogram by
the "eyeball" method).  For that and other reasons, this isn't such a great
approach, in my opinion.

OR (2) find both the variance-covariance matrix AND the coefficients in one
step using maximum likelihood.  I don't know of a pre-packaged function to
do this, but basically you'll need to write out the (log) likelihood and
then use optim or something like that maximize it.  To get you started,
we're assuming your data (after log-transformation) come from a multivariate
normal (one "dimension" for every data point) and that multivariate normal
(MVN) has a mean vector that depends on your covariates (lat, lon, temp,
etc) in a linear way; polynomials would be fine (e.g. lat^2 if you expect
more richness at the equator).  The MVN would also have covariances that
depended on the geographic distance between your points, you can see the
documentation ?variogram for some ideas about what kind of covarianve
functions are commonly used.

As if that's not enough, another option (one that
dosen't necessitate log-transforming your diversity data!) would be to use a
hierarchical Bayesian model where, for example, the diversity at any given
site is Poisson (and conditionally independent from other sites), but the
mean of each site comes from a Gaussian process.  The upside is that a
package exists for doing this; it's called geoRglm.

Finally, if you've collected data for all the ecoregions, this is another
problem entirely.  You'll probably want to do a spatial autoregressive model
(of the conditional--CAR--or simultaneous--SAR--variety).  Check out package
spdep for some useful functions, including some to create "neighborhoods"
(i.e. polygons that share a boundary) that are needed for AR models.

Sorry that was such a long tirade, I hope some of it was useful.  You might
also want to hit up the r-sig-geo list for help on spatial statistics.  On a
side note (to justify my long tirade perhaps) your ecological question
sounds pretty cool, I would use an equally "cool"
(sophisticated) statistical method to make sure you get the right answer.

Good luck,
Andy


On Wed, Apr 20, 2011 at 7:25 AM, Peter Francis <peterfrancis@me.com> wrote:

> Dear List
>
> I have a data set which shows high levels of spatial autocorrelation, the
> sampled plant communities are on a global scale and i am looking for
> correlates of different diversity measures (PD,FD).
>
> I have controlled for the spatial autocorrelation using a GLS model where
> the latitude and longitude of each ecoregion was included as a smooth factor
> in all-statistical models using the corExp function.
>
> i.e
>
> exponential <-corExp(form = ~ Longitude + Latitude)
>
> PD_global <- gls(PD~Elevation+Temperature+Species_Richness, correlation =
> exponential)
>
> Now i want to ask the question; how do these diversity measure vary with
> latitude and longitude i.e is their a spatial gradient to PD or FD, i don't
> think i can put in longitude or latitude as a predictor if i am using it as
> a random variable through the corExp function
>
> i.e
>
> PD_global <- gls(PD~Elevation+Temperature+Species_Richness, ALtitude +
> Longitude, correlation = exponential)
>
>
> But i imagine it is something that is often asked?
>
> Any help would be greatly appreciated.
>
> Thanks in advance
>
> Peter
>
> _______________________________________________
> R-sig-ecology mailing list
> R-sig-ecology@r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-ecology
>

        [[alternative HTML version deleted]]

_______________________________________________
R-sig-ecology mailing list
R-sig-ecology@r-project.org
https://stat.ethz.ch/mailman/listinfo/r-sig-ecology


	[[alternative HTML version deleted]]