[R-sig-Geo] gridding data prior to spatial analyses

Jones, Daniel O.B. dj1 at noc.ac.uk
Tue Apr 1 21:50:50 CEST 2014


Dear all,
I have a reasonably large database (~2500 points) of biomass values (the response variable) with associated positions (latitude / longitude) in the Atlantic, and I want to examine potential environmental explanatory variables. Several environmental variables are associated with each point (e.g. temperature, salinity, oxygen, organic carbon). The data are spatially patchy: some locations (e.g. the North Sea) have a lot of data in a small area, while others (e.g. the central Atlantic) are sparsely sampled. I would like to use spatial simultaneous autoregressive error models (errorsarlm in the spdep package) in R to assess how biomass varies with each of the potential explanatory variables. In many comparable analyses I have seen, the data are gridded before analysis. This leads to several questions:
1) Should I grid the data? This dramatically reduces the number of observations from around 2500 to around 150 (geometric mean biomass in 5-degree grid cells), but solves the problem of unequal data distribution. Are there any references that recommend this?
2) If I grid the data, should I use a higher resolution, i.e. many smaller cells (e.g. 1 degree)? This will give sparse coverage (i.e. lots of holes) and fewer observations per cell, but will increase the accuracy and precision of the environmental data (which can vary dramatically within a 5-degree cell) and will increase the number of cells in the analysis (presumably increasing statistical power).
3) If I grid the data, should I pick a minimum number of observations per cell and exclude the cells that do not meet this criterion? Other papers exclude grid cells where the number of observations is below a set value (determined, for example, by assessing how relative standard errors decrease with the number of observations).
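
To make the three options concrete, the gridding step could be sketched in base R roughly as follows. This is only an illustration under assumptions: the column names (lon, lat, biomass), the helper grid_biomass() and the simulated points are hypothetical placeholders, not the real database, and biomass is assumed to be strictly positive so that the geometric mean is defined.

```r
# Hypothetical helper: bin points into square cells of `cell_deg` degrees,
# compute the geometric mean biomass and the point count per cell, and
# drop cells holding fewer than `min_n` points.
grid_biomass <- function(d, cell_deg = 5, min_n = 1) {
  stopifnot(all(d$biomass > 0))  # geometric mean needs positive values
  # Lower-left corner of the cell containing each point
  cell_lon <- floor(d$lon / cell_deg) * cell_deg
  cell_lat <- floor(d$lat / cell_deg) * cell_deg
  key <- paste(cell_lon, cell_lat, sep = "_")
  first <- !duplicated(key)
  out <- data.frame(
    key = key[first],
    cell_lon = cell_lon[first],
    cell_lat = cell_lat[first],
    stringsAsFactors = FALSE
  )
  # Geometric mean = exp(mean(log(x)))
  gm <- tapply(d$biomass, key, function(x) exp(mean(log(x))))
  n <- tapply(d$biomass, key, length)
  out$geo_mean <- as.numeric(gm[out$key])
  out$n <- as.integer(n[out$key])
  out <- out[out$n >= min_n, ]
  rownames(out) <- NULL
  out
}

# Simulated stand-in data: a dense cluster plus sparse open-ocean points
set.seed(1)
pts <- data.frame(
  lon = c(runif(200, -5, 10), runif(50, -60, -10)),
  lat = c(runif(200, 50, 60), runif(50, 0, 50)),
  biomass = rlnorm(250)
)

g5 <- grid_biomass(pts, cell_deg = 5)                  # question 1
g1 <- grid_biomass(pts, cell_deg = 1)                  # question 2
g5_min3 <- grid_biomass(pts, cell_deg = 5, min_n = 3)  # question 3

# Number of cells retained under each choice
c(cells_5deg = nrow(g5), cells_1deg = nrow(g1), cells_5deg_min3 = nrow(g5_min3))
```

Comparing the three cell counts, together with the distribution of n per cell, shows the trade-off between resolution, coverage and per-cell sample size. The resulting cell-level table, with covariates averaged in the same way, would then be the input to the spatial model.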
I would greatly appreciate any advice from someone familiar with these analyses, particularly if you know of any published papers that back up the approach.
Many thanks, Daniel
