[R-sig-Geo] gridded time series analysis

Robert J. Hijmans r.hijmans at gmail.com
Thu Dec 23 18:58:28 CET 2010


The problem occurs because the function sometimes returns one value
(NA) and sometimes 2 (slope and intercept). You can easily fix this as
shown below and then it should work.

fun <- function(x) {
	if (sum(is.na(x[1:12]))) {
		return(c(NA, NA))
	} else {
		resp = x[1:12]
		expl = x[13:24]
		mdl = lm(resp ~ expl)
		res = c(mdl$coef[1] , mdl$coef[2])
		return(res)
	}
}

The error message is not very helpfu;; I'll look into that.

On Wed, Dec 22, 2010 at 7:18 PM, Advait Godbole <advaitgodbole at gmail.com> wrote:
> That works Robert, thanks. There is still the issue of NA values. some grid
> cells have all values in the time series as missing (these are cells over
> the ocean, to make physical sense). Is there a way to disregard them in the
> calculation entirely? lm does have a na.action argument but it does not seem
> to handle all missing values. You had initially suggested an if condition
> within the function definition and I did this:
> fun <- function(x) { if((sum(is.na(x[1:120]))==120) ||
> (sum(is.na(x[121:240]))==120)) {NA}
>    else {
>    resp = x[1:120]
>    expl = x[121:240]
>    mdl = lm(resp ~ expl)
>    res = c(mdl$coef[1] , mdl$coef[2])
>    return(res)
>    }
> }
> # now do:
> b2 <- calc(s_crop, fun)
> modintercept <- b2[[1]]
> modslope <- b2[[2]]
> This seems to have taken care of the NAs but returned this error:
>> b2 <- calc(s_crop, fun)
> Error in setValues(out, x) : values must be numeric, integer or logical.
> Is there a problem with the way the NA is being returned after checking the
> if condition?
> Thanks,
> advait
>
> On Wed, Dec 22, 2010 at 8:17 PM, Advait Godbole <advaitgodbole at gmail.com>
> wrote:
>>
>> Dear Robert,
>> Sorry about that - will incorporate an example in future. I tried the calc
>> method and it is definitely much simpler! if "s_crop" is my raster stack for
>> the following:
>> s_crop <- stack(emitraster_crop,rcmraster_crop)
>> fun <- function(x) {
>>    resp = x[1:120]
>>    expl = x[121:240]
>>    mdl = lm(resp ~ expl)
>>    res = c(mdl$coef[1] , mdl$coef[2])
>>    return(res)
>> }
>> # now do:
>> b2 <- calc(s_crop, fun)
>> I looked at the output and it appears that b2[[1]] are the intercept and
>> b2[[2]] are coefficients. Is that right? Also, for the later part:
>> datCoef = apply(getValues(s_crop), 1, fun)
>> b1 <- setValues(s_crop, t(datCoef))
>> I get:
>> Error in function (classes, fdef, mtable)  :
>>   unable to find an inherited method for function "setValues", for
>> signature "RasterStack"
>> Is this being done to create a new Raster object of slopes and
>> intercepts?
>> Also, I dont quite understand why setValues is being used here.
>> Thanks,
>> Advait
>> On Wed, Dec 22, 2010 at 7:26 PM, Robert J. Hijmans <r.hijmans at gmail.com>
>> wrote:
>>>
>>> Advait,
>>>
>>> When asking for help, please provide a self-contained example whenever
>>> possible, as below.
>>>
>>> library(raster)
>>> v1 <- matrix(rep(1:12, 16), nrow=16)
>>> v2 <- matrix(rep(1:12, 16), nrow=16, byrow=TRUE)
>>> b <- brick(ncol=4, nrow=4)
>>> b1 <- setValues(b, v1)
>>> b2 <- setValues(b, v2)
>>> s <- stack(b1, b2)
>>>
>>> fun <- function(x) {
>>>    resp = x[1:12]
>>>    expl = x[13:24]
>>>    mdl = lm(resp ~ expl)
>>>    res = c(mdl$coef[1] , mdl$coef[2])
>>>    return(res)
>>> }
>>>
>>> # now do:
>>> b2 <- calc(s, fun)
>>>
>>> # as an alternative to the above you could do
>>> # but I would not recommend this
>>>
>>> datCoef = apply(getValues(s), 1, fun2)
>>> b1 <- setValues(s, t(datCoef))
>>>
>>> # I think you were complicating matters with plyr.
>>> # But to get those values in a brick you could do something like
>>> b1 <- setValues(s, as.matrix(datCoef)[,3:4])
>>> # But I am not sure if you would get the values in the correct cells
>>> # (the number matches, but perhaps not the order).
>>>
>>> Best, Robert
>>>
>>>
>>>
>>> On Wed, Dec 22, 2010 at 10:39 AM, Advait Godbole
>>> <advaitgodbole at gmail.com> wrote:
>>> > Dear Members,
>>> > I was able to find a solution after some deliberation and posting it
>>> > here
>>> > for the benefit of forum members.
>>> > To recap, my data was in the form of 2 netCDF files, each of dimension
>>> > 356 x
>>> > 507 x 120 (lat,lon,time). I wanted to perform regressions between the
>>> > two
>>> > variables on each grid point over time. I used "brick" to read in the
>>> > netcdfs as raster layers and stack along with the "plyr" package to
>>> > perform
>>> > these regressions on the grid.
>>> > # create raster object from the netcdf
>>> > rcmraster <- brick("regcm.cumul.hdd.10k.96_05.fixed.latlonvars.nc",
>>> > varname
>>> > = "hdd")
>>> > emitraster <-
>>> >
>>> > brick("flip.percap.vulcan.10k.dp.build.381.RES.EIAmy.96_05.this.latest.nc",varname="emitpercap")
>>> > # crop subset of above rasters, test method on subset which is 4 x 4 x
>>> > 120
>>> > lim <- extent(rcmraster,180,185,300,305)
>>> > rcmraster_crop <- crop(rcmraster,lim)
>>> > emitraster_crop <- crop(emitraster,lim)
>>> > # do some plotting to check
>>> >  par(mfrow=c(1,2))
>>> >  plot(rcmraster_crop[[1]])
>>> >  plot(emitraster_crop[[1]])
>>> > # test regressions for cropped subset
>>> > s_crop <- stack(emitraster_crop,rcmraster_crop)
>>> > # use of plyr (provides a more elegant alternative to lapply)
>>> > # the data frame datCoef contains the slope and intercept for each
>>> > point in
>>> > the grid (denoted by X1 and X2)
>>> > datCoef = adply( as.array(s_crop) , 1:2 ,
>>> >               function(x ) {
>>> >                             resp = x[1:120]
>>> >                             expl = x[121:240]
>>> >                             mdl = lm(resp ~ expl)
>>> >                             res = c(mdl$coef[1] , mdl$coef[2])
>>> >                             return(res)
>>> >
>>> >                           }
>>> >                           )
>>> > Output of datCoef()
>>> >> datCoef
>>> >    X1 X2  (Intercept)         expl
>>> > 1   1  1 0.0010523406 2.623927e-05
>>> > 2   2  1 0.0019401572 5.049197e-05
>>> > 3   3  1 0.0171326653 4.425026e-04
>>> > 4   4  1 0.0012244689 3.045755e-05
>>> > 5   1  2 0.0011454348 2.892668e-05
>>> > 6   2  2 0.0013920425 3.460922e-05
>>> > 7   3  2 0.0015973400 3.943349e-05
>>> > 8   4  2 0.0016494511 4.045801e-05
>>> > 9   1  3 0.0018757258 4.704107e-05
>>> > 10  2  3 0.0038066075 9.378512e-05
>>> > 11  3  3 0.0039669120 9.719391e-05
>>> > 12  4  3 0.0038055524 9.246021e-05
>>> > 13  1  4 0.0015700329 3.921758e-05
>>> > 14  2  4 0.0003727142 9.156119e-06
>>> > 15  3  4 0.0005315722 1.308365e-05
>>> > 16  4  4 0.0005872930 1.539454e-05
>>> > Now, what I want to do is for each of these X1,X2, get a "predicted"
>>> > map
>>> > based on a new X which is an array of dimensions 4 x 4 (same as the
>>> > rasters.
>>> > in the original dataset, it will be 356 x 507). This array has the
>>> > following
>>> > values:
>>> >> futurehdda
>>> >          [,1]     [,2]     [,3]     [,4]
>>> > [1,] 5707.575 5696.325 5692.960 5683.110
>>> > [2,] 5680.303 5665.092 5660.790 5647.896
>>> > [3,] 5656.163 5641.045 5638.886 5625.231
>>> > [4,] 5634.604 5619.317 5620.226 5601.729
>>> > How can I extract the slope and intercept from datCoef and use the
>>> > above X
>>> > to get a predicted Y with the same dimensions (4 x 4)?
>>> > Thanks,
>>> > Advait
>>> > On Mon, Dec 6, 2010 at 4:57 PM, Robert J. Hijmans <r.hijmans at gmail.com>
>>> > wrote:
>>> >>
>>> >> Dear Advait,
>>> >>
>>> >> > What I dont understand is:
>>> >> > 1) How can each slope be the same (values of x1)
>>> >>
>>> >> Apparently the values in these cells are all the same (not that you
>>> >> only use the first 8 time steps, not all 120)
>>> >>
>>> >> Here is an example with random values in which the slopes are all
>>> >> different
>>> >>
>>> >> b <- brick(ncol=4, nrow=4, nl=120)
>>> >> b <- setValues(b, matrix(runif(prod(dim(b))), ncol=120))
>>> >> fun = function(x) { if(is.na(x[1])) { NA } else { lm(x[1:60] ~
>>> >> x[61:120])$coefficients[2] } }
>>> >> x <- calc(b, fun)
>>> >> plot(x)
>>> >> as.matrix(x)
>>> >>
>>> >>
>>> >> > 2) The raster layers are of dimensions 4 x 4 x 120
>>> >> > but dim(getValues(rcmraster_crop))
>>> >> > [1]  16 120. Why is that? This must have implications for the way
>>> >> > that
>>> >> > the
>>> >> > regressions for the brick are handled wouldnt it?
>>> >> No
>>> >>
>>> >> > Or is it specific to the
>>> >> > way "getValues" handles things?
>>> >>
>>> >> Yes. rows are cells (4*4=16), columns are layers.
>>> >>
>>> >> > Also, does anyone have any pointers of how to accomplish the NA
>>> >> > testing
>>> >> > and
>>> >> See an example above, assuming that if the first time step is NA, they
>>> >> all are. You could also do things like
>>> >> if (sum(is.na(x)) > 60)
>>> >> depends on your context.
>>> >>
>



More information about the R-sig-Geo mailing list