[R-sig-Geo] bbox 4-dimensional objects - suitable method?

Edzer Pebesma edzer.pebesma at uni-muenster.de
Mon Nov 23 17:33:41 CET 2009


Torleif,

after creating the SpatialPointsDataFrame (90 seconds) with your
commands I get indeed a new object dummy, which is 400+ Mb of size, the
same as the data.frame had. Then, converting it in a grid with

gridded(dummy) = TRUE
fullgrid(dummy) = TRUE

takes again very long, but gives you the object of 100 Mb, of size very
similar to your Spatial3dArray.

The fast way of creating such objects, for which you know the dimensions
ahead of time, with sp classes, is:

library(sp)
# I ignored your offset and cell sizes here:
grd = GridTopology(rep(0,3), rep(1,3), c(146,188,504))
object.size(SpatialGridDataFrame(grd,data.frame(z=rep(0,prod(grd at cells.dim)))))

which produces a similar object in the same amount of time. So far, when
each dimension is regularly discretized, I see little advantage in your
approach, except that it gives meaning (in terms of a reference system)
to the third dimension, being time -- my SpatialTimeGridDataFrame did a
similar thing, although less explicitly.

Your implementation is more general in the sense that it can deal with
regular (and even irregular) time series for irregularly spaced points,
as you store all spatial locations of measurements (once) and assume the
same time discretization (regular or irregular -- you store the times)
holds for each spatial points. That is of course a much more flexible
data structure, useful for many space-time data.

You might also want to read into the arguments that seq and rep have in
addition to those you use. Confusing at first sight, but powerful -- no
need for matrices.
--
Edzer

Torleif Markussen Lunde wrote:
> Of course, if you decide to use the columns as time, you reduce the space, but 
> than, with more than one variable, the structure would be "messy".
>
> Best wishes
> Torleif
>
> On Monday 23 November 2009 15:57:32 Torleif Markussen Lunde wrote:
>   
>> Hi Edzer
>>
>> Thanks for a good answer. I did read the asdar-book, and especially the
>> section on time grids. I guess the book (together with the sp source) is a
>> must when dealing with the sp-classes.
>>
>> As far as I understand this means that in SpatialTimeGridDataFrame
>>  coordinates and time has to be stored several times. That is my main
>>  motivation to define a new class. Let's take one example where one has a
>>  structure similar to SpatialPointsDataFrame, while the other has a
>>  structure of Spatial3dArray.
>>
>> ## SpatialPointsDataFrame ##
>> # (since it takes a long time to convert it to a sppdf, this is omitted)
>>
>> # Create coordinates
>> long2 <- seq(35, 48, length.out = 146)
>> lat2 <- seq(2, 15, length.out = 188)
>> long <- matrix(long2, length(long2), length(lat2), byrow = FALSE)
>> lat <- matrix(lat2, length(long2), length(lat2), byrow = TRUE)
>> long3 <- rep(c(long), 504)
>> lat3 <- rep(c(lat), 504)
>> ptc.coord <- cbind(x = long3, y=lat3)
>>
>> # Create data variable
>> z <- seq(1, 10000, length.out = dim(ptc.coord)[1])
>>
>> # Create time
>> tm <- rep(seq(as.POSIXct("2002-01-01 06:00:00"),
>> 	  as.POSIXct("2002-01-22 06:00:00"),
>> 	  length.out = 504), each = dim(long)[1]*dim(long)[2])
>>
>> # make data.frame
>> dummy <- as.data.frame(cbind(ptc.coord, z, tm))
>> # do not run:
>> # system.time(coordinates(dummy)=~x+y+tm)
>>
>> # What is the object size?
>> objs <- object.size(dummy)
>> print(objs, quote = FALSE, units = "Mb")
>> ## end spatial points ##
>>
>> ## Spatial3dArray ##
>>
>> # arrange z as an array
>> array.data <- array(z, c(length(long2), length(lat2), 504))
>> # create a time vector
>> tm2 <- seq(as.POSIXct("2002-01-01 06:00:00"),
>> 	  as.POSIXct("2002-01-22 06:00:00"),
>> 	  length.out = 504)
>>
>> x <- new("Spatial3dArray",
>> 	  data = array.data,
>> 	  coords = list(long=long, lat=lat),
>> 	  bbox = matrix(c(min(long),
>> 			  min(lat),
>> 			  max(long),
>> 			  max(lat)), 2, 2,
>> 	  dimnames = list(c("long", "lat"), c("min","max"))),
>> 	  time = as.character(tm2),
>> 	  btime = c("2002-01-01 06:00:00",
>> 			   "2002-01-22 06:00:00"))
>>
>> dimnames(slot(x, "data"))=list(NULL,
>> 				NULL,
>> 				slot(x, "time"))
>>
>>
>> objs2 <- object.size(x)
>> print(objs2, quote = FALSE, units = "Mb")
>>
>> ## end Spatial3dArray ##
>>
>> So, I think it makes sense to have other classes when dealing with time
>>  data with a certain dimension. Agree?
>>
>> I have already rewritten overlay, summary, and added spmean, spmax, and
>>  spmin (with options on which dimensions the method should be applied
>>  over). In addition binding methods (sp3dAbind equal to cbind for
>>  data.frames. Will also introduce timeBind, or similar) has been
>>  introduced.
>>
>> Best wishes
>> Torleif
>>
>> On Monday 23 November 2009 14:45:16 Edzer Pebesma wrote:
>>     
>>> Torleif,
>>>
>>> sp classes SpatialPoints* SpatialPixels* and SpatialGrid* already allow
>>>
>>> 3- and higher dimensional data; providing an example with nonsense data:
>>>       
>>>> library(sp)
>>>> data(meuse)
>>>> meuse$z = rnorm(155)
>>>> coordinates(meuse)=~x+y+z
>>>> summary(as(meuse, "SpatialPoints"))
>>>>         
>>> Object of class SpatialPoints
>>> Coordinates:
>>>             min          max
>>> x 178605.000000 1.813900e+05
>>> y 329714.000000 3.336110e+05
>>> z     -2.168683 3.006463e+00
>>> Is projected: NA
>>> proj4string : [NA]
>>> Number of points: 155
>>>
>>> It can be used this way to do 3D interpolation (package gstat), and
>>> several of the sp methods work; obviously, several do ignore everything
>>> beyond x and y.
>>>
>>> If you store your time as double, you can even think about adding that
>>> as one of the coordinates.
>>>
>>> Chapter 6 of http://www.asdar-book.org/ gives code examples of this
>>> approach, where a third dimension represents (posix) time, and methods
>>> are given to select a certain spatial grid from a
>>> SpatialTimeGridDataFrame based on its posix time (range).
>>>
>>> The motivation for your approach becomes stronger when time cannot be
>>> stored as double (numeric); I find it harder to see the motivation to
>>> store z differently from the x and y coordinates, create a separate
>>> bounding box for it and rewrite all methods.
>>> --
>>> Edzer
>>>
>>> Torleif Markussen Lunde wrote:
>>>       
>>>> Hi
>>>>
>>>> As previously mentioned I am working on 3D and 4D spatial classes. To
>>>> get things compatible with the other sp-classes I would like to ask for
>>>> your opinion what would be the most suitable bbox methods for
>>>> Spatial3dArrays and Spatial4dArrays.
>>>>
>>>> My first thought was that bbox should return the 2D geographical extent
>>>> of the object. This to comply with other spatial methods. For the 3D
>>>> case an additional slot, btime is added to show the temporal extent of
>>>> the object. As writing the 4D case, I started wondering whether it
>>>> would be wise to stick to this (my conclusion at the moment is yes). In
>>>> that case a new slot called zextent could be added.
>>>>
>>>> To retrieve the extent of the different dimensions one would have three
>>>> functions; bbox(), btime(), and zextent()
>>>>
>>>> The other option is to make a bbox slot as a list. bbox() would still
>>>> return x-y extent, while bbox.full() could return a list of the full
>>>> extent:
>>>>
>>>> list(bbox = matrix(c(1,1,4,4), 2,2,
>>>>      			dimnames = list(c("long", "lat"),
>>>> 		     	c("min", "max"))),
>>>>     btime = matrix(c("2002-01-01 06:00:00", "2002-01-01 06:00:00"), 1,
>>>> 2, dimnames = list(c("time"),
>>>> 		     	c("min", "max"))),
>>>>     zextent = matrix(c(512, 1024), 1,2,
>>>>      			dimnames = list(c("masl"),
>>>> 		     	c("min", "max"))))
>>>>
>>>> Since bbox is defined as a matrix in Spatial, this would be a bad idea.
>>>> So, at best bbox could be a matrix of min/max of x, y, and z (since
>>>> they all are numeric). bbox() would then return a x, y subset of
>>>> bbox(), while bbox.full() could return a list(xyz, time).
>>>>
>>>> Any comments on what would be best suitable to be compatible with the
>>>> other Spatial classes?
>>>>
>>>> Best wishes
>>>> Torleif
>>>> PhD candidate
>>>> Centre for International Health
>>>> Bjerknes Centre for Climate Research
>>>> University of Bergen
>>>> Norway
>>>>
>>>> _______________________________________________
>>>> R-sig-Geo mailing list
>>>> R-sig-Geo at stat.math.ethz.ch
>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>         
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo at stat.math.ethz.ch
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>>     

-- 
Edzer Pebesma
Institute for Geoinformatics (ifgi), University of Münster
Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251
8333081, Fax: +49 251 8339763 http://ifgi.uni-muenster.de/
http://www.springer.com/978-0-387-78170-9 e.pebesma at wwu.de



More information about the R-sig-Geo mailing list