[R-sig-Geo] bbox 4-dimensional objects - suitable method?

Michael Sumner mdsumner at gmail.com
Mon Nov 23 21:20:34 CET 2009


I would encourage the idea that higher dimensions (or even the first
two)  are not treated as representing any particular real world
dimensions. There's no reason any axis should be aligned to a
particular type or name (such as or POSIXct, or "X"). This would mean
that bbox() simply returns a 2 row matrix with more columns - as it
does now for the coordinates slot. You can always add further
functions that specially extract the range from the third or fourth
column of the bounding box, and higher level classes that treat time
or elevation as somehow special.

I aim to rewrite the trip classes in this way - trip should really
have time, and potentially Z, temperature, salinity, etc. on the
coordinates, and then treat aggregations of points/lines by ID with a
different mechanism. R doesn't have unit types as such, but it would
be good to aim for this support in the future by keeping unit, name,
and dimension separated and not aligned to particular geometric axes.

NetCDF has a complete separation of the definition of each dimension
(lon, temperature, time, lat, level etc.) that is available for a
particular variable for example, although I think the special X=1,
Y=2, Z=3, Time=4 ordering is encouraged, even if one of the dimensions
is singleton.

Regards, Mike.

On Tue, Nov 24, 2009 at 3:33 AM, Edzer Pebesma
<edzer.pebesma at uni-muenster.de> wrote:
> Torleif,
>
> after creating the SpatialPointsDataFrame (90 seconds) with your
> commands I get indeed a new object dummy, which is 400+ Mb of size, the
> same as the data.frame had. Then, converting it in a grid with
>
> gridded(dummy) = TRUE
> fullgrid(dummy) = TRUE
>
> takes again very long, but gives you the object of 100 Mb, of size very
> similar to your Spatial3dArray.
>
> The fast way of creating such objects, for which you know the dimensions
> ahead of time, with sp classes, is:
>
> library(sp)
> # I ignored your offset and cell sizes here:
> grd = GridTopology(rep(0,3), rep(1,3), c(146,188,504))
> object.size(SpatialGridDataFrame(grd,data.frame(z=rep(0,prod(grd at cells.dim)))))
>
> which produces a similar object in the same amount of time. So far, when
> each dimension is regularly discretized, I see little advantage in your
> approach, except that it gives meaning (in terms of a reference system)
> to the third dimension, being time -- my SpatialTimeGridDataFrame did a
> similar thing, although less explicitly.
>
> Your implementation is more general in the sense that it can deal with
> regular (and even irregular) time series for irregularly spaced points,
> as you store all spatial locations of measurements (once) and assume the
> same time discretization (regular or irregular -- you store the times)
> holds for each spatial points. That is of course a much more flexible
> data structure, useful for many space-time data.
>
> You might also want to read into the arguments that seq and rep have in
> addition to those you use. Confusing at first sight, but powerful -- no
> need for matrices.
> --
> Edzer
>
> Torleif Markussen Lunde wrote:
>> Of course, if you decide to use the columns as time, you reduce the space, but
>> than, with more than one variable, the structure would be "messy".
>>
>> Best wishes
>> Torleif
>>
>> On Monday 23 November 2009 15:57:32 Torleif Markussen Lunde wrote:
>>
>>> Hi Edzer
>>>
>>> Thanks for a good answer. I did read the asdar-book, and especially the
>>> section on time grids. I guess the book (together with the sp source) is a
>>> must when dealing with the sp-classes.
>>>
>>> As far as I understand this means that in SpatialTimeGridDataFrame
>>>  coordinates and time has to be stored several times. That is my main
>>>  motivation to define a new class. Let's take one example where one has a
>>>  structure similar to SpatialPointsDataFrame, while the other has a
>>>  structure of Spatial3dArray.
>>>
>>> ## SpatialPointsDataFrame ##
>>> # (since it takes a long time to convert it to a sppdf, this is omitted)
>>>
>>> # Create coordinates
>>> long2 <- seq(35, 48, length.out = 146)
>>> lat2 <- seq(2, 15, length.out = 188)
>>> long <- matrix(long2, length(long2), length(lat2), byrow = FALSE)
>>> lat <- matrix(lat2, length(long2), length(lat2), byrow = TRUE)
>>> long3 <- rep(c(long), 504)
>>> lat3 <- rep(c(lat), 504)
>>> ptc.coord <- cbind(x = long3, y=lat3)
>>>
>>> # Create data variable
>>> z <- seq(1, 10000, length.out = dim(ptc.coord)[1])
>>>
>>> # Create time
>>> tm <- rep(seq(as.POSIXct("2002-01-01 06:00:00"),
>>>        as.POSIXct("2002-01-22 06:00:00"),
>>>        length.out = 504), each = dim(long)[1]*dim(long)[2])
>>>
>>> # make data.frame
>>> dummy <- as.data.frame(cbind(ptc.coord, z, tm))
>>> # do not run:
>>> # system.time(coordinates(dummy)=~x+y+tm)
>>>
>>> # What is the object size?
>>> objs <- object.size(dummy)
>>> print(objs, quote = FALSE, units = "Mb")
>>> ## end spatial points ##
>>>
>>> ## Spatial3dArray ##
>>>
>>> # arrange z as an array
>>> array.data <- array(z, c(length(long2), length(lat2), 504))
>>> # create a time vector
>>> tm2 <- seq(as.POSIXct("2002-01-01 06:00:00"),
>>>        as.POSIXct("2002-01-22 06:00:00"),
>>>        length.out = 504)
>>>
>>> x <- new("Spatial3dArray",
>>>        data = array.data,
>>>        coords = list(long=long, lat=lat),
>>>        bbox = matrix(c(min(long),
>>>                        min(lat),
>>>                        max(long),
>>>                        max(lat)), 2, 2,
>>>        dimnames = list(c("long", "lat"), c("min","max"))),
>>>        time = as.character(tm2),
>>>        btime = c("2002-01-01 06:00:00",
>>>                         "2002-01-22 06:00:00"))
>>>
>>> dimnames(slot(x, "data"))=list(NULL,
>>>                              NULL,
>>>                              slot(x, "time"))
>>>
>>>
>>> objs2 <- object.size(x)
>>> print(objs2, quote = FALSE, units = "Mb")
>>>
>>> ## end Spatial3dArray ##
>>>
>>> So, I think it makes sense to have other classes when dealing with time
>>>  data with a certain dimension. Agree?
>>>
>>> I have already rewritten overlay, summary, and added spmean, spmax, and
>>>  spmin (with options on which dimensions the method should be applied
>>>  over). In addition binding methods (sp3dAbind equal to cbind for
>>>  data.frames. Will also introduce timeBind, or similar) has been
>>>  introduced.
>>>
>>> Best wishes
>>> Torleif
>>>
>>> On Monday 23 November 2009 14:45:16 Edzer Pebesma wrote:
>>>
>>>> Torleif,
>>>>
>>>> sp classes SpatialPoints* SpatialPixels* and SpatialGrid* already allow
>>>>
>>>> 3- and higher dimensional data; providing an example with nonsense data:
>>>>
>>>>> library(sp)
>>>>> data(meuse)
>>>>> meuse$z = rnorm(155)
>>>>> coordinates(meuse)=~x+y+z
>>>>> summary(as(meuse, "SpatialPoints"))
>>>>>
>>>> Object of class SpatialPoints
>>>> Coordinates:
>>>>             min          max
>>>> x 178605.000000 1.813900e+05
>>>> y 329714.000000 3.336110e+05
>>>> z     -2.168683 3.006463e+00
>>>> Is projected: NA
>>>> proj4string : [NA]
>>>> Number of points: 155
>>>>
>>>> It can be used this way to do 3D interpolation (package gstat), and
>>>> several of the sp methods work; obviously, several do ignore everything
>>>> beyond x and y.
>>>>
>>>> If you store your time as double, you can even think about adding that
>>>> as one of the coordinates.
>>>>
>>>> Chapter 6 of http://www.asdar-book.org/ gives code examples of this
>>>> approach, where a third dimension represents (posix) time, and methods
>>>> are given to select a certain spatial grid from a
>>>> SpatialTimeGridDataFrame based on its posix time (range).
>>>>
>>>> The motivation for your approach becomes stronger when time cannot be
>>>> stored as double (numeric); I find it harder to see the motivation to
>>>> store z differently from the x and y coordinates, create a separate
>>>> bounding box for it and rewrite all methods.
>>>> --
>>>> Edzer
>>>>
>>>> Torleif Markussen Lunde wrote:
>>>>
>>>>> Hi
>>>>>
>>>>> As previously mentioned I am working on 3D and 4D spatial classes. To
>>>>> get things compatible with the other sp-classes I would like to ask for
>>>>> your opinion what would be the most suitable bbox methods for
>>>>> Spatial3dArrays and Spatial4dArrays.
>>>>>
>>>>> My first thought was that bbox should return the 2D geographical extent
>>>>> of the object. This to comply with other spatial methods. For the 3D
>>>>> case an additional slot, btime is added to show the temporal extent of
>>>>> the object. As writing the 4D case, I started wondering whether it
>>>>> would be wise to stick to this (my conclusion at the moment is yes). In
>>>>> that case a new slot called zextent could be added.
>>>>>
>>>>> To retrieve the extent of the different dimensions one would have three
>>>>> functions; bbox(), btime(), and zextent()
>>>>>
>>>>> The other option is to make a bbox slot as a list. bbox() would still
>>>>> return x-y extent, while bbox.full() could return a list of the full
>>>>> extent:
>>>>>
>>>>> list(bbox = matrix(c(1,1,4,4), 2,2,
>>>>>                            dimnames = list(c("long", "lat"),
>>>>>                    c("min", "max"))),
>>>>>     btime = matrix(c("2002-01-01 06:00:00", "2002-01-01 06:00:00"), 1,
>>>>> 2, dimnames = list(c("time"),
>>>>>                    c("min", "max"))),
>>>>>     zextent = matrix(c(512, 1024), 1,2,
>>>>>                            dimnames = list(c("masl"),
>>>>>                    c("min", "max"))))
>>>>>
>>>>> Since bbox is defined as a matrix in Spatial, this would be a bad idea.
>>>>> So, at best bbox could be a matrix of min/max of x, y, and z (since
>>>>> they all are numeric). bbox() would then return a x, y subset of
>>>>> bbox(), while bbox.full() could return a list(xyz, time).
>>>>>
>>>>> Any comments on what would be best suitable to be compatible with the
>>>>> other Spatial classes?
>>>>>
>>>>> Best wishes
>>>>> Torleif
>>>>> PhD candidate
>>>>> Centre for International Health
>>>>> Bjerknes Centre for Climate Research
>>>>> University of Bergen
>>>>> Norway
>>>>>
>>>>> _______________________________________________
>>>>> R-sig-Geo mailing list
>>>>> R-sig-Geo at stat.math.ethz.ch
>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>>
>>> _______________________________________________
>>> R-sig-Geo mailing list
>>> R-sig-Geo at stat.math.ethz.ch
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>
>>>
>
> --
> Edzer Pebesma
> Institute for Geoinformatics (ifgi), University of Münster
> Weseler Straße 253, 48151 Münster, Germany. Phone: +49 251
> 8333081, Fax: +49 251 8339763 http://ifgi.uni-muenster.de/
> http://www.springer.com/978-0-387-78170-9 e.pebesma at wwu.de
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at stat.math.ethz.ch
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>



More information about the R-sig-Geo mailing list