[R] two questions for R beginners

Mon Mar 1 16:24:13 CET 2010

On Mon, 01 Mar 2010 12:25:20 -0000 (GMT) Ted.Harding at manchester.ac.uk 
<Ted.Harding at manchester.ac.uk> wrote:
> > A similar type of overloading is used in the 'sp' class functions,
> > where you can basically treat a 'SpatialPointsDataFrame', a
> > 'SpatialLinesDataFrame' or a 'SpatialPolygonsDataFrame' as a data
> > frame, with '$colname' indexing and '[' subsetting, even though the
> > *internals* of the objects have a completely different (and very
> > complex) structure. But as a convenience to the user, you don't need
> > to bother with the internals, and can handle the object *as if* it
> > were a data frame. It's a very comfortable way of working.
> 
> I'm not sure that "SpatialPointsDataFrame" is a dataframe (despite
> the name)! Is it not simply a list? In which case, using "$" is
> what you have to do to get at its components.

That it's not a data frame is the point. :-)

And it's not simply a list; it's an S4 object with the data (frame) stored 
in a 'data' slot, and '$' overloaded so you can use it *as if* it were a 
data frame. Example:

library(sp)
example("SpatialPolygonsDataFrame-class")

# Internal structure of the whole object (warning: not pretty!)
str(ex_1.7)

# Extracting columns from the data frame
ex_1.7$z

# Both 'nrow' and '[' are overloaded, so you can use '[' 
# for normal subsetting. For example, to plot 10 random
# polygons, you can type 
ex.sub <- ex_1.7[sample(nrow(ex_1.7), 10), ]
plot(ex.sub)

In most cases you don't have to worry about how everything is stored 
internally; things just work like you expect them to.
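
To make the mechanism a bit more concrete, here is a minimal sketch of 
how this kind of overloading can be set up. It is not the actual 'sp' 
code: the class 'ToyFrame' and its 'shapes' slot are made up for 
illustration, and only the 'methods' package that ships with R is used.

```r
library(methods)

# Hypothetical toy class (not part of 'sp'): an S4 object that keeps
# its attribute table in a 'data' slot, next to some other content.
setClass("ToyFrame", representation(shapes = "list", data = "data.frame"))

# '$' forwards column extraction to the data slot.
setMethod("$", "ToyFrame", function(x, name) x@data[[name]])

# dim() reports the dimensions of the data slot; nrow() and ncol()
# call dim() internally, so they work without methods of their own.
setMethod("dim", "ToyFrame", function(x) dim(x@data))

# '[' subsets the shapes and the matching data rows together.
# The 'drop' argument is accepted but ignored in this sketch.
setMethod("[", "ToyFrame", function(x, i, j, ..., drop = TRUE) {
  if (missing(i)) i <- seq_len(nrow(x@data))
  if (missing(j)) j <- seq_len(ncol(x@data))
  new("ToyFrame", shapes = x@shapes[i], data = x@data[i, j, drop = FALSE])
})

toy <- new("ToyFrame",
           shapes = as.list(letters[1:5]),
           data = data.frame(id = 1:5, z = c(2.5, 3.1, 4.8, 1.2, 5.0)))

toy$z                      # column extraction, as with a data frame
nrow(toy)                  # row count, via the dim() method
toy.sub <- toy[c(2, 4), ]  # row subsetting returns another ToyFrame
toy.sub$z
is.list(toy)               # the object is not a list
```

The user only ever touches '$', '[' and 'nrow'; the slots stay an 
implementation detail, which is the convenience described above.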

-- 
Karl Ove Hufthammer