[R-sig-Geo] original coordinates of the redwood data?

Adrian Baddeley adrian.baddeley at uwa.edu.au
Fri Dec 6 04:06:52 CET 2013


Lee De Cola <ldecola at comcast.net> writes:

> does anyone know the original, true coordinates of the spatstat redwoodfull data?
>  i can't find them in:
> Strauss, D. J. (1975). "A Model for Clustering." Biometrika 62(2): 467-475.

> i think reporting spatial data in rescaled units is unscientific. 

Please first read the help file for the dataset. 
(I think it is unprofessional to send out a question before you have read the documentation %^])

The help file says that these data were scanned manually from a reprint of Strauss's paper
and that the dimensions of the figure are only *approximately* known to be about 128 feet across.

The paper by Strauss does not give the exact dimensions of the figure, and does not give the
original coordinates. The data are lost, according to David Strauss. Hence we were obliged to
scan a reprint - which has the risk of inaccuracy (coincident points may be lost; the x and y axes
may have been unequally scaled, nonlinearly scaled, etc).

If you are confident enough about the dimensions, it is easy to rescale the data:
       # rescale(X, s) divides the coordinates by s, so use s = 1/128 to get feet
       X <- rescale(redwoodfull, 1/128)
       unitname(X) <- c("foot", "feet")
In spatstat we do not attribute physical scales to datasets unless the scale is reliable,
and that is why 'redwoodfull' is not scaled to 128 feet.
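
For readers who want to see the arithmetic behind such a rescaling, here is a minimal base-R sketch. It makes two assumptions: the coordinates lie in the unit square, and the plot was 128 feet across (which, as noted above, is only approximately known). The coordinates and the names xy_unit, width_ft and xy_feet are invented for illustration; they are not the redwood data.

```r
# Hypothetical coordinates in the unit square (NOT the real redwood data)
xy_unit <- data.frame(x = c(0.10, 0.50, 0.90),
                      y = c(0.25, 0.75, 0.40))

# Assumed width of the study plot in feet; only approximately known
width_ft <- 128

# Convert unit-square coordinates to feet by multiplying by the assumed width
xy_feet <- xy_unit * width_ft

# Interpoint distances scale by the same factor
d_unit <- as.matrix(dist(xy_unit))
d_feet <- as.matrix(dist(xy_feet))
stopifnot(isTRUE(all.equal(d_feet, d_unit * width_ft)))

# Sanity check: the coordinate range should now be on the scale of feet
print(range(xy_feet$x))
```

Checking the coordinate range after any rescaling is a cheap way to confirm that the conversion went in the intended direction.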

> i think reporting spatial data in rescaled units is unscientific. 

Sure, the redwood dataset does not meet modern standards of data integrity.
The redwood pattern is a 'legacy' dataset that has been widely used and re-used
over the years to compare different methods (a bit like Fisher's iris data or the sunspot series).
That is why it is included in spatstat.

In the 1970s it was actually standard practice to rescale point patterns to the unit square, to avoid
numerical difficulties (for example all the data in Peter Diggle's 1983 book are rescaled to the unit square).
Also it was a lot harder to obtain data, and scientists were even more protective of their data than they are now,
so spatial data were not widely shared, and when shared they were often 'anonymised' by removing the spatial scale
so that competitors could not re-analyse the data.

Adrian Baddeley
author of 'spatstat'

Prof Adrian Baddeley FAA
University of Western Australia 

