[R] Boxplot using a shapefile

Roger Bivand Roger.Bivand at nhh.no
Tue Jun 16 14:28:13 CEST 2015


Boris Steipe <boris.steipe <at> utoronto.ca> writes:

> 
> Your workflow in principle is:
> 
> - read the image into an object for which you can obtain values-per-pixel
>   in a 2D structure;
> - read the shapefile and convert into a polygon;
> - determine the bounding box of the polygon;
> - use the inout() function of the splancs package to get a list of
>   booleans for the points in the bounding box, TRUE if they are _inside_
>   the polygon;
> - subset your image points to those for which inout() returns TRUE;
> - plot as boxplot().
> 
> The CRAN taskview http://cran.r-project.org/web/views/MedicalImaging.html
> has a section on general image processing, guiding you to helpful packages.

Actually, this is the wrong task view if the data are as described, as
spatial data are covered in the Spatial task view at:

http://cran.r-project.org/web/views/Spatial.html

The workflow as described is also muddled: "[T]he shapefile takes the 
pixel values from the image and shows the distribution of pixels in 
the form of a boxplot" doesn't actually mean anything without further
assumptions. 

A shapefile is an ESRI file format for GIS vector geometries (and
attributes) that may be polygons, lines or points, and has an associated
coordinate reference system; it is almost never used for other kinds of data. 

The "image", presumably a GIS raster data file, should have the same
coordinate reference system, or be transformed to the same system (use
spTransform in the rgdal package, which is also the package you should use
for reading the input data as it correctly reads input coordinate reference
systems if available). 

The operation then needed is called an over() method in the sp package, and
extract() in the raster package. 
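
As a minimal sketch of the polygon case (not the poster's data): the raster
and the two square polygons below are synthetic stand-ins, the helper
square() is hypothetical, and both objects are left without a coordinate
reference system, which is assumed here to mean "the same system". With
polygons, extract() returns one vector of cell values per polygon, so
boxplot() draws one box per polygon.

```r
library(sp)
library(raster)

# Synthetic stand-in for the "image": 10 x 10 cells, each 100 x 100 units
r <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 1000, ymn = 0, ymx = 1000)
set.seed(1)
values(r) <- runif(ncell(r))

# Hypothetical helper: an axis-aligned square as a Polygons object
# (ring given clockwise and explicitly flagged as not being a hole)
square <- function(x0, y0, x1, y1, id) {
  ring <- cbind(c(x0, x0, x1, x1, x0), c(y0, y1, y1, y0, y0))
  Polygons(list(Polygon(ring, hole = FALSE)), ID = id)
}

# Synthetic stand-in for the polygons read from the shapefile
polys <- SpatialPolygons(list(square(100, 100, 500, 500, "A"),
                              square(500, 500, 900, 900, "B")))

# One vector of raster cell values per polygon
vals <- raster::extract(r, polys)
names(vals) <- c("A", "B")

# One box per polygon
boxplot(vals, ylab = "cell value")
```

With real data, r and polys would come from reading the raster file and the
shapefile, after checking that their coordinate reference systems agree.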

If the shapefile contains points, the over query asks for the value(s) of
the raster cells (pixels) at those points, given the same coordinate
reference systems, and so yields only one boxplot. If it contains lines,
for each line you may get a vector of values from the raster cells
intersected by that line, and could make a boxplot for each line; you may
wish to weight each value by the length of line in each cell. If it
contains polygons, proceed as for lines, weighting by intersection area.
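
A sketch of the point case and of area weighting for polygons, again on
synthetic data (the raster, the points and the polygon are invented for
illustration, and all are assumed to share an undefined coordinate reference
system). For polygons, extract() with weights = TRUE returns, for each
polygon, a matrix of cell values and the approximate share of the polygon
falling in each cell. Base boxplot() takes no weights, so only a weighted
mean is shown here.

```r
library(sp)
library(raster)

# Synthetic raster whose cell values are simply the cell numbers
r <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 1000, ymn = 0, ymx = 1000)
values(r) <- seq_len(ncell(r))

# Points: one raster value per point, hence a single boxplot
pts <- SpatialPoints(cbind(x = c(50, 250, 450, 650, 850),
                           y = c(950, 750, 550, 350, 150)))
pt_vals <- raster::extract(r, pts)
boxplot(pt_vals, ylab = "cell value")

# Polygon whose edges cut through cells (ring clockwise, not a hole)
ring <- cbind(c(150, 150, 450, 450, 150), c(150, 450, 450, 150, 150))
poly <- SpatialPolygons(list(Polygons(list(Polygon(ring, hole = FALSE)),
                                      ID = "A")))

# Matrix with columns "value" and "weight" for the single polygon;
# by default the weights are normalized to sum to 1
w <- raster::extract(r, poly, weights = TRUE)[[1]]

# Area-weighted mean of the cell values under the polygon
wmean <- weighted.mean(w[, "value"], w[, "weight"])
```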

The over vignette in the sp package is where you need to go to begin:

http://cran.r-project.org/web/packages/sp/vignettes/over.pdf

and the introduction to the raster package as a further reference:

http://cran.r-project.org/web/packages/raster/vignettes/Raster.pdf

> 
> Ask again if you get stuck - but(!):
> - see here for some hints on how to ask questions productively:
>   http://adv-r.had.co.nz/Reproducibility.html
>   http://stackoverflow.com/questions/5963269/how-to-make-a-great-r-reproducible-example
> - ... and please read the posting guide and don't post in HTML.
> 

Definitely! And note that this is a question that is better suited to the
R-sig-geo list.

Hope this clarifies,

Roger

> B.
> 
> On Jun 15, 2015, at 7:19 AM, Preethi Balaji <preet.balaji20 <at> gmail.com> wrote:
> 
> > Dear all,
> > 
> > I am trying to generate boxplots by giving a shapefile and an image as
> > input. The shapefile takes the pixel values from the image and shows
> > the distribution of pixels in the form of a boxplot.
> > 
> > Can somebody please tell me how I can execute this in R?
> > 
> > Many thanks!
> > 
> > -- 
> > 
> > Regards,
> > Preethi Malur Balaji | PhD Student
> > University College Cork | Cork, Ireland.


