[R-sig-Geo] misalignment between data projected in R and ArcGIS
Helen Sofaer
helen at rams.colostate.edu
Wed Sep 25 20:23:24 CEST 2013
Dear list,
I’m having some trouble with projected ESRI shapefiles I imported into
R; they don’t align with files that I projected within R without going
through ArcGIS. I’ve tried to define the projection (Albers) as
precisely as I can using rgdal and the proj4string definitions, but
the problem persists. I don’t see this problem if I bring in the data
in NAD83, and then project in within R, so I can get a workaround that
way, but at this point I’m very curious to know what inane mistake I'm
making to mess this up.
Here’s an example: it uses data for future climate scenarios that can
be downloaded online and a state boundary shapefile from ESRI. (My
analysis actually uses shapefiles of data locations and survey area
boundaries, so I can’t just skip importing them and simply using map
data within R).
I apologize that it isn’t fully replicable for those of you that don’t
have ArcGIS (I’ve tried to include some output to compensate); I also
apologize for the length!
Can anyone tell me where I’ve gone wrong?
Thanks,
Helen
library(sqldf)
library(tcltk)
library(sp)
# website for Coulson et al. future climate data:
# http://www.fs.usda.gov/rds/archive/Product/RDS-2010-0017/
# select just one month of data from future A2 scenario, limiting to
bounding envelope of study area:
## Note that a smaller area could be imported to show the issue (but
bbox output below will differ)
CGCM.A2.May2047 = read.csv.sql("Cgcm31_a2_Dec09_2041_2050.csv",
sql = "select * from file where lati_dd > 42.4 and long_dd > -113.417
and long_dd < -96.4 and year = 2047 and month = 5",
dbname = tempfile())
# show where points are (just based on decimal degrees, before even
defining coordinate system):
library(ggplot2)
library(maps)
StateBorders = map_data('state', region = c('south dakota', 'north
dakota', 'montana'))
p = ggplot(CGCM.A2.May2047, aes(long_dd, lati_dd)) +
geom_point(aes(size = ppt_mm)) + scale_size(range = c(0,3)) +
theme_classic()
p + geom_polygon(data = StateBorders, aes(long, lat, group = group,
fill = NA, colour = "blue", alpha = 0)) + theme(legend.position =
"none")
# This plot is hideous (code to change polygon fill and border color
didn't work...) but not worrying about it b/c purpose is to show
general location of data
## Convert to SPDF and define coordinate system as NAD83 (NOTE THAT
COORDINATE SYSTEM ISN'T EXPLICIT IN METADATA, NAD83 is based on their
source data)
CGCM.A2.May2047.SPDF = SpatialPointsDataFrame(data = CGCM.A2.May2047,
coords = CGCM.A2.May2047[, c("long_dd", "lati_dd")],
proj4string = CRS(as.character("+proj=longlat
+datum=NAD83")))
is.projected(CGCM.A2.May2047.SPDF)
proj4string(CGCM.A2.May2047.SPDF)
## Project coordinates to Albers within R:
library(rgdal)
CGCM.A2.May2047.Albers = spTransform(CGCM.A2.May2047.SPDF,
CRS("+proj=aea +lat_1=20 +lat2=60 +lat0=40 +lon_0=-96 +x_0=0 +y_0=0
+datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"))
is.projected(CGCM.A2.May2047.Albers)
proj4string(CGCM.A2.May2047.Albers)
str(CGCM.A2.May2047.Albers)
#### Show the problem I'm having with the projections when I bring
shapefiles in from GIS:
# This was downloaded from ESRI:
http://www.arcgis.com/home/item.html?id=540003aa59b047d7a1f465f7b1df1950
# Took subset of 3 states of interest and projected to Albers within ArcGIS
MT.ND.SD.boundaries.Albers.viaGIS =
readOGR("C:/Helen_PPR_GIS/ForPondModel/SpatialExtent",
"StateBoundaries_MT_ND_SD_Albers")
print(proj4string(MT.ND.SD.boundaries.Albers.viaGIS))
# [1] "+proj=aea +lat_1=20 +lat_2=60 +lat_0=40 +lon_0=-96 +x_0=0
+y_0=0 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"
### THE ISSUE:
# These coordinates somehow appear to have different projections -
i.e. y coordinates aren't anywhere near each other, when they should
overlap:
bbox(MT.ND.SD.boundaries.Albers.viaGIS)
# min max
# x -1402104.5 -33543.5
# y 293949.4 1208872.9
bbox(CGCM.A2.May2047.Albers)
# min max
# long_dd -1700889 -39977.89
# lati_dd 4561044 5192583.31
## Show that ArcGIS data align ok with themselves
# I input climate data into ArcCatalog as XY table (defining
coordinate system as NAD83), projected to Albers within ArcCatalog,
and then input shapefile into R:
CGCM.A2.May2047.Albers.viaGIS =
readOGR("C:/Helen_PPR_GIS/ForPondModel/SpatialExtent",
"CGCM_A2_May2047_viaGIS_Albers")
print(proj4string(CGCM.A2.May2047.Albers.viaGIS))
# [1] "+proj=aea +lat_1=20 +lat_2=60 +lat_0=40 +lon_0=-96 +x_0=0
+y_0=0 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"
bbox(CGCM.A2.May2047.Albers.viaGIS)
# min max
# coords.x1 -1339945.8 -28821.09
# coords.x2 285437.4 1172923.00
## The two files from ArcGIS do overlap, although there are some
points that extend farther north into Canada than I would have
expected:
plot(MT.ND.SD.boundaries.Albers.viaGIS)
points(spsample(CGCM.A2.May2047.Albers.viaGIS, n = 100, type =
"random")) # easier to see with fewer points
## What if I import from GIS with the data in NAD83, rather than
already projected to Albers?
CGCM.A2.May2047.NAD83.viaGIS =
readOGR("C:/Helen_PPR_GIS/ForPondModel/SpatialExtent",
"CGCM_A2_May2047_viaGIS")
print(proj4string(CGCM.A2.May2047.NAD83.viaGIS))
bbox(CGCM.A2.May2047.NAD83.viaGIS)
# Project to Albers within R:
CGCM.A2.May2047.Albers.viaGISinNAD83 =
spTransform(CGCM.A2.May2047.NAD83.viaGIS, CRS("+proj=aea +lat_1=20
+lat2=60 +lat0=40 +lon_0=-96 +x_0=0 +y_0=0 +datum=NAD83 +units=m
+no_defs +ellps=GRS80 +towgs84=0,0,0"))
bbox(CGCM.A2.May2047.Albers.viaGISinNAD83)
# min max
# coords.x1 -1700889 -39977.89
# coords.x2 4561044 5192583.31
# this does align with the other data projected within R; so bringing
in unprojected shapefiles doesn't seem to be a problem
--
Helen Sofaer
Postdoctoral Fellow
Fish Wildlife and Conservation Biology
Colorado State University
More information about the R-sig-Geo
mailing list