[R-sig-Geo] misalignment between data projected in R and ArcGIS

Helen Sofaer helen at rams.colostate.edu
Wed Sep 25 20:23:24 CEST 2013


Dear list,

I’m having some trouble with projected ESRI shapefiles I imported into
R; they don’t align with files that I projected within R without going
through ArcGIS. I’ve tried to define the projection (Albers) as
precisely as I can using rgdal and the proj4string definitions, but
the problem persists. I don’t see this problem if I bring in the data
in NAD83, and then project in within R, so I can get a workaround that
way, but at this point I’m very curious to know what inane mistake I'm
making to mess this up.

Here’s an example: it uses data for future climate scenarios that can
be downloaded online and a state boundary shapefile from ESRI. (My
analysis actually uses shapefiles of data locations and survey area
boundaries, so I can’t just skip importing them and simply using map
data within R).

I apologize that it isn’t fully replicable for those of you that don’t
have ArcGIS (I’ve tried to include some output to compensate); I also
apologize for the length!

Can anyone tell me where I’ve gone wrong?

Thanks,

Helen



library(sqldf)

library(tcltk)

library(sp)



# website for Coulson et al. future climate data:

# http://www.fs.usda.gov/rds/archive/Product/RDS-2010-0017/

# select just one month of data from future A2 scenario, limiting to
bounding envelope of study area:

## Note that a smaller area could be imported to show the issue (but
bbox output below will differ)

CGCM.A2.May2047 = read.csv.sql("Cgcm31_a2_Dec09_2041_2050.csv",

sql = "select * from file where lati_dd > 42.4 and long_dd > -113.417
and long_dd < -96.4 and year = 2047 and month = 5",

                    dbname = tempfile())

# show where points are (just based on decimal degrees, before even
defining coordinate system):

library(ggplot2)

library(maps)

StateBorders = map_data('state', region = c('south dakota', 'north
dakota', 'montana'))

p = ggplot(CGCM.A2.May2047, aes(long_dd, lati_dd)) +
geom_point(aes(size = ppt_mm)) + scale_size(range = c(0,3)) +
theme_classic()

p + geom_polygon(data = StateBorders, aes(long, lat, group = group,
fill = NA, colour = "blue", alpha = 0)) + theme(legend.position =
"none")

# This plot is hideous (code to change polygon fill and border color
didn't work...) but not worrying about it b/c purpose is to show
general location of data



## Convert to SPDF and define coordinate system as NAD83 (NOTE THAT
COORDINATE SYSTEM ISN'T EXPLICIT IN METADATA, NAD83 is based on their
source data)

CGCM.A2.May2047.SPDF = SpatialPointsDataFrame(data = CGCM.A2.May2047,

                    coords = CGCM.A2.May2047[, c("long_dd", "lati_dd")],

                    proj4string = CRS(as.character("+proj=longlat
+datum=NAD83")))

is.projected(CGCM.A2.May2047.SPDF)

proj4string(CGCM.A2.May2047.SPDF)



## Project coordinates to Albers within R:

library(rgdal)

CGCM.A2.May2047.Albers = spTransform(CGCM.A2.May2047.SPDF,
CRS("+proj=aea +lat_1=20 +lat2=60 +lat0=40 +lon_0=-96 +x_0=0 +y_0=0
+datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"))

is.projected(CGCM.A2.May2047.Albers)

proj4string(CGCM.A2.May2047.Albers)

str(CGCM.A2.May2047.Albers)



#### Show the problem I'm having with the projections when I bring
shapefiles in from GIS:

# This was downloaded from ESRI:
http://www.arcgis.com/home/item.html?id=540003aa59b047d7a1f465f7b1df1950

# Took subset of 3 states of interest and projected to Albers within ArcGIS

MT.ND.SD.boundaries.Albers.viaGIS =
readOGR("C:/Helen_PPR_GIS/ForPondModel/SpatialExtent",
"StateBoundaries_MT_ND_SD_Albers")

print(proj4string(MT.ND.SD.boundaries.Albers.viaGIS))

# [1] "+proj=aea +lat_1=20 +lat_2=60 +lat_0=40 +lon_0=-96 +x_0=0
+y_0=0 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"



### THE ISSUE:

# These coordinates somehow appear to have different projections -
i.e. y coordinates aren't anywhere near each other, when they should
overlap:

bbox(MT.ND.SD.boundaries.Albers.viaGIS)

#         min       max

# x -1402104.5  -33543.5

# y   293949.4 1208872.9

bbox(CGCM.A2.May2047.Albers)

#             min        max

# long_dd -1700889  -39977.89

# lati_dd  4561044 5192583.31



## Show that ArcGIS data align ok with themselves

# I input climate data into ArcCatalog as XY table (defining
coordinate system as NAD83), projected to Albers within ArcCatalog,
and then input shapefile into R:

CGCM.A2.May2047.Albers.viaGIS =
readOGR("C:/Helen_PPR_GIS/ForPondModel/SpatialExtent",
"CGCM_A2_May2047_viaGIS_Albers")

print(proj4string(CGCM.A2.May2047.Albers.viaGIS))

# [1] "+proj=aea +lat_1=20 +lat_2=60 +lat_0=40 +lon_0=-96 +x_0=0
+y_0=0 +datum=NAD83 +units=m +no_defs +ellps=GRS80 +towgs84=0,0,0"

bbox(CGCM.A2.May2047.Albers.viaGIS)

#                 min        max

# coords.x1 -1339945.8  -28821.09

# coords.x2   285437.4 1172923.00



## The two files from ArcGIS do overlap, although there are some
points that extend farther north into Canada than I would have
expected:

plot(MT.ND.SD.boundaries.Albers.viaGIS)

points(spsample(CGCM.A2.May2047.Albers.viaGIS, n = 100, type =
"random")) # easier to see with fewer points



## What if I import from GIS with the data in NAD83, rather than
already projected to Albers?

CGCM.A2.May2047.NAD83.viaGIS =
readOGR("C:/Helen_PPR_GIS/ForPondModel/SpatialExtent",
"CGCM_A2_May2047_viaGIS")

print(proj4string(CGCM.A2.May2047.NAD83.viaGIS))

bbox(CGCM.A2.May2047.NAD83.viaGIS)

# Project to Albers within R:

CGCM.A2.May2047.Albers.viaGISinNAD83 =
spTransform(CGCM.A2.May2047.NAD83.viaGIS, CRS("+proj=aea +lat_1=20
+lat2=60 +lat0=40 +lon_0=-96 +x_0=0 +y_0=0 +datum=NAD83 +units=m
+no_defs +ellps=GRS80 +towgs84=0,0,0"))

bbox(CGCM.A2.May2047.Albers.viaGISinNAD83)

#               min        max

# coords.x1 -1700889  -39977.89

# coords.x2  4561044 5192583.31

# this does align with the other data projected within R; so bringing
in unprojected shapefiles doesn't seem to be a problem


--
Helen Sofaer
Postdoctoral Fellow
Fish Wildlife and Conservation Biology
Colorado State University



More information about the R-sig-Geo mailing list