[R-sig-Geo] Translate a net-cdf file with different years to polygon regions.
Stachelek, Joseph
jstachel at sfwmd.gov
Tue Apr 19 14:42:51 CEST 2016
Since your nc file contains multiple layers, you will want to use `raster::stack()` rather than `raster::raster()`.
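For example (a sketch, guarded so it only runs when the file and the `raster` package are actually present):

```r
# each NetCDF time step becomes one layer of the stack (file name from the thread)
fname <- "cru_ts3.23.2001.2010.pet.dat.nc"
if (file.exists(fname) && requireNamespace("raster", quietly = TRUE)) {
  pet <- raster::stack(fname, varname = "pet")
  raster::nlayers(pet)  # one layer per month, 2001-2010
}
```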
-----Original Message-----
From: Stachelek, Joseph
Sent: Tuesday, April 19, 2016 8:23 AM
To: 'Janka VANSCHOENWINKEL' <janka.vanschoenwinkel at uhasselt.be>; r-sig-geo at r-project.org
Subject: RE: [R-sig-Geo] Translate a net-cdf file with different years to polygon regions.
Hi Janka,
I think you can simplify your code a lot by opening your nc file directly using the `raster` package rather than messing with `nc_open` calls.
```
ncin <- raster::raster(paste(ncname, ".nc", sep = ""))
```
Then you might use `raster::extract()` to pull the values associated with your polygons. Also, I would recommend posting a link to a gist (https://gist.github.com/) rather than pasting such a long script into your email.
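To illustrate what `raster::extract()` does with polygons, here is a toy, self-contained sketch (not your CRU data; it runs only if the `raster` package is installed):

```r
# Toy example: mean of raster cells falling inside a polygon
res <- NULL
if (requireNamespace("raster", quietly = TRUE)) {
  library(raster)
  r <- raster(nrows = 10, ncols = 10, xmn = 0, xmx = 10, ymn = 0, ymx = 10)
  values(r) <- 1:100
  # a 5x5 square polygon in the lower-left corner
  p <- spPolygons(rbind(c(0, 0), c(0, 5), c(5, 5), c(5, 0)))
  res <- extract(r, p, fun = mean, na.rm = TRUE)
}
```

With your data, `r` would be the stack read from the .nc file and `p` the NUTS3 polygons, giving one summary value per polygon per monthly layer.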
Joe
-----Original Message-----
From: R-sig-Geo [mailto:r-sig-geo-bounces at r-project.org] On Behalf Of Janka VANSCHOENWINKEL
Sent: Tuesday, April 19, 2016 4:45 AM
To: r-sig-geo at r-project.org
Subject: [R-sig-Geo] Translate a net-cdf file with different years to polygon regions.
*DATA*
I have a NetCDF file with a raster of 0.5 x 0.5 degree resolution. You can
easily download it here
<https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_3.23/cruts.1506241137.v3.23/pet/>
by searching for *cru_ts3.23.2001.2010.pet.dat.nc.gz* (it is also downloadable as
a .dat file if that is handier to work with), or simply download the NetCDF
file directly:
cru_ts3.23.2001.2010.pet.dat.nc.gz
<https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_3.23/cruts.1506241137.v3.23/pet/cru_ts3.23.2001.2010.pet.dat.nc.gz>
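The download and decompression can also be done from R; a sketch (the flag keeps it from fetching the file unintentionally, and `R.utils::gunzip()` is assumed to be available):

```r
# Fetch and decompress the CRU file from R (URL as given in this post)
base_url <- "https://crudata.uea.ac.uk/cru/data/hrg/cru_ts_3.23/cruts.1506241137.v3.23/pet"
gz_name  <- "cru_ts3.23.2001.2010.pet.dat.nc.gz"
nc_name  <- sub("\\.gz$", "", gz_name)   # name of the decompressed file
do_download <- FALSE                      # set TRUE to actually fetch it
if (do_download && !file.exists(nc_name)) {
  download.file(file.path(base_url, gz_name), gz_name, mode = "wb")
  R.utils::gunzip(gz_name)
}
```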
I opened the file in ArcMap as well and found that the coordinate system
used is GCS_WGS_1984. The NetCDF file contains monthly data for 2001-2010.
Download from this website
<http://ec.europa.eu/eurostat/web/gisco/geodata/reference-data/administrative-units-statistical-units/nuts>
the following ZIP file: *NUTS_2006_60M_SH.zip* and save all the *NUTS_RG_60M_2006
files* in a folder where you can easily find them again. In what follows I
refer to this as "NUTS3".
*WHAT I WANT*
- I want to add the information from the raster layers to the NUTS3
shapefile (which contains polygons, not rasters) in order to obtain a table
per NUTS3 region for each monthly variable.
*WHERE I AM STUCK*
The file appears to be very difficult to work with in ArcGIS. Also, I will
have to repeat this process a number of times for different variables, so I
would like a single R script that I can run.
I have a number of separate scripts that do the following:
- translate the NetCDF file to a CSV file with longitude and latitude as
identifiers (see 1 below)
- translate a CSV file with an accompanying empty raster file to NUTS3
regions (a script I have used before when I had a CSV file and a raster;
see 2 below).
However, I don't have a separate raster file now. Well, technically I do (the
NetCDF file is a raster), but I don't know how to use it in this format.
Can somebody help me to link the codes below or suggest a different code to
obtain what I want?
Thanks a lot!
Janka
*1) With the following code, I can make a CSV file and extract all the data
in table format.*
library(fields)
library(chron)
library(ncdf4)
ncname <- "cru_ts3.23.2001.2010.pet.dat"
ncfname <- paste(ncname, ".nc", sep = "")
dname <- "pet"
ncin <- nc_open(ncfname)
print(ncin)
lon <- ncvar_get(ncin, "lon")
nlon <- dim(lon)
head(lon)
lat <- ncvar_get(ncin, "lat", verbose = F)
nlat <- dim(lat)
head(lat)
print(c(nlon, nlat))
t <- ncvar_get(ncin, "time")
tunits <- ncatt_get(ncin, "time", "units")
nt <- dim(t)
tmp.array <- ncvar_get(ncin, dname)
dlname <- ncatt_get(ncin, dname, "long_name")
dunits <- ncatt_get(ncin, dname, "units")
fillvalue <- ncatt_get(ncin, dname, "_FillValue")
dim(tmp.array)
title <- ncatt_get(ncin, 0, "title")
institution <- ncatt_get(ncin, 0, "institution")
datasource <- ncatt_get(ncin, 0, "source")
references <- ncatt_get(ncin, 0, "references")
history <- ncatt_get(ncin, 0, "history")
Conventions <- ncatt_get(ncin, 0, "Conventions")
nc_close(ncin)
# split the time units string into fields
tustr <- strsplit(tunits$value, " ")
tdstr <- strsplit(unlist(tustr)[3], "-")
tmonth = as.integer(unlist(tdstr)[2])
tday = as.integer(unlist(tdstr)[3])
tyear = as.integer(unlist(tdstr)[1])
chron(t, origin = c(tmonth, tday, tyear))
tmp.array[tmp.array == fillvalue$value] <- NA
length(na.omit(as.vector(tmp.array[, , 1])))
m <- 1
tmp.slice <- tmp.array[, , m]
library(RColorBrewer)
library(lattice)  # needed for levelplot()
image(lon, lat, tmp.slice, col = rev(brewer.pal(10, "RdBu")))
grid <- expand.grid(lon = lon, lat = lat)
cutpts <- c(-50, -40, -30, -20, -10, 0, 10, 20, 30, 40, 50)
levelplot(tmp.slice ~ lon * lat, data = grid, at = cutpts, cuts = 11,
          pretty = TRUE,
          col.regions = rev(brewer.pal(10, "RdBu")))
lonlat <- expand.grid(lon, lat)
tmp.vec <- as.vector(tmp.slice)
length(tmp.vec)
tmp.df01 <- data.frame(cbind(lonlat, tmp.vec))
names(tmp.df01) <- c("lon", "lat", paste(dname, as.character(m), sep = "_"))
head(na.omit(tmp.df01), 20)
csvfile <- "cru_tmp_1.csv"
write.table(na.omit(tmp.df01), csvfile, row.names = FALSE, sep = ",")
tmp.vec.long <- as.vector(tmp.array)
length(tmp.vec.long)
tmp.mat <- matrix(tmp.vec.long, nrow = nlon * nlat, ncol = nt)
dim(tmp.mat)
head(na.omit(tmp.mat))
lonlat <- expand.grid(lon, lat)
tmp.df02 <- data.frame(cbind(lonlat, tmp.mat))
names(tmp.df02) <- c("lon","lat","pet_jan_2001",
"pet_feb_2001",
"pet_mar_2001",
"pet_apr_2001",
"pet_may_2001",
"pet_jun_2001",
"pet_jul_2001",
"pet_aug_2001",
"pet_sep_2001",
"pet_oct_2001",
"pet_nov_2001",
"pet_dec_2001",
"pet_jan_2002",
"pet_feb_2002",
"pet_mar_2002",
"pet_apr_2002",
"pet_may_2002",
"pet_jun_2002",
"pet_jul_2002",
"pet_aug_2002",
"pet_sep_2002",
"pet_oct_2002",
"pet_nov_2002",
"pet_dec_2002",
"pet_jan_2003",
"pet_feb_2003",
"pet_mar_2003",
"pet_apr_2003",
"pet_may_2003",
"pet_jun_2003",
"pet_jul_2003",
"pet_aug_2003",
"pet_sep_2003",
"pet_oct_2003",
"pet_nov_2003",
"pet_dec_2003",
"pet_jan_2004",
"pet_feb_2004",
"pet_mar_2004",
"pet_apr_2004",
"pet_may_2004",
"pet_jun_2004",
"pet_jul_2004",
"pet_aug_2004",
"pet_sep_2004",
"pet_oct_2004",
"pet_nov_2004",
"pet_dec_2004",
"pet_jan_2005",
"pet_feb_2005",
"pet_mar_2005",
"pet_apr_2005",
"pet_may_2005",
"pet_jun_2005",
"pet_jul_2005",
"pet_aug_2005",
"pet_sep_2005",
"pet_oct_2005",
"pet_nov_2005",
"pet_dec_2005",
"pet_jan_2006",
"pet_feb_2006",
"pet_mar_2006",
"pet_apr_2006",
"pet_may_2006",
"pet_jun_2006",
"pet_jul_2006",
"pet_aug_2006",
"pet_sep_2006",
"pet_oct_2006",
"pet_nov_2006",
"pet_dec_2006",
"pet_jan_2007",
"pet_feb_2007",
"pet_mar_2007",
"pet_apr_2007",
"pet_may_2007",
"pet_jun_2007",
"pet_jul_2007",
"pet_aug_2007",
"pet_sep_2007",
"pet_oct_2007",
"pet_nov_2007",
"pet_dec_2007",
"pet_jan_2008",
"pet_feb_2008",
"pet_mar_2008",
"pet_apr_2008",
"pet_may_2008",
"pet_jun_2008",
"pet_jul_2008",
"pet_aug_2008",
"pet_sep_2008",
"pet_oct_2008",
"pet_nov_2008",
"pet_dec_2008",
"pet_jan_2009",
"pet_feb_2009",
"pet_mar_2009",
"pet_apr_2009",
"pet_may_2009",
"pet_jun_2009",
"pet_jul_2009",
"pet_aug_2009",
"pet_sep_2009",
"pet_oct_2009",
"pet_nov_2009",
"pet_dec_2009",
"pet_jan_2010",
"pet_feb_2010",
"pet_mar_2010",
"pet_apr_2010",
"pet_may_2010",
"pet_jun_2010",
"pet_jul_2010",
"pet_aug_2010",
"pet_sep_2010",
"pet_oct_2010",
"pet_nov_2010",
"pet_dec_2010")
options(width = 110)
head(na.omit(tmp.df02), 20)
dim(na.omit(tmp.df02))
csvfile <- "cru_tmp_2.csv"
write.table(na.omit(tmp.df02), csvfile, row.names = FALSE, sep = ",")
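For comparison, most of section 1 collapses to a few lines with the `raster` package. A sketch (the output file name `cru_pet.csv` is mine, and the raster part is guarded since it needs the .nc file on disk; the date arithmetic assumes the CRU time units are "days since 1900-1-1", which `tunits$value` should confirm):

```r
# Base R handles the CRU time axis without chron:
origin <- as.Date("1900-01-01")
origin + 36890            # e.g. offset 36890 -> "2001-01-01"

# Layer names of the form pet_jan_2001 .. pet_dec_2010:
layer_names <- paste("pet",
                     rep(tolower(month.abb), times = 10),
                     rep(2001:2010, each = 12),
                     sep = "_")

fname <- "cru_ts3.23.2001.2010.pet.dat.nc"
if (file.exists(fname) && requireNamespace("raster", quietly = TRUE)) {
  library(raster)
  pet <- stack(fname, varname = "pet")
  names(pet) <- layer_names
  # lon/lat plus one column per month, NA cells dropped
  pet_df <- as.data.frame(pet, xy = TRUE, na.rm = TRUE)
  write.csv(pet_df, "cru_pet.csv", row.names = FALSE)
}
```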
*2) Translate a CSV file with an accompanying raster file to polygon regions.*
The "filename.txt" file should contain the variables lon, lat, and
all the monthly/yearly variables extracted in point 1 above.
The grid shapefile (*grid_025dd.shp*) can be found via the following
link, but it is only an example, not the correct grid for the problem
above:
https://drive.google.com/folderview?id=0By9u5m3kxn9yfjZtdFZLcW82SWpzT1VwZXE1a3FtRGtSdEl1c1NvY205TGpack9xSFc2T2s&usp=sharing
# read the data
mydata <- read.table("filename.txt", header = TRUE, sep = ",", dec = ".")
# read the empty raster grid shapefile
library(rgdal)
# ~40 seconds
grid <- readOGR(".", layer = "grid_025dd")
# concatenate data in R
# 2 seconds
mydata$lonlat <- do.call(paste, c(mydata[c("lon", "lat")], sep = ""))
grid@data$lonlat <- do.call(paste, c(grid@data[c("LONGITUDE", "LATITUDE")],
                                     sep = ""))
# use common variable lonlat to merge data in raster
###### prepare shapefile #####
library(rgdal) ## Load geographic info
library(maps) ## Projections
library(maptools) ## Data management
#library(sm) ## Data management
library(spdep) ## Spatial autocorrelation
library(gstat) ## Geostatistics
library(splancs) ## Kernel Density
library(spatstat) ## Geostatistics
library(pgirmess) ## Spatial autocorrelation
library(RColorBrewer) ## Visualization
library(classInt) ## Class intervals
library(spgwr) ## GWR
# Match polygons with data
idx <- match(grid$lonlat, mydata$lonlat)
# Places without information
idxNA <- which(is.na(idx))
# Information to be added to the SpatialPolygons object
dat2add <- mydata[idx, ]
# spCbind uses row names to match polygons with data
# First, extract polygon IDs
IDs <- sapply(grid@polygons, function(x) x@ID)
# and join with the SpatialPolygons
row.names(dat2add) <- IDs
datPols <- spCbind(grid, dat2add)
# Drop those places without information (guard against an empty index)
if (length(idxNA) > 0) datPols <- datPols[-idxNA, ]
# write new shapefile
# 7 seconds
writeOGR(datPols, dsn = ".", layer ='sm2000eu28', driver = 'ESRI Shapefile')
# read new shapefile
# 51 seconds
data <- readOGR(".", layer="sm2000eu28")
############################
# intersect nuts with grid #
############################
library(rgdal)
nuts <- readOGR(".", layer = "NUTS_RG_60M_2006")
library(rgeos)
proj4string(data) <- CRS("+proj=longlat +ellps=WGS84 +datum=WGS84")
#
grid <- data
grid at data$lonlat <- NULL
grid at data$lonlat_1 <- NULL
grid at data$ID <- NULL
grid at data$lat <- NULL
grid at data$lon <- NULL
grid at data$ELEVATION <- NULL
grid at data$DAYS_RAIN_ <- NULL
# First find out which grid cells intersect your NUTS polygons
grid_nuts <- gIntersects(grid, nuts, byid = TRUE)
# Use apply() to calculate the mean, min, and max of each value.
# The loop below does this for every column of grid@data:
for (i in names(grid@data)) {
  nuts@data[[paste(i, "average_value", sep = "_")]] <-
    apply(grid_nuts, 1, function(x) mean(grid@data[[i]][x]))
  nuts@data[[paste(i, "min_value", sep = "_")]] <-
    apply(grid_nuts, 1, function(x) min(grid@data[[i]][x]))
  nuts@data[[paste(i, "max_value", sep = "_")]] <-
    apply(grid_nuts, 1, function(x) max(grid@data[[i]][x]))
}
write.table(nuts@data, "nuts_sm2000eu28_unweighted.txt", sep = "\t")