[R-sig-Geo] Automate the extraction of climate variable at different time depths from netcdf in r

rubenfcasal rubenfcasal at gmail.com
Mon Sep 8 12:23:23 CEST 2014


Hello,

     I use the attached script to extract climate data from the OISST 
sea surface temperature netCDF "sst.wkmean.1990-present.nc". This script 
is a slight modification of one written by Luke Miller (downloaded from 
his web page http://lukemiller.org, some years ago...  ) to convert the 
data to sp/spacetime classes (easily to handle, subset,...).

     I just realised that Luke Miller has updated the original script. 
See: 
http://lukemiller.org/index.php/2014/01/noaa-oisst-v2-high-resolution-daily-sea-surface-temperatures-with-r/

     I hope it helps...

     Best Regards, Ruben FC


El 06/09/2014 19:10, Barnabas Daru escribió:
> Dear all,
> I write to seek help with extracting climate data at various times and
> depth from NETCDF in R.
> The climate data is sea surface temperature monthly means from 1854 to
> 2014.
> My goal is to automate the extraction of SST climate data for a set of
> points with different time depths.
>
> I have successfully used the 'brick' function in the raster package to get
> climate data at various time depths as follows:
>
> b <- brick("~sst.mnmean.nc", varname="sst")
>
> mydate <- b[["X1869.09.01"]] # SST for September in 1869
>
> mydata <- read.csv("~Species one.csv")
>
> mydataSPDF <- SpatialPointsDataFrame(mydata[,5:6],data.frame(mydata))
>
> extract.mydata <- extract(mydate, mydataSPDF, sp=TRUE)
>
> extract.mydata <- data.frame(extract.mydata)
>
> write.csv(extract.mydata, file = "Species_one_extracted.csv")
>
> My major challenge is that I have to keep extracting manually for each time
> depth, then copy and paste in the main dataframe and the data has about
> 3000 observations, meaning I have to keep copying and pasting ~3000 times
> for one species!
> Is there a way I can iterate the process in R to loop through each month's
> climate and extract the data for each point and time depth rather than to
> do it manually?
>
> Here's how my data frame looks like:
>    Date year month day lon lat SST  1855-01-01 1855 1 1 11.6861 57.7254
> 3.440000057  1859-07-26 1859 7 26 25.46122 60.26766 13.25  1860-01-01 1860 1
> 1 10.9154 53.9484 15.10999966  1861-07-10 1861 7 10 7.79588 58.08673
> 15.71000004  1861-07-26 1861 7 26 -2.84072 54.0778  1861-08-17 1861 8 17
> 11.9792 57.51298  1862-01-01 1862 1 1 22.20467 60.27955  1862-08-05 1862 8 5
> 11.78316 57.61649  1862-08-23 1862 8 23 11.9792 57.51298  1863-08-26 1863 8
> 26 10.72237 59.97258  1863-08-28 1863 8 28 15.53721 56.20091  1863-09-22
> 1863 9 22 16.37849 56.84255  1864-07-05 1864 7 5 -4.07097 51.09542
> 1864-07-18 1864 7 18 -4.21368 51.0928  1864-08-26 1864 8 26 -4.07097
> 51.09542  1865-09-07 1865 9 7 10.76144 59.91271  1865-09-27 1865 9 27
> 10.53107 60.43395  1867-08-28 1867 8 28 12.82669 55.86822  1868-01-01 1868 1
> 1 12.8944 56.6897  1868-01-01 1868 1 1 4.82685 53.13793
> Thanks and kind regards
> Barnabas Daru
>

-------------- next part --------------
# Filename: NOAA_OISST_extraction.R
# 
# Author: Luke Miller   Mar 1, 2011
# http://lukemiller.org/
# Modified by: Ruben Fernandez-Casal Ap 20, 2012 (sp, spacetime classes)
###############################################################################
# library(sp)
library(spacetime)
library(ncdf)  # NetCDF (Network Common Data Form) http://www.unidata.ucar.edu/software/netcdf/

# For info on the NOAA Optimum Interpolated Sea Surface Temperature V2 (OISST):
# http://www.esrl.noaa.gov/psd/data/gridded/data.noaa.oisst.v2.html

# This script works on the OISST sea surface temperature netCDF file called 
# sst.wkmean.1990-present.nc, which must be downloaded into the current 
# working directory (file size is >135MB currently)

# The OISST grid layout is 1 degree latitudes from 89.5 to -89.5 degrees north.
# The longitude grid is 1 degree bins, 0.5 to 359.5 degrees east.
# The SST values are given in degrees Celsius.
# Time values are "days since 1800-1-1 00:00", starting from 69395
# Time delta_t is 0000-00-07 00:00, indices range from 0 to 1103 currently
# The time values in the sst field give the 1st day of the week for each 
# weekly temperature value (though the data are meant to be centered on 
# Wednesday of that week), starting at 69395 = 12/31/1989.
# Use as.Date(69395,origin = '1800-1-1') to convert the netCDF day value to a 
# human readable form. 

lats = seq(89.5,-89.5,-1) #generate set of grid cell latitudes (center of cell)
lons = seq(0.5,359.5,1) #generate set of grid cell longitudes (center of cell)

##Ask user to enter the boundaries of the search area
#cat("Enter longitude of western edge of search area (degrees east 0 to 359)\n")
#lon1 = scan(what = numeric(0),n = 1)
#cat("Enter longitude of eastern edge of search area (degrees east 0 to 359)\n")
#lon2 = scan(what = numeric(0),n = 1)
#cat("Enter latitude of northern edge\n")
#cat("of search area (degrees north, 89.5 to -89.5\n")
#lat1 = scan(what = numeric(0),n = 1)
#cat("Enter latitude of southern edge\n")
#cat("of search area (degrees north, 89.5 to -89.5\n")
#lat2 = scan(what = numeric(0),n = 1)

lon1 = 0
lon2 = 360
lat1 = 90
lat2 = -90

#lon1 = -11 + 360
#lon2 = -7 + 360
#lat1 = 45
#lat2 = 40

lon1a = which.min(abs(lon1 - lons)) #get index of nearest longitude value
lon2a = which.min(abs(lon2 - lons)) #get index of nearest longitude value
lat1a = which.min(abs(lat1 - lats)) #get index of nearest latitude value
lat2a = which.min(abs(lat2 - lats)) #get index of nearest latitude value
#The lon/lat 1a/2a values should now correspond to indices in the netCDF file
#for the desired grid cell. 
nlon = (lon2a - lon1a) + 1 #get number of longitudes to extract
nlat = (lat2a - lat1a) + 1 #get number of latitudes to extract

##Ask the user to enter the dates of interest
#cat("Enter the starting date in the form: 1990-1-31\n")
#date1 = scan(what = character(0),n = 1)
#cat("Enter the ending date in the form: 1990-1-31\n")
#date2 = scan(what = character(0),n = 1)

date1 = '2012-04-15'  # date1 = '2012-02-05'
date2 = '2012-04-15'

date1 = as.Date(date1, format = "%Y-%m-%d") #convert to Date object
date2 = as.Date(date2, format = "%Y-%m-%d") #convert to Date object

#Open the netCDF file for reading. 
nc = open.ncdf("sst.wkmean.1990-present.nc")
#print.ncdf(nc) will show info about the structure of the netCDF file

#Extract available dates from netCDF file
ncdates = nc$dim$time$vals
ncdates = as.Date(ncdates,origin = '1800-1-1') #available time points in nc

date1a = which.min(abs(date1 - ncdates)) #get index of nearest time point
date2a = which.min(abs(date2 - ncdates)) #get index of nearest time point
ndates = (date2a - date1a) + 1 #get number of time steps to extract

#Extract the data from the netCDF file to a matrix or array called 'sstout'. 
sstout = get.var.ncdf(nc, varid = 'sst', start = c(lon1a,lat1a,date1a),
		count = c(nlon,nlat,ndates))
#If you only retrieve one time point, sstout will be a 2D matrix with 
#longitudes running down the rows, and latitudes across the columns. 
#For example, Column 1, sstout[1:nrow(sstout),1], will contain sst values for 
#each of the longitude values at the northern-most latitude in your search 
#area, with the first row, sstout[1,1], being the western-most longitude, and 
#the last row being the eastern edge of your search area.
#If you retrieve multiple time points, sstout will be a 3D array, where time is
#the 3rd dimension. Lat/lon will be arranged the same as the 2D case. 

#The vector 'datesout' will hold the Date values associated with each set of 
#sst values in the sstout array, should you need to access them.
datesout = ncdates[date1a:date2a]

###############################################################################
###############################################################################
# The NOAA OISST files contain sea surface temperatures for the entire globe,
# including on the continents. This clearly isn't right, so they also supply a
# land-sea mask file in netCDF format at the website listed at the start of 
# this script. 
# We use the values (0 or 1) stored in the mask file to turn all of the 
# continent areas into NA's. 

nc2 = open.ncdf('lsmask.nc') #open land-sea mask
mask = get.var.ncdf(nc2, varid = "mask",start = c(lon1a,lat1a,1),
		count = c(nlon,nlat,1)) #get land-sea mask values (0 or 1)

mask = ifelse(mask == 0,NA,1) #replace 0's with NA's

goodlons = lons[lon1a:lon2a]
goodlats = lats[lat1a:lat2a]
index <- goodlons > 191 # Atlantic "view"
# index <- FALSE for Pacific "view"
goodlons[index] <- goodlons[index] - 360

llCRS <- CRS("+proj=longlat +ellps=WGS84")
cellcentre <- c(min(goodlons),min(goodlats))
cellsize <- c(1,1)
sstout.grid <- GridTopology( cellcentre.offset=cellcentre, cellsize=cellsize, cells.dim=c(length(goodlons),length(goodlats)))

jet.colors <- colorRampPalette(c("#00007F", "blue", "#007FFF", "cyan", "#7FFF7F", "yellow", "#FF7F00", "red", "#7F0000"))

if (is.matrix(sstout)) { 
#if there is only 1 time point (2D matrix) -> sp::SpatialGridDataFrame
	sstout = sstout * mask #all masked values become NA	
  sstdf <- data.frame(sst=as.numeric(rbind(sstout[index,],sstout[!index,])))
  attr(sstdf,"label") <- datesout
  sstsp <-  SpatialGridDataFrame(sstout.grid, sstdf, proj4string=llCRS)
  # save(sstsp,file="sstsp.rda")
  image(sstsp, col=jet.colors(128), axes=TRUE)
  title(attr(sstsp at data,"label"))
  # Importar datos de paquetes
  #  library(maps)
  #  library(maptools)
  #  world <- map("world", fill=TRUE, plot = FALSE)   # Hay un mapa con mayor resolución en mapdata::worldHires
  #  world_pol <- map2SpatialPolygons(world, world$names, CRS("+proj=longlat +ellps=WGS84"))
  #  plot(world_pol, col='white', add=TRUE)
} else {
#if ssout is a 3D matrix  -> spacetime::STFDF
	dims = dim(sstout)
	sstdf <- c()
	for (i in 1:dims[3]){
		sstout[,,i] = sstout[,,i] * mask #all masked values become NA
		sstdf <- cbind(sstdf, as.numeric(rbind(sstout[index,,i],sstout[!index,,i])))
	}
  stfdf <- STFDF(SpatialGrid(sstout.grid, proj4string=llCRS), as.POSIXct(datesout), 
                data.frame(sst=as.vector(sstdf)))
  # save(stfdf,file="sstst.rda")
  library(lattice)
  stplot(stfdf[,c(1,2,dims[3]-1,dims[3])],col.regions=jet.colors(128))

}


More information about the R-sig-Geo mailing list