[R-sig-Geo] Suggestions for irregular grid netCDF data?

Michael Treglia mtreglia at gmail.com
Fri Mar 3 05:11:07 CET 2017


Hi again,

Just to follow-up - got this working without too much issue... as of now, I
have the data values associated w/ the center lat/long of each grid cell.
There are also lat/long vertices for polygons associated with each grid
cell, but that'll take more working through it to get it together. However,
for now this largely suits my needs, and for now I can approximate the grid
cells using voronoi polygons for the grid cell centroids.

Here's some working code [I stop at creating dataframes, but going the next
steps for spatial objects is pretty easy - either sp or simple features
(still learning the latter)].

Thanks again!
Mike

#Sample Code
#Gets netCDF data for lat/long points
library(ncdf4)

#Connect to netCDF
nyhops.nc <- nc_open("
http://colossus.dl.stevens-tech.edu/thredds/dodsC/LISS/Hindcast/all_nb_mon_81.nc
")

##To get 'simple', 3D variale (e.g., depth)
#Load depth, lat, and long
depth.nc <- ncvar_get(nyhops.nc, 'depth')
lat.nc <- ncvar_get(nyhops.nc, 'lat')
long.nc <- ncvar_get(nyhops.nc, 'lon')

#Create dataframe
depth.df <- data.frame(lat=as.vector(lat.nc), long=as.vector(long.nc),
depth=as.vector(depth.nc))

#Remove NA values
depth.df <- depth.df[!is.na(depth.df$depth),]
#Can then convert to spatial object, export as table, etc.

##Example getting data involving a vertical horizon dimension and a date
dimension
#Fetch the temperature data; start is the starting position for x, y,
sigma, date; count is how many
# here, all rows & columns, and one horizon (horizon 3) and date (394)
temp.nc <- ncvar_get(nyhops.nc, 'temp', start=c(1,1,3,394),
count=c(147,452,1,1))

#Can Find out how many dimensions and entries:
nyhops.nc$var[['temp']]$varsize
#Output: [1] 147 452  11 394
#147 rows; 452 columns; 11 horizons; 394 dates/times

#Create dataframe w/ lat, long, temp
temp.df <- data.frame(cbind(long=as.vector(long.nc), lat=as.vector(lat.nc),
temp=as.vector(temp.nc)))

#remove NA rows for temp
temp.df <- temp.df[!is.na(temp.df$temp),] #Can conver to spatial object,
export as table, etc.

##Can plot, just as a non-spatial object for now.
#assign color ramp
rbPal <- colorRampPalette(c('blue',  'red'))

#Adds a column of color values based on the temp values
temp.df$Col <- rbPal(10)[as.numeric(cut(temp.df$temp,breaks = 10))]

plot(temp.df$long, temp.df$lat, col=temp.df$Col)


##Next step - use lon_vertices and lat_vertices to get polygons associated
with each point


On Thu, Mar 2, 2017 at 8:58 AM, Michael Treglia <mtreglia at gmail.com> wrote:

> Thanks Mike!
>
> This is helpful, so far at least to see/think about different ways to work
> with the data, as I wade through this structure and understand how
> different R packages handle it.
>
> I'll give things a go, and follow up, and will keep an eye out for any
> other suggestions. And if you catch anything I'm missing if you go through
> the example code, I'm all ears.  This isn't pressing on my end, but working
> through this dataset (and similarly structured ones) will ultimately be
> really useful.
>
> Thanks again,
> Mike T
>
> Please pardon any typos, this message was sent from a mobile device.
>
> On Mar 1, 2017 11:43 PM, "Michael Sumner" <mdsumner at gmail.com> wrote:
>
>> Hi there,
>>
>> my general advice is
>>
>> * use raster to read the data layers (including the coordinates, which in
>> this context are just data)
>> * treat each layer "index space", i.e. setExtent(r, extent(0, nrow(x), 0,
>> ncol(x)) (or possibly transpose for some NetCDF files)
>> * use raster's cell-index logic to do extractions and aggregations in
>> that index space, it's neat because cropping inherits the parent
>> georeference so it's "global"
>> * use resampling to put real-world data objects into that index space for
>> extractions (you have to worry about the resolution and the boundary to do
>> that)
>>
>> I have a few R packages that do this for ROMS and CMIP5 that can be
>> generally applied, but there not in great shape atm.
>>
>> This rough document shows some of what you can do, it was written for a
>> particular purpose I had exploring leaflet which is on the backburner for
>> now, so it doesn't really go anywhere.
>>
>> http://rpubs.com/cyclemumner/roms0
>>
>> I'm happy to help you get what you need though, we deal with scads of
>> this kind of data and it doesn't fit any good mould except the "code it
>> yourself" club. It's not a huge amount of code but the "many coordinate
>> systems" can be bewildering and it's really specifically bound to the
>> intricacies of raster and sp and rgdal and ncdf4.
>>
>> I'll try to look at your examples sometime soon.
>>
>> Cheers, Mike.
>>
>>
>> On Thu, 2 Mar 2017 at 14:53 Michael Treglia <mtreglia at gmail.com> wrote:
>>
>>> ​Hi all,
>>>
>>> I've been trying to work with a somewhat complex netCDF dataset, and
>>> wanted
>>> to see if anybody has suggestions.
>>>
>>> The data are for marine variables in the form of a remote netCDF served
>>> through OPENDAP. One of the complexities is that it comes in an irregular
>>> grid, as fine as 25m and as coarse as 400m (coarser as you move further
>>> from the coast) - this variable spacing makes it tough to work with in
>>> the
>>> raster package, which seems like it would be the easiet.  Is there a
>>> straightforward, relatively simple way to work with a dataset like this,
>>> getting entire layers at a time? The data have a slew of different
>>> variables, some of which also have time steps and different depth
>>> horizons.
>>>
>>> So far, I’ve been able to connect to the data using the ncdf4 package in
>>> R,
>>> and load simple, 3 dimensional data (e.g., depth). I can also force
>>> layers
>>> to load using the raster package, adding the argument
>>> ‘stopIfNotEqualSpaced=FALSE', though that obviously has issues with the
>>> irregular grid (though if it worked, would make it easier to pull
>>> information by date and depth horizons). I'm having a tough time getting
>>> data via the ncdf4 package for specific dates/depth horizons though.
>>> [note:
>>> I end up having to use R in a linux environment for this, as I understand
>>> the windows implementation for is more limited, and throws an ‘Unknown
>>> file
>>> format’ error.]
>>>
>>> The data are here:
>>> http://colossus.dl.stevens-tech.edu/thredds/dodsC/LISS/Hindc
>>> ast/all_nb_mon_81.nc,
>>> and here’s the web-interface for the data:
>>> http://colossus.dl.stevens-tech.edu/thredds/dodsC/LISS/Hindc
>>> ast/all_nb_mon_81.nc.html.
>>> I've included some sample R code below - and am totally open to
>>> alternative
>>> ways that might be better for working with these data [still new-ish to
>>> netCDFs, and have mainly dealt with much simpler, soil or climate
>>> datasets].
>>>
>>> Thanks!
>>> Mike
>>>
>>>
>>> Sample R Code:
>>> #Load and work w/ data in ncdf4
>>> library(ncdf4)
>>>
>>> nyhops.nc <- nc_open("
>>> http://colossus.dl.stevens-tech.edu/thredds/dodsC/LISS/Hindc
>>> ast/all_nb_mon_81.nc
>>> ")
>>>
>>> #Get Point Data with Depth Measurements
>>> depth.nc <- ncvar_get(nyhops.nc, 'depth')
>>> lat.nc <- ncvar_get(nyhops.nc, 'lat')
>>> long.nc <- ncvar_get(nyhops.nc, 'lon')
>>>
>>> depth.df <- data.frame(lat=as.vector(lat.nc), long=as.vector(long.nc),
>>> depth=as.vector(depth.nc)) #Can then be exported, or converted to
>>> spatial
>>> object
>>>
>>> #Trying to get the entirety of a single time point and depth horizon, but
>>> not quite getting the start/count args - this gets a single data point it
>>> seems
>>> temp.nc <- ncvar_get(nyhops.nc, 'temp', start=c(10,10,1,1),
>>> count=c(1,1,1,1))
>>>
>>>
>>>
>>> #Load layers w/ raster package.
>>> library(raster)
>>> nyhops.url <- "
>>> http://colossus.dl.stevens-tech.edu/thredds/dodsC/LISS/Hindc
>>> ast/all_nb_mon_81.nc
>>> "
>>> depth <- raster(nyhops.url, varname="depth", lvar=1, level=1,
>>> stopIfNotEqualSpaced=FALSE) #Forcing it to read layer despite unequal
>>> spacing of coords
>>> plot(depth) #Plot looks distorted, as expected given irregular grid
>>> spacing. [I realize the data aren't projected at this point]
>>>
>>>         [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> R-sig-Geo mailing list
>>> R-sig-Geo at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>
>> --
>> Dr. Michael Sumner
>> Software and Database Engineer
>> Australian Antarctic Division
>> 203 Channel Highway
>> Kingston Tasmania 7050 Australia
>>
>>

	[[alternative HTML version deleted]]



More information about the R-sig-Geo mailing list