[R-sig-Geo] Error in netCDF file: cells are not equally spaced

Roy Mendelssohn - NOAA Federal roy@mende|@@ohn @end|ng |rom no@@@gov
Thu Aug 22 23:52:18 CEST 2019


Hi  Frederico:

There are two separate issues here from my point of view.  The first is are both files CF compliant,  and as far as I can tell they both are, and applications like Panoply or ncdf4 that understand CF compliant files read them just fine.

The second issue is why can't raster or stars or gdal read them.  That is internal to those codes,  and people who know those packages better can probably answer that. And while I may appear to be beating a dead horse,  for better or worse CF compliant netcdf-3 or netcdf-4 files (and this includes files that follow the Discrete Sampling Geometry guidelines) are what all of the major satellite data centers,  the major climate and oceanographic and meteorological data centers, and most modeling efforts  (such as ROMS and IPCC) store their output.   These are standards in these communities,  and it would really benefit the R community if packages aimed at these types of data make certain they can read compliant datasets (of course this is easy for me to say because I won't be doing the work!).  Also the Python programs I have also read these files just fine, but that is probably because the Python codes have been developed more internal to the particular community around netcdf files and CF conventions.

I have found I get better results if I first read the file using ncdf4,  and then if I want to use the features of raster (or some other related program), I use one of the raster commands  to convert arrays to raster objects.

HTH,

-Roy



> On Aug 22, 2019, at 2:32 PM, Frederico Faleiro <fvfaleiro using gmail.com> wrote:
> 
> Dear list,
> 
> I do not understand why only this model have this issue. I can open many
> other files from other models without any issue in raster package.
> 
> *Roy*, I test with different model (i.e GFDL-CM3) from NOAA (
> https://drive.google.com/file/d/1GrfvbRSH_FpDmlRrqNGXqn6Tz13fjhB6/view?usp=sharing)
> without any problem as you can check in the code below. I find the same
> issue pointed by *Edzer* in the GFDL-ESM2G, but not in the GFDL-CM3.
> 
> Maybe the others can test with this file to figure out what is the problem
> with first one.
> 
> If the problem is only the difference in the interval of the cells, Is
> there any problem to use the raster in this case?
> The ncdf4 package can read it better, but I can not do other simple
> analysis like cell average, for example. Is it possible read it and make
> some kind of transformation to better use it in raster package?
> 
> # example
> library(ncdf4)
> library(raster)
> # open with ncdf4
> x.n <- nc_open('pr_Amon_GFDL-ESM2G_rcp45_r1i1p1_204101-204512.nc')
> z.n <- nc_open('pr_Amon_GFDL-CM3_rcp45_r1i1p1_204101-204512.nc')
> # open with raster
> x.r <- brick('pr_Amon_GFDL-ESM2G_rcp45_r1i1p1_204101-204512.nc',
> stopIfNotEqualSpaced = F)
> z.r <- brick('pr_Amon_GFDL-CM3_rcp45_r1i1p1_204101-204512.nc')
> 
> # get coordenates
> lat.x <- ncvar_get(x.n,"lat")
> lon.x <- ncvar_get(x.n,"lon")
> coords.xn <- as.matrix(expand.grid(lon.x,lat.x))
> coords.xr <- coordinates(x.r) # different order of the xy in netcdf4
> lat.z <- ncvar_get(z.n,"lat")
> lon.z <- ncvar_get(z.n,"lon")
> coords.zn <- as.matrix(expand.grid(lon.z,lat.z))
> coords.zr <- coordinates(z.r) # different order of the zz in netcdf4
> 
> # check the differences between the coordenates
> diffLatLon <- function(x) {
>  n <- length(x)-1
>  v <- rep(NA, n)
>  for(i in 1:n) {
>    v[i] <- x[i] - x[i+1]
>  }
> return(v)
> }
> 
> diffLatLon(lat.x)  # in this file only the first and last intervals differ
> from the rest.
> diffLatLon(lon.x) # the same interval in all longitudes
> diffLatLon(lat.z)  # the same interval in all latitudes
> diffLatLon(lon.z) # the same interval in all longitudes
> 
> # plot the average of the layers in raster
> plot(mean(x.r))
> plot(mean(z.r))
> 
> Best regards,
> 
> Frederico
> 
> On Thu, Aug 22, 2019 at 1:29 PM Edzer Pebesma <edzer.pebesma using uni-muenster.de>
> wrote:
> 
>> Thanks!
>> 
>> stars::read_stars (GDAL) won't read coordinates or coordinate reference
>> system from this file, confirmed by running gdalinfo on it.
>> 
>> stars::read_ncdf (installing stars from github) reads it like this:
>> 
>> library(stars)
>> # Loading required package: abind
>> # Loading required package: sf
>> # Linking to GEOS 3.7.1, GDAL 2.4.2, PROJ 5.2.0
>> (r0 = read_ncdf("pr_Amon_GFDL-ESM2G_rcp45_r1i1p1_204101-204512.nc"))
>> # no 'var' specified, using pr
>> # other available variables:
>> #  average_DT, average_T1, average_T2, lat, lon, bnds, time, time_bnds,
>> lat_bnds, lon_bnds
>> # No projection information found in nc file.
>> #  Coordinate variable units found to be degrees,
>> #  assuming WGS84 Lat/Lon.
>> # stars object with 3 dimensions and 1 attribute
>> # attribute(s):
>> #  pr [kg/m^2/s]
>> #  Min.   :0.000e+00
>> #  1st Qu.:6.035e-06
>> #  Median :1.700e-05
>> #  Mean   :2.799e-05
>> #  3rd Qu.:3.575e-05
>> #  Max.   :1.068e-03
>> # dimension(s):
>> #      from  to offset delta                       refsys point
>> # lon     1 144     NA    NA +proj=longlat +datum=WGS8...    NA
>> # lat     1  90     NA    NA +proj=longlat +datum=WGS8... FALSE
>> # time    1  60     NA    NA                        PCICt FALSE
>> #                                                   values
>> # lon                              [0,2.5),...,[357.5,360) [x]
>> # lat                    [-90,-88.98876),...,[88.98876,90) [y]
>> # time [2041-01-01,2041-02-01),...,[2045-12-01,2046-01-01)
>> st_get_dimension_values(r0, "lat") %>% diff %>% table
>> # .
>> # 1.51685393258427 2.02247191011234 2.02247191011236 2.02247191011237
>> #                2                1               74               12
>> 
>> where essentially only the first and last intervals differ from the rest.
>> 
>> So, yes, latitudes are not regular. Also, time is PCICt, and uses a 365
>> day calendar without leap years:
>> 
>> st_get_dimension_values(r0, "time") %>% diff %>% table
>> # .
>> # 28 30 31
>> #  5 20 34
>> 
>> 
>> 
>> 
>> 
>> On 8/22/19 8:55 PM, Frederico Faleiro wrote:
>>> Dear list,
>>> sorry about the file. The files from the CMIP5 are public, but need a
>>> registration first.
>>> You can access the same file in any of these links:
>>> https://www.sendspace.com/file/famgz3 and
>>> 
>> https://drive.google.com/file/d/1L1T03iQ6LU1yYRYeRypxwMB3E7fLjkJI/view?usp=sharing
>>> .
>>> 
>>> In this meantime I will try the Mike' advices.
>>> 
>>> Best regards,
>>> 
>>> Frederico <https://www.sendspace.com/file/famgz3>
>>> 
>>> On Thu, Aug 22, 2019 at 3:41 PM Frederico Faleiro <fvfaleiro using gmail.com>
>>> wrote:
>>> 
>>>> Dear list,
>>>> 
>>>> sorry about the file. The files from the CMIP5 are public, but need a
>>>> registration first. I am sending the same file attached.
>>>> 
>>>> In this meantime I will try the Mike' advices.
>>>> 
>>>> Best regards,
>>>> 
>>>> Frederico
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> On Thu, Aug 22, 2019 at 5:03 AM Michael Sumner <mdsumner using gmail.com>
>> wrote:
>>>> 
>>>>> raster does the wrong thing here in my opinion, the right outcome is to
>>>>> give a grid in index-extent (0, ncol, 0, nrow) and with no projection
>>>>> metadata (crs). There will be coordinate arrays in this file, and they
>> need
>>>>> to be handled as though they are data (with local, x*y dependent
>> values in
>>>>> every cell).
>>>>> 
>>>>> With brick() you can proceed to process the data with no problem (using
>>>>> stopIfNotEqualSpaced = FALSE), but you can't rely on the extent being
>>>>> relevant, or any function that works in geographic space to do the
>> right
>>>>> thing. To map a layer of these data you might try
>>>>> 
>>>>> grd <- raster(yourfile, stopIfNotEqualSpaced = FALSE)
>>>>> coords <- brick(raster(yourfile, varname = "lon"),
>>>>>                         raster(yourfile, varname = "lat"))   ## but
>> only
>>>>> you will know the names of these variables values "lon", "lat" (might
>> be
>>>>> "TLON", "TLAT" ? )
>>>>> 
>>>>> In quadmesh there is a way to plot in geographic space that's not
>>>>> impossibly slow (possibly the coords will need a re-orientation
>> transpose
>>>>> ...):
>>>>> 
>>>>> quadmesh::mesh_plot(grd, coords = coords)
>>>>> 
>>>>> It's likely that will "map" it properly, but CMIP is likely to wrap
>>>>> around an odd longitude (or something), and so cropping to your area
>> and/or
>>>>> choosing a good project for the crs argument to mesh_plot is probably
>>>>> needed.  You can investigate with plot(coords[[1]]) and
>> plot(coords[[2]])
>>>>> to get a sense of their layout.
>>>>> 
>>>>> In the stars package this is all internalized, with
>>>>> stars::read_stars(yourfile) catching all the information required as
>> much
>>>>> as possible, and with plotting methods - but variable choice and actual
>>>>> results vary widely, and depend a lot on very specific details.
>> (angstroms
>>>>> package has similar helpers for the approach but is unlikely to help
>> here
>>>>> without access to the file to try)
>>>>> 
>>>>> To help us guess further you can use the "ncdump" output, raster will
>>>>> print this with
>>>>> 
>>>>> print(raster("yourfile"))
>>>>> 
>>>>> and from that we could provide better guesses at variable names and
>>>>> subsetting etc.
>>>>> 
>>>>> But, as Edzer asked, nothing is as good as having the file - any chance
>>>>> you can share it?  (Personally I think the world rather needs more R
>>>>> attention on climate model output.)
>>>>> 
>>>>> Cheers, Mike.
>>>>> 
>>>>> On Thu, Aug 22, 2019 at 1:53 PM Frederico Faleiro <fvfaleiro using gmail.com
>>> 
>>>>> wrote:
>>>>>> 
>>>>>> Dear list,
>>>>>> 
>>>>>> I am using some  netCDF files from the CMIP5 climate models in raster
>>>>>> package, but I am having some issues with one model.
>>>>>> The netCDF file from the GFDL-ESM2G model (e.g.
>>>>>> 
>>>>> 
>> http://aims3.llnl.gov/thredds/fileServer/css03_data/cmip5/output1/NOAA-GFDL/GFDL-ESM2G/rcp45/mon/atmos/Amon/r1i1p1/v20120412/pr/pr_Amon_GFDL-ESM2G_rcp45_r1i1p1_204101-204512.nc
>>>>>> )
>>>>>> have the error message as in bellow example.
>>>>>> 
>>>>>> #Example
>>>>>> library(raster)
>>>>>> s1 <- stack("pr_Amon_GFDL-ESM2G_rcp45_r1i1p1_204101-204512.nc")
>>>>>> Error in .rasterObjectFromCDF(x, type = objecttype, band = band, ...)
>> :
>>>>>>  cells are not equally spaced; you should extract values as points
>>>>>> 
>>>>>> # I check some solutions that recomend force read the file with the
>>>>>> argument "stopIfNotEqualSpaced = F" as bellow.
>>>>>> sf <- stack("pr_Amon_GFDL-ESM2G_rcp45_r1i1p1_204101-204512.nc",
>>>>>> *stopIfNotEqualSpaced
>>>>>> = F*)
>>>>>> bf <- brick("pr_Amon_GFDL-ESM2G_rcp45_r1i1p1_204101-204512.nc",
>>>>>> *stopIfNotEqualSpaced
>>>>>> = F*)
>>>>>> 
>>>>>> I performed some tests and only "works" with brick. However I did not
>>>>> find
>>>>>> any solution to check where is the problem and fix it in the file.
>>>>>> 
>>>>>> How can I check if it is really an issue and fix it?
>>>>>> 
>>>>>> sessionInfo()
>>>>>> R version 3.5.1 (2018-07-02)
>>>>>> Platform: x86_64-w64-mingw32/x64 (64-bit)
>>>>>> Running under: Windows 10 x64 (build 18362)
>>>>>> 
>>>>>> Matrix products: default
>>>>>> 
>>>>>> locale:
>>>>>> [1] LC_COLLATE=Portuguese_Brazil.1252  LC_CTYPE=Portuguese_Brazil.1252
>>>>>> [3] LC_MONETARY=Portuguese_Brazil.1252 LC_NUMERIC=C
>>>>>> [5] LC_TIME=Portuguese_Brazil.1252
>>>>>> 
>>>>>> attached base packages:
>>>>>> [1] stats     graphics  grDevices utils     datasets  methods   base
>>>>>> 
>>>>>> other attached packages:
>>>>>> [1] raster_2.8-19 sp_1.3-1
>>>>>> 
>>>>>> loaded via a namespace (and not attached):
>>>>>> [1] compiler_3.5.1   rgdal_1.4-3      Rcpp_1.0.1
>> codetools_0.2-15
>>>>>> ncdf4_1.16.1
>>>>>> [6] grid_3.5.1       lattice_0.20-35
>>>>>> 
>>>>>> Best regards,
>>>>>> 
>>>>>> Frederico
>>>>>> 
>>>>>>        [[alternative HTML version deleted]]
>>>>>> 
>>>>>> _______________________________________________
>>>>>> R-sig-Geo mailing list
>>>>>> R-sig-Geo using r-project.org
>>>>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>>> 
>>>>> 
>>>>> 
>>>>> --
>>>>> Michael Sumner
>>>>> Software and Database Engineer
>>>>> Australian Antarctic Division
>>>>> Hobart, Australia
>>>>> e-mail: mdsumner using gmail.com
>>>>> 
>>>> 
>>> 
>>>      [[alternative HTML version deleted]]
>>> 
>>> _______________________________________________
>>> R-sig-Geo mailing list
>>> R-sig-Geo using r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>> 
>> 
>> --
>> Edzer Pebesma
>> Institute for Geoinformatics
>> Heisenbergstrasse 2, 48151 Muenster, Germany
>> Phone: +49 251 8333081
>> _______________________________________________
>> R-sig-Geo mailing list
>> R-sig-Geo using r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>> 
> 
> 	[[alternative HTML version deleted]]
> 
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo using r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo

**********************
"The contents of this message do not reflect any position of the U.S. Government or NOAA."
**********************
Roy Mendelssohn
Supervisory Operations Research Analyst
NOAA/NMFS
Environmental Research Division
Southwest Fisheries Science Center
***Note new street address***
110 McAllister Way
Santa Cruz, CA 95060
Phone: (831)-420-3666
Fax: (831) 420-3980
e-mail: Roy.Mendelssohn using noaa.gov www: https://www.pfeg.noaa.gov/

"Old age and treachery will overcome youth and skill."
"From those who have been given much, much will be expected" 
"the arc of the moral universe is long, but it bends toward justice" -MLK Jr.



More information about the R-sig-Geo mailing list