[R] ncdf4: Why are NAs converted to _FillValue when saving?

raphael.felber at agroscope.admin.ch raphael.felber at agroscope.admin.ch
Tue Aug 15 09:53:20 CEST 2017


Dear Dave

Thanks a lot for your answer. I agree that it is more an R issue than a package issue, but this is the first time I have encountered such a problem.

For my R version (v3.4.1) on x86_64-w64-mingw32, the second part of your answer holds only for data_temp2: if I do any manipulation to data_temp2 before calling ncvar_put(…, data_temp), the NAs in data_temp2 remain. This does not hold for data_temp, however: after calling ncvar_put(…, data_temp), the NAs in data_temp are converted to the _FillValue (-999.99), even if data_temp was manipulated beforehand. For clarification I have added two examples below.

Regards

Raphael


Examples:

> # *************************************
> # without data manipulation
> # *************************************
>
> # copy data
> data_temp2 <- data_temp
>
> # show what we have
> data_temp[1:5, 1:5, 1]
           [,1]       [,2]      [,3]       [,4]       [,5]
[1,]         NA         NA        NA 0.03887696 0.04786269
[2,]         NA         NA        NA 0.07736548 0.09524715
[3,]         NA         NA        NA 0.11508099 0.14167993
[4,]         NA         NA        NA 0.15164665 0.18669710
[5,] 0.04786269 0.09524715 0.1416799 0.18669710 0.22984885
>
> data_temp2[1:5, 1:5, 1]
           [,1]       [,2]      [,3]       [,4]       [,5]
[1,]         NA         NA        NA 0.03887696 0.04786269
[2,]         NA         NA        NA 0.07736548 0.09524715
[3,]         NA         NA        NA 0.11508099 0.14167993
[4,]         NA         NA        NA 0.15164665 0.18669710
[5,] 0.04786269 0.09524715 0.1416799 0.18669710 0.22984885
>
> # write to netCDF connection
> ncvar_put( ncid_new, var_temp, data_temp )
>
> # show what we have
> data_temp[1:5, 1:5, 1]
              [,1]          [,2]         [,3]       [,4]       [,5]
[1,] -999.99000000 -999.99000000 -999.9900000 0.03887696 0.04786269
[2,] -999.99000000 -999.99000000 -999.9900000 0.07736548 0.09524715
[3,] -999.99000000 -999.99000000 -999.9900000 0.11508099 0.14167993
[4,] -999.99000000 -999.99000000 -999.9900000 0.15164665 0.18669710
[5,]    0.04786269    0.09524715    0.1416799 0.18669710 0.22984885
>
> data_temp2[1:5, 1:5, 1]
              [,1]          [,2]         [,3]       [,4]       [,5]
[1,] -999.99000000 -999.99000000 -999.9900000 0.03887696 0.04786269
[2,] -999.99000000 -999.99000000 -999.9900000 0.07736548 0.09524715
[3,] -999.99000000 -999.99000000 -999.9900000 0.11508099 0.14167993
[4,] -999.99000000 -999.99000000 -999.9900000 0.15164665 0.18669710
[5,]    0.04786269    0.09524715    0.1416799 0.18669710 0.22984885




> # *************************************
> # with data manipulation
> # *************************************
>
> # show what we have
> data_temp[1:5, 1:5, 1]
           [,1]       [,2]      [,3]       [,4]       [,5]
[1,]         NA         NA        NA 0.03887696 0.04786269
[2,]         NA         NA        NA 0.07736548 0.09524715
[3,]         NA         NA        NA 0.11508099 0.14167993
[4,]         NA         NA        NA 0.15164665 0.18669710
[5,] 0.04786269 0.09524715 0.1416799 0.18669710 0.22984885
>
> data_temp2[1:5, 1:5, 1]
           [,1]       [,2]      [,3]       [,4]       [,5]
[1,]         NA         NA        NA 0.03887696 0.04786269
[2,]         NA         NA        NA 0.07736548 0.09524715
[3,]         NA         NA        NA 0.11508099 0.14167993
[4,]         NA         NA        NA 0.15164665 0.18669710
[5,] 0.04786269 0.09524715 0.1416799 0.18669710 0.22984885
>
> # do some manipulations
> data_temp <- data_temp * 1.0
> data_temp2 <- data_temp2 * 1.0
>
> # write to netCDF connection
> ncvar_put( ncid_new, var_temp, data_temp )
>
> # show what we have
> data_temp[1:5, 1:5, 1]
              [,1]          [,2]         [,3]       [,4]       [,5]
[1,] -999.99000000 -999.99000000 -999.9900000 0.03887696 0.04786269
[2,] -999.99000000 -999.99000000 -999.9900000 0.07736548 0.09524715
[3,] -999.99000000 -999.99000000 -999.9900000 0.11508099 0.14167993
[4,] -999.99000000 -999.99000000 -999.9900000 0.15164665 0.18669710
[5,]    0.04786269    0.09524715    0.1416799 0.18669710 0.22984885
>
> data_temp2[1:5, 1:5, 1]
           [,1]       [,2]      [,3]       [,4]       [,5]
[1,]         NA         NA        NA 0.03887696 0.04786269
[2,]         NA         NA        NA 0.07736548 0.09524715
[3,]         NA         NA        NA 0.11508099 0.14167993
[4,]         NA         NA        NA 0.15164665 0.18669710
[5,] 0.04786269 0.09524715 0.1416799 0.18669710 0.22984885
>
>
> # *************************************
> # RESULT
> # with manipulation of data_temp2 the variable is copied and NAs remain NAs
> # but manipulation of data_temp doesn't help
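
One way to get the NAs back after the write is sketched below. It assumes that the fill value is exactly -999.99, as in the output above, and that no real data point equals it. The matrix data_after is a hypothetical stand-in for a slice of data_temp after ncvar_put(); the snippet uses base R only and does not call ncdf4.

```r
# Assumption: -999.99 is the _FillValue, as seen in the output above.
fill_value <- -999.99

# Hypothetical stand-in for a slice of data_temp after ncvar_put()
# has overwritten its NAs with the fill value.
data_after <- matrix(c(fill_value, fill_value, 0.04786269,
                       fill_value, fill_value, 0.09524715), nrow = 3)

# Work on a separate name and turn every fill value back into NA.
restored <- data_after
restored[restored == fill_value] <- NA

print(restored)
```

Alternatively, as the second example above shows, a copy made with arithmetic before the write (data_temp2 <- data_temp2 * 1.0) kept its NAs in this session.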

From: davidwilliampierce at gmail.com On behalf of David W. Pierce
Sent: Monday, 14 August 2017 17:29
To: Felber Raphael Agroscope <raphael.felber at agroscope.admin.ch>
Cc: r-help at r-project.org
Subject: Re: [R] ncdf4: Why are NAs converted to _FillValue when saving?

On Mon, Aug 14, 2017 at 5:29 AM, <raphael.felber at agroscope.admin.ch> wrote:

Dear all

I'm a newbie regarding netCDF data. Today I realized that I may not understand some netCDF basics. I want to create a *.nc file containing three variables for Switzerland. All data outside the country are NAs. The third variable is calculated from the first two. Basically there is no problem doing that: I copy the file with the data of the first variable, open this file with 'write=TRUE' (nc1 <- nc_open()), read the data into 'var1', open the other file (nc2 <- nc_open()), read the data into 'var2', put this variable into the file (nc1), and calculate the third variable from var1 and var2.

So far everything is fine. But I found that when I write the data 'var2' to nc1, all NAs in this variable are converted to the _FillValue. Clearly, I expect all NAs to be converted to the _FillValue in the file, but I do not expect the NAs in 'var2' itself (i.e. the data available in the R console) to change as well. Since I use these data for further calculations, the NAs should remain.

Is that a bug or intended? Below you will find a minimal example (adapted from the code in the ncdf4 manual) of this, in my eyes, strange behavior.

Hi Raphael,

I'm going to claim that this is more of an R question than an ncdf4 question per se. For example, you will notice that if you multiply data_temp2 by 1.0 (leaving the values unchanged) or add zero to data_temp2, then the behavior is what you are expecting. The same holds if you multiply data_temp by 1.0 or add zero to it. It would seem that R does the equivalent of assigning another pointer to the data stored in data_temp rather than copying data_temp until either data_temp or data_temp2 is operated upon, at which point a copy is made. I personally did not realize this was part of R's magic.
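
The delayed copying described above can be observed at the R level with a small base-R sketch. The vectors x and y are made up for illustration; tracemem() only works on R builds with memory profiling enabled, which capabilities("profmem") reports, so the tracing calls are guarded. This shows ordinary R assignment only; it makes no claim about what ncvar_put() does internally.

```r
x <- c(NA, 0.5, 1.5)

# Ask R to report whenever this vector is duplicated (if supported).
if (capabilities("profmem")) invisible(tracemem(x))

y <- x          # no data are copied here; both names refer to one vector
y[1] <- 0       # first modification: R duplicates the vector, then changes y
                # (with tracing on, a "tracemem[...]" line is printed here)

if (capabilities("profmem")) untracemem(x)

print(x)        # x still starts with NA
print(y)        # only y has changed
```

Code that modifies a vector from R itself therefore never changes the other name. A change that shows up under both names, as in the example from the thread, suggests the data were altered in place below the R level before any copy had been made.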

Regards,

--Dave




Minimal working example (adapted from the ncdf4 manual):

library(ncdf4)
#----------------
# Make dimensions
#----------------
xvals <- 1:360
yvals <- -90:90
nx <- length(xvals)
ny <- length(yvals)
xdim <- ncdim_def('Lon','degreesE', xvals )
ydim <- ncdim_def('Lat','degreesN', yvals )
tdim <- ncdim_def('Time','days since 1900-01-01', 0, unlim=TRUE )
#---------
# Make var
#---------
mv <- 1.e30     # missing value
var_temp <- ncvar_def('Temperature','K', list(xdim,ydim,tdim), mv )
#---------------------
# Make new output file
#---------------------
output_fname <- 'test_real3d.nc'
ncid_new <- nc_create( output_fname, list(var_temp))
#-------------------------------
# Put some test data in the file
#-------------------------------
data_temp <- array(0.,dim=c(nx,ny,1))
for( j in 1:ny )
  for( i in 1:nx )
    data_temp[i,j,1] <- sin(i/10)*sin(j/10)

# add some NAs
data_temp[1:10, 1:5, 1] <- NA

# copy data
data_temp2 <- data_temp

# show what we have
data_temp[1:12, 1:7, 1]
data_temp2[1:12, 1:7, 1]

# write to netCDF connection
ncvar_put( ncid_new, var_temp, data_temp, start=c(1,1,1), count=c(nx,ny,1))

# show what we have now
data_temp[1:12, 1:7, 1]
data_temp2[1:12, 1:7, 1]

# Why are there no more NAs in data_temp? -> ncvar_put changed the NAs to the _FillValue
# But why are the NAs in data_temp2 also changed to _FillValue?
#--------------------------
# Close
#--------------------------
nc_close( ncid_new )

------------------------------------------------------------------------------------
Raphael Felber, Dr. sc.
Research Associate, Climate and Air Pollution

Eidgenössisches Departement für
Wirtschaft, Bildung und Forschung WBF
Agroscope
Forschungsbereich Agrarökologie und Umwelt

Reckenholzstrasse 191, 8046 Zürich
Tel. 058 468 75 11
Fax 058 468 72 01
raphael.felber at agroscope.admin.ch
www.agroscope.ch





--
David W. Pierce
Division of Climate, Atmospheric Science, and Physical Oceanography
Scripps Institution of Oceanography, La Jolla, California, USA
(858) 534-8276 (voice)  /  (858) 534-8561 (fax)    dpierce at ucsd.edu
