[R] netcdf data precision or least significant digit
Ismail SEZEN
sezenismail at gmail.com
Fri Jul 8 03:02:35 CEST 2016
Thank you, Roy. If I use "round(uwind, digits = 2)", all values are rounded to 2 decimal places, and that is fine. But how do you know that the numbers should be rounded to 2 decimal places? According to the definitions of precision and least_significant_digit, should I round to 2 decimal places or to 1?
For instance, the header of the omega.2015.nc file says:
$ ncdump -h omega.2015.nc
...
omega:precision = 3s;
omega:least_significant_digit = 3s;
...
So, do I need to round the values to 3 decimal places?
And the header of rhum.2015.nc says:
$ ncdump -h rhum.2015.nc
...
rhum:precision = 2s ;
rhum:least_significant_digit = 0s ;
...
Then do I need to round the values to 2 decimal places?
Should I round according to the precision attribute or the least_significant_digit attribute? I assume someone put these attributes in the netcdf files for a reason. I also believe that, if rounding is required, it should be done by the relevant package, but the author said that it has nothing to do with the ncdf4 package.
Please, forgive me for taking your time.
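
A minimal base R sketch of the situation described in this thread. It assumes that the extra digits come from single-precision (float) values being widened to double, which is the floating point representation issue suggested in the quoted reply below. The helper as_float32 is hypothetical and is not part of ncdf4. The sketch does not decide which attribute should govern rounding; it only compares the two candidate choices on the values quoted in the thread.

```r
# Hypothetical helper (not part of ncdf4): push a double through 4-byte
# float storage and back, mimicking a NetCDF attribute written as 0.01f.
as_float32 <- function(x) {
  readBin(writeBin(x, raw(), size = 4), what = "numeric",
          n = length(x), size = 4)
}

scale_factor <- as_float32(0.01)
print(scale_factor, digits = 15)  # no longer prints as a clean 0.01

# Values quoted in this thread; the first and last should be physically equal.
uwind <- c(9.1999979, 8.7999979, 7.9999979, 3.0999980,
           6.1000018, 10.1000017, 10.4000017, 9.2000017)

# The two candidate roundings under discussion.
r1 <- round(uwind, digits = 1)
r2 <- round(uwind, digits = 2)

# Compare the two roundings, then the first and last rounded values.
print(all.equal(r1, r2))
print(all.equal(r2[1], r2[8]))

# Periodic spline fitted to the rounded values instead of the raw ones.
fit <- spline(seq_along(r2), r2, method = "periodic")
```

If the two roundings agree here, that is a property of these eight values and not a general rule, so the choice between precision and least_significant_digit remains the open question asked above.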
> On 08 Jul 2016, at 03:21, Roy Mendelssohn - NOAA Federal <roy.mendelssohn at noaa.gov> wrote:
>
> After looking at the file, doing an extract say into the variable uwind, if I do:
>
> str(uwind)
>
> I see what I expect, but if I just do:
>
> uwind
>
>
> I see what you are seeing. Try:
>
> uwindnew <- round(uwind, digits = 2)
>
>
> and see if that gives you the results you would expect.
>
> HTH,
>
> -Roy
>
>> On Jul 7, 2016, at 4:49 PM, Ismail SEZEN <sezenismail at gmail.com> wrote:
>>
>> Thank you Roy.
>>
>> I use NCEP/NCAR Reanalysis 2 data [1], more precisely the u-wind data for the year 2015 [2]. I am also fairly sure that attributes such as scale_factor or add_offset should have exact values like 0.01 or 187.65, but somehow (I hope this is not an issue caused by me) they do not, and neither does the data. Let me also note that I have already contacted the author of the ncdf4 package and sent an email to ESRL, but no luck yet.
>>
>> For vector data, the u components of the wind at the poles must be equal in magnitude at opposite longitudes. For instance, at “2015-01-01 00 GMT”, u-wind at longitude=0 and latitude=90 is 9.1999979 m/s and u-wind at longitude=180 and latitude=90 is -9.2000017 m/s. The minus sign comes from the positive north direction. Physically, their absolute values must be equal.
>>
>> 1- http://www.esrl.noaa.gov/psd/data/gridded/data.ncep.reanalysis2.html
>> 2- ftp://ftp.cdc.noaa.gov/Datasets/ncep.reanalysis2.dailyavgs/pressure/uwnd.2015.nc
>>
>>
>>
>>> On 08 Jul 2016, at 02:27, Roy Mendelssohn - NOAA Federal <roy.mendelssohn at noaa.gov> wrote:
>>>
>>> Hi Ismail:
>>>
>>> Can you point me to a particular netcdf file you are working with? I would like to play with it for a while. I am pretty certain the scale factor is 0.01 and what you are seeing is rounding error (or, more precisely, problems with the representation of floating point numbers), but I would like to see if there is a way around this.
>>>
>>> Thanks,
>>>
>>> -Roy
>>>
>>>> On Jul 7, 2016, at 4:16 PM, Ismail SEZEN <sezenismail at gmail.com> wrote:
>>>>
>>>> Thank you very much, Jeff. I think I have not managed to explain myself. Perhaps this is the wrong list for this question, but I sent it in the hope that someone here has a deep understanding of netcdf data and uses R. Let me tell the story more simply. Assume that you read a numeric vector of data from a netcdf file:
>>>>
>>>> data <- c(9.1999979, 8.7999979, 7.9999979, 3.0999980, 6.1000018, 10.1000017, 10.4000017, 9.2000017)
>>>>
>>>> You know that the values above are model output, and you also know that, physically, the first and last values must be equal, but somehow they are not.
>>>>
>>>> Now you want to fit a “periodic” spline to the values above.
>>>>
>>>> spline(1:8, data, method = "periodic")
>>>>
>>>> Voila! The spline function throws a warning message: “spline: first and last y values differ - using y[1] for both”. So I kept digging and discovered 2 attributes in the netcdf file: “precision = 2” and “least_significant_digit = 1”. I also found their definitions at [1].
>>>>
>>>> precision -- number of places to right of decimal point that are significant, based on packing used. Type is short.
>>>> least_significant_digit -- power of ten of the smallest decimal place in unpacked data that is a reliable value. Type is short.
>>>>
>>>> Please do not condemn me; English is not my main language :). At this point, as a scientist, what would you do given the definitions above? I think I did not fully understand the difference between precision and least_significant_digit. One says “significant” and the other says “reliable”. Should I round the numbers to 2 decimal places or to 1?
>>>>
>>>> Thanks,
>>>>
>>>> 1- http://www.esrl.noaa.gov/psd/data/gridded/conventions/cdc_netcdf_standard.shtml
>>>>
>>>>
>>>>> On 08 Jul 2016, at 01:29, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>>>>>
>>>>> Correction:
>>>>>
>>>>> ?options (not par)
>>>>> --
>>>>> Sent from my phone. Please excuse my brevity.
>>>>>
>>>>> On July 7, 2016 3:26:06 PM PDT, Jeff Newmiller <jdnewmil at dcn.davis.ca.us> wrote:
>>>>>> Same as with any floating point numeric computation environment... you
>>>>>> don't. There is always uncertainty in any floating point number... it
>>>>>> is just larger in this data than you might be used to.
>>>>>>
>>>>>> Once you get to the stage where you want to output values, read up on
>>>>>>
>>>>>> ?round
>>>>>> ?par (digits)
>>>>>>
>>>>>> and don't worry about the incidental display of extra digits prior to
>>>>>> presentation (output).
>>>>>> --
>>>>>> Sent from my phone. Please excuse my brevity.
>>>>>>
>>>>>> On July 7, 2016 12:50:54 AM PDT, Ismail SEZEN <sezenismail at gmail.com>
>>>>>> wrote:
>>>>>>> Hello,
>>>>>>>
>>>>>>> I use ncdf4 and ncdf4.helpers packages to get wind data from ncep/ncar
>>>>>>> reanalysis netcdf files. But the data is in the form of (9.199998,
>>>>>>> 8.799998, 7.999998, 3.099998, -6.8000018, …). I’m aware of the precision
>>>>>>> and least_significant_digit attributes of the ncdf4 object [1]. For uwnd
>>>>>>> data, precision = 2 and least_significant_digit = 1. My question is:
>>>>>>> should I round the data to 2 decimal places or to 1 decimal place?
>>>>>>>
>>>>>>> Same issue is valid for some header info.
>>>>>>>
>>>>>>> Output of ncdf4 object:
>>>>>>>
>>>>>>>
>>>>>>> Output of ncdump on terminal:
>>>>>>>
>>>>>>>
>>>>>>> For instance, ncdump's scale factor is 0.01f but the ncdf4 object’s
>>>>>>> scale_factor is 0.00999999977648258. You can see the same issue for
>>>>>>> actual_range and add_offset. A similar issue also exists for the data.
>>>>>>> How can I truncate those extra insignificant decimal places or round
>>>>>>> the numbers to the significant decimal places?
>>>>>>>
>>>>>>> 1 - http://www.esrl.noaa.gov/psd/data/gridded/conventions/cdc_netcdf_standard.shtml
>>>>>>> ______________________________________________
>>>>>>> R-help at r-project.org mailing list -- To UNSUBSCRIBE and more, see
>>>>>>> https://stat.ethz.ch/mailman/listinfo/r-help
>>>>>>> PLEASE do read the posting guide
>>>>>>> http://www.R-project.org/posting-guide.html
>>>>>>> and provide commented, minimal, self-contained, reproducible code.
>>>>>>
>>>>>
>>>>
>>>>
>>>
>>> **********************
>>> "The contents of this message do not reflect any position of the U.S. Government or NOAA."
>>> **********************
>>> Roy Mendelssohn
>>> Supervisory Operations Research Analyst
>>> NOAA/NMFS
>>> Environmental Research Division
>>> Southwest Fisheries Science Center
>>> ***Note new address and phone***
>>> 110 Shaffer Road
>>> Santa Cruz, CA 95060
>>> Phone: (831)-420-3666
>>> Fax: (831) 420-3980
>>> e-mail: Roy.Mendelssohn at noaa.gov www: http://www.pfeg.noaa.gov/
>>>
>>> "Old age and treachery will overcome youth and skill."
>>> "From those who have been given much, much will be expected"
>>> "the arc of the moral universe is long, but it bends toward justice" -MLK Jr.
>>>
>>
>