[R-sig-Geo] get data from nc file

Ben Tupper btupper at bigelow.org
Thu Dec 27 01:01:29 CET 2018


Hi,

Yikes.  I don't think there is any other way, as the attributes are buried in a single string, which is unfortunate.  You could at least write a reusable function, assuming you'll be doing this again or want to pull other attributes.  Something like this...


#' Extract one of the Global Attributes of a TRMM NetCDF as a named vector
#'
#' @param nc the ncdf4 object
#' @param name the name of the global attribute
#' @param sep the separator used to delimit fields in the attribute
#' @return named character vector of attributes
nc_att_split <- function(nc, name = "FileHeader", sep = ";\n"){
	
	a1 <- ncdf4::ncatt_get(nc, 0)[[name[1]]]
	if (is.null(a1)) return(a1)
	
	a2 <- strsplit(a1, sep, fixed = TRUE)[[1]]
	aa <- strsplit(a2, "=", fixed = TRUE)
	
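	# value is the text after "=", or "" when the field has no value
	x <- vapply(aa,
		function(s) if (length(s) <= 1) "" else s[2],
		character(1))
	# name is the text before "=", or "unknown" when nothing precedes "="
	names(x) <- vapply(aa,
		function(s) if (length(s) < 1 || !nzchar(s[1])) "unknown" else s[1],
		character(1))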
	
	x
}


nc <- ncdf4::nc_open("3B43.20080101.7A.HDF.nc")
x <- nc_att_split(nc)
as.Date(x[['StartGranuleDateTime']], format = "%Y-%m-%dT%H:%M:%OSZ")
[1] "2008-01-01"
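
If you want to try the idea without the file at hand, here is a minimal sketch that works directly on the FileHeader string quoted in the original message (shortened to its first eight fields).  The helper name parse_header is hypothetical, not part of ncdf4, and it splits each record on the first "=" only, on the assumption that a value might itself contain "=".  It also uses as.POSIXct rather than as.Date, assuming you want to keep the time of day.

```r
# FileHeader text copied from the original post (first eight fields only)
hdr <- paste0(
  "AlgorithmID=3B43;\nAlgorithmVersion=3B43_7.0;\n",
  "FileName=3B43.20080101.7A.HDF;\n",
  "GenerationDateTime=2012-11-29T19:12:01.000Z;\n",
  "StartGranuleDateTime=2008-01-01T00:00:00.000Z;\n",
  "StopGranuleDateTime=2008-01-31T23:59:59.999Z;\n",
  "GranuleNumber=;\nNumberOfSwaths=0;\n"
)

# Hypothetical helper: turn "key=value" records into a named character vector
parse_header <- function(txt, sep = ";\n") {
  recs <- strsplit(txt, sep, fixed = TRUE)[[1]]
  recs <- recs[nzchar(recs)]              # drop empty records
  keys <- sub("=.*$", "", recs)           # text before the first "="
  vals <- sub("^[^=]*=", "", recs)        # text after the first "="
  vals[!grepl("=", recs, fixed = TRUE)] <- ""  # records without "=" get ""
  stats::setNames(vals, keys)
}

x <- parse_header(hdr)

# Keep the time of day by parsing to POSIXct in UTC instead of Date
start <- as.POSIXct(x[["StartGranuleDateTime"]],
                    format = "%Y-%m-%dT%H:%M:%OSZ", tz = "UTC")
```

The same helper should work on the other attributes (GridHeader, FileInfo) since they use the same "key=value;" layout, but I have only checked it against the FileHeader text above.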


Cheers,
Ben

> On Dec 26, 2018, at 3:42 PM, Antonio Silva <aolinto.lst using gmail.com> wrote:
> 
> Dear list members
> 
> I downloaded some nc files with precipitation data from
> https://pmm.nasa.gov/data-access/downloads/trmm (Level 3 3B43:
> Multisatellite Precipitation). For the image link see the global attribute
> "history" (below).
> 
> With ncdf4::nc_open I could open the file (nc.data <-
> nc_open("3B43.20080101.7A.HDF.nc")).
> 
> I want to extract the "StartGranuleDateTime" but it is inside the global
> attribute FileHeader (see below).
> 
> With ncatt_get(nc.data,0,"FileHeader")$value I got
> [1]
> "AlgorithmID=3B43;\nAlgorithmVersion=3B43_7.0;\nFileName=3B43.20080101.7A.HDF;\nGenerationDateTime=2012-11-29T19:12:01.000Z;\nStartGranuleDateTime=2008-01-01T00:00:00.000Z;\nStopGranuleDateTime=2008-01-31T23:59:59.999Z;\nGranuleNumber=;\nNumberOfSwaths=0;\nNumberOfGrids=1;\nGranuleStart=;\nTimeInterval=MONTH;\nProcessingSystem=PPS;\nProductVersion=7A;\nMissingData=;\n"
> 
> Is there any way to extract only the string "2008-01-01T00:00:00.000Z"?
> 
> The best I could do was
> as.Date(substr(strsplit(ncatt_get(nc.data,0,"FileHeader")$value,";\n")[[1]][5],22,45),"%Y-%m-%dT%H:%M:%OSZ")
> 
> but I suppose there must be a more direct way of getting the
> data. I would appreciate any suggestions.
> 
> Best regards,
> 
> Antonio Olinto
> Fisheries Institute
> Brazil
> 
> nc.data
> File 3B43.20080101.7A.HDF.nc (NC_FORMAT_CLASSIC):
> 
>     1 variables (excluding dimension variables):
>        float precipitation[nlat,nlon]
>            units: mm/hr
>            coordinates: nlon nlat
>            _FillValue: -9999.900390625
> 
>     2 dimensions:
>        nlon  Size:33
>            long_name: longitude
>            standard_name: longitude
>            units: degrees_east
>        nlat  Size:41
>            long_name: latitude
>            standard_name: latitude
>            units: degrees_north
> 
>    5 global attributes:
>        Grid.GridHeader: BinMethod=ARITHMETIC_MEAN;
> Registration=CENTER;
> LatitudeResolution=0.25;
> LongitudeResolution=0.25;
> NorthBoundingCoordinate=50;
> SouthBoundingCoordinate=-50;
> EastBoundingCoordinate=180;
> WestBoundingCoordinate=-180;
> Origin=SOUTHWEST;
> 
>        FileHeader: AlgorithmID=3B43;
> AlgorithmVersion=3B43_7.0;
> FileName=3B43.20080101.7A.HDF;
> GenerationDateTime=2012-11-29T19:12:01.000Z;
> StartGranuleDateTime=2008-01-01T00:00:00.000Z;
> StopGranuleDateTime=2008-01-31T23:59:59.999Z;
> GranuleNumber=;
> NumberOfSwaths=0;
> NumberOfGrids=1;
> GranuleStart=;
> TimeInterval=MONTH;
> ProcessingSystem=PPS;
> ProductVersion=7A;
> MissingData=;
> 
>        FileInfo: DataFormatVersion=m;
> TKCodeBuildVersion=1;
> MetadataVersion=m;
> FormatPackage=HDF Version 4.2 Release 4, January 25, 2009;
> BlueprintFilename=TRMM.V7.3B43.blueprint.xml;
> BlueprintVersion=BV_13;
> TKIOVersion=1.6;
> MetadataStyle=PVL;
> EndianType=LITTLE_ENDIAN;
> 
>        GridHeader: BinMethod=ARITHMETIC_MEAN;
> Registration=CENTER;
> LatitudeResolution=0.25;
> LongitudeResolution=0.25;
> NorthBoundingCoordinate=50;
> SouthBoundingCoordinate=-50;
> EastBoundingCoordinate=180;
> WestBoundingCoordinate=-180;
> Origin=SOUTHWEST;
> 
>        history: 2018-12-26 17:57:56 GMT Hyrax-1.13.4
> https://disc2.gesdisc.eosdis.nasa.gov:443/opendap/TRMM_L3/TRMM_3B43.7/2008/3B43.20080101.7A.HDF.nc?precipitation[604:636][3:43],nlat[3:43],nlon[604:636]
> 

Ben Tupper
Bigelow Laboratory for Ocean Sciences
60 Bigelow Drive, P.O. Box 380
East Boothbay, Maine 04544
http://www.bigelow.org

Ecological Forecasting: https://eco.bigelow.org/


