[R-sig-Geo] readGDAL() and HDF5 files

Sebastian P. Luque spluque at gmail.com
Sat Jan 24 00:05:10 CET 2009


On Sat, 24 Jan 2009 08:11:15 +1100,
Michael Sumner <mdsumner at utas.edu.au> wrote:

> Hi Sebastian, When you say the geographic coordinates are not a
> regular grid - is it that the actual grid is Mercator but the NetCDF
> file stores an X and Y vector separately for each unique longitude and
> latitude? I've seen this many times, but never with enough metadata to
> determine that without guessing. I've seen some documents that refer
> to the source grid as being in Mercator, but never any that mention
> explicitly the method used to generate the NetCDF file from those.

> If this is the case for your data, I've had success by figuring out an
> offset/scale value that work sufficiently by assuming a Mercator grid

> Specifically this one but sometimes with an extra X offset to overcome
> hemisphere shift.

> CRSargs(CRS("+proj=merc"))

> I don't have the details available today, but I can dig them up if
> that sounds promising. Also, I'd be interested to hear any details you
> have about the grids you are using, whether they use this
> Mercator-kludge or not.

Thanks for the feedback.

You're right in that the geographic coordinates are not a regular grid,
but separate data subsets in the NetCDF file, with the actual grid being
in a polar stereographic projection (the one in my first post).  The
actual coordinates of the grid are not available in any file (NetCDF nor
HDF5), and neither these nor the geographical coordinates in the HDF5
files, but that doesn't matter in my case because I need to subset a
smaller area, grid at an appropriate resolution using the geographical
coordinates, and reproject to a different projection for my area.  After
all, it's simple enough to access the geographical coordinates from any
of the NetCDF files to map the grid onto such a coordinate system.

Therefore, for working with these grids in R, I'd be happy if I could
just load the grid correctly using the arbitrary grid coordinates that
readGDAL() builds from the HDF5 file.  I haven't managed to read the
NetCDF files into R (R segfaults), so things look better with HDF5 for
working in R with these data.  The example I showed (and that Roger
reproduced) builds a grid with the wrong dimensions.  This is what the
header of the whole file (this is a new file, with the same structure as
the one I showed) says (using HDF5 tools,
http://hdf.ncsa.uiuc.edu/HDF5):

---<--------------------cut here---------------start------------------->---
$ h5dump -H ice_conc_nh_200901011200.hdf 
HDF5 "ice_conc_nh_200901011200.hdf" {
GROUP "/" {
   GROUP "Data" {
      DATASET "data[00]" {
         DATATYPE  H5T_IEEE_F32LE
         DATASPACE  SIMPLE { ( 760, 1120 ) / ( 760, 1120 ) }
         ATTRIBUTE "Description" {
            DATATYPE  H5T_STRING {
                  STRSIZE 16;
                  STRPAD H5T_STR_NULLTERM;
                  CSET H5T_CSET_ASCII;
                  CTYPE H5T_C_S1;
               }
            DATASPACE  SIMPLE { ( 1 ) / ( 1 ) }
         }
      }
   }
   DATASET "Header" {
      DATATYPE  H5T_COMPOUND {
         H5T_STRING {
            STRSIZE 32;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         } "source";
         H5T_STRING {
            STRSIZE 16;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         } "product";
         H5T_STRING {
            STRSIZE 16;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         } "area";
         H5T_STRING {
            STRSIZE 128;
            STRPAD H5T_STR_NULLTERM;
            CSET H5T_CSET_ASCII;
            CTYPE H5T_C_S1;
         } "projstr";
         H5T_STD_U32LE "iw";
         H5T_STD_U32LE "ih";
         H5T_STD_U16LE "z";
         H5T_IEEE_F32LE "Ax";
         H5T_IEEE_F32LE "Ay";
         H5T_IEEE_F32LE "Bx";
         H5T_IEEE_F32LE "By";
         H5T_STD_U32LE "year";
         H5T_STD_U16LE "month";
         H5T_STD_U16LE "day";
         H5T_STD_U16LE "hour";
         H5T_STD_U16LE "minute";
      }
      DATASPACE  SIMPLE { ( 1 ) / ( 1 ) }
   }
}
}
---<--------------------cut here---------------end--------------------->---

and the grid should have 1120 rows and 760 columns, as displayed by
gdalinfo on the actual data:

---<--------------------cut here---------------start------------------->---
$ gdalinfo HDF5:"ice_conc_nh_200901011200.hdf"://Data/data[00]
Driver: HDF5Image/HDF5 Dataset
Files: none associated
Size is 1120, 760
Coordinate System is:
GEOGCS["WGS 84",
    DATUM["WGS_1984",
        SPHEROID["WGS 84",6378137,298.257223563,
            AUTHORITY["EPSG","7030"]],
        TOWGS84[0,0,0,0,0,0,0],
        AUTHORITY["EPSG","6326"]],
    PRIMEM["Greenwich",0,
        AUTHORITY["EPSG","8901"]],
    UNIT["degree",0.0174532925199433,
        AUTHORITY["EPSG","9108"]],
    AXIS["Lat",NORTH],
    AXIS["Long",EAST],
    AUTHORITY["EPSG","4326"]]
Corner Coordinates:
Upper Left  (    0.0,    0.0)
Lower Left  (    0.0,  760.0)
Upper Right ( 1120.0,    0.0)
Lower Right ( 1120.0,  760.0)
Center      (  560.0,  380.0)
Band 1 Block=1120x1 Type=Float32, ColorInterp=Undefined
  Metadata:
    data[00]:Description=Ice Conc
---<--------------------cut here---------------end--------------------->---

but readGDAL() apparently read the dimensions in the opposite order:

---<--------------------cut here---------------start------------------->---
R> summary(ice)
Object of class SpatialGridDataFrame
Coordinates:
  min  max
x   0 1120
y   0  760
Is projected: NA 
proj4string : [NA]
Number of points: 2
Grid attributes:
  cellcentre.offset cellsize cells.dim
x               0.5        1      1120
y               0.5        1       760
Data attributes:
   Min. 1st Qu.  Median    Mean 3rd Qu.    Max. 
 -32800     -99     -99    -117       0     100 
---<--------------------cut here---------------end--------------------->---

Does this make sense? and could this be the problem?  The data that are
read into the `ice' object do look ok via summary(ice), but the way
they're mapped onto the grid does not.


Cheers,

-- 
Seb




More information about the R-sig-Geo mailing list