[R-sig-Geo] Problem in extracting data from GRIB files

Michael Sumner mdsumner at gmail.com
Mon Jul 31 16:31:29 CEST 2017


The print-out of rdata is good, but I can't really address the issue you
say about NAs without knowing what is in "pt_dams". For that we need to
know what kind of thing it is:

   class(pt_dams)

what projection it's in

   projection(pt_dams)

and it's extent, and unfortunately for that you need to know what kind of
thing it is so either

    extent(pt_dams)  ## assuming it's a classed spatial-kind-of-thing

or this will vaguely give the idea

    head(pt_dams)  ## assuming it's a matrix or data frame

You've shown a lot of code but I think we only need to concentrated on
"rdata", as a representative of the raster data all your files will have,
and pt_dams and this line:

     data_dams_size<-extract(rdata,pts_dams)

(It's also possible to make the entire task much faster by building an
index at that first step, possibly with cellnumbers = TRUE

But, you'll have to make sure that extract line is doing what you need it
to first. It's very hard to do this by email, and you really haven't honed
into the key details yet. If the coordinates in pt_dams don't fall within
the extent of rdata, then you're in trouble (or just in the wrong
projection, probably). raster makes life easy if you know the coordinates
are in the same space as the raster, but none of this metadata stuff is in
any way consistent or complete so as ever "it depends".

Cheers, Mike.





On Mon, 31 Jul 2017 at 20:13 Miluji Sb <milujisb at gmail.com> wrote:

> Dear Mike,
>
> Thank you very much for your suggestion, apologies for the delayed reply -
> I did not have access to internet over the weekend.
>
> Here is the print-out for rdata:
>
> class       : RasterStack
> dimensions  : 150, 360, 54000, 8  (nrow, ncol, ncell, nlayers)
> resolution  : 1, 1  (x, y)
> extent      : -180, 180, -60, 90  (xmin, xmax, ymin, ymax)
> coord. ref. : +proj=longlat +a=6367470 +b=6367470 +no_defs
> names       : GLDAS_NOA//215507.020, GLDAS_NOA//215507.020,
> GLDAS_NOA//215507.020, GLDAS_NOA//215507.020, GLDAS_NOA//215507.020,
> GLDAS_NOA//215517.020, GLDAS_NOA//215517.020, GLDAS_NOA//215517.020
>
> I also made this change [rdata <-  raster::stack(lapply(filelist,
> function(x) raster(x,band=12)))] but still getting NAs. The full (ugly)
> code is below
>
> I would be grateful for any help. Thank you again!
>
> Sincerely,
>
> Milu
>
> ####
> for (y in year) {
>   setwd (y)
>
>   day <- dir(path=getwd(),full.names=TRUE, recursive=FALSE,pattern =
> "^[0-9]" )
>
>   n=0
>
>   for(d in day) {
>     setwd (d)
>
>     n=n+1
>
>
>     filelist<-list.files(pattern=".grb$")
>
>     rdata <-  raster::stack(lapply(filelist, function(x)
> raster(x,band=12)))
>
>     data_dams_size<-extract(rdata,pts_dams)
>     DF_data_dams_size<- as.data.frame(data_dams_size)
>
>     #Join ISO3, lon, lat, dams data, climate data
>     joined_dams <- cbind(foo_dams, DF_data_dams_size)
>
>     #Keep only ISO3  LON  LAT  id
>     joined_dams_reduced <- joined_dams
>     dams_codes <- joined_dams[,c(1:3)]
>
>     names(joined_dams_reduced) <- gsub("GLDAS_NOAH10_3H.A", "",
> names(joined_dams_reduced))
>     names(joined_dams_reduced) <- gsub(".020", "",
> names(joined_dams_reduced))
>
>     #From mm/s to mm_day
>     joined_dams_reduced[joined_dams_reduced==9999] <- NA
>
>     ##for prec, snow and runoff use:
>     #joined_dams_reduced$mm_day<-rowSums(joined_dams_reduced[,4:11])*10800
>
>     ##for temperature use:
>     joined_dams_reduced$mm_day<-rowMeans(joined_dams_reduced[,4:11])-273.15
>
>     joined_mm_day<-joined_dams_reduced[,12]
>
>     #Names
>     yr_name <- substr(filelist,18,21)
>     csvname <- paste(paste(yr_name, sep="_"), ".csv", sep="")
>     csvname <- csvname[1]
>
>     #Stack days to get full year
>     if (n==1) {joined_mm_day_year <- joined_mm_day} else
> {joined_mm_day_year <- cbind(joined_mm_day_year, joined_mm_day)}
>
>   }
>
>   #Join mm_day with dams codes
>   joined_mm_day_year_names=cbind(dams_codes,joined_mm_day_year)
>
>   newnames <-t(as.data.frame(c(paste ("day",
> seq(1,(ncol(joined_mm_day_year_names)-3),1), sep="_"))))
>   names(joined_mm_day_year_names)[4:ncol(joined_mm_day_year_names)]=
> newnames
>
>
>   write.csv(joined_mm_day_year_names,
> file=file.path("C:/Users/Desktop/temp/",csvname))
>
> }
>
>
> On Sat, Jul 29, 2017 at 12:36 AM, Michael Sumner <mdsumner at gmail.com>
> wrote:
>
>>
>>
>> On Sat, 29 Jul 2017 at 02:21 Miluji Sb <milujisb at gmail.com> wrote:
>>
>>> Dear all,
>>>
>>> I have a set of coordinates for cities:
>>>
>>> structure(list(lon = c(3.7174243, 3.2246995, 2.928656, 33.3822764,
>>> 10.40237, 24.7535746, 7.2619532, -0.370679, -4.486076, -4.097899
>>> ), lat = c(51.0543422, 51.209348, 51.21543, 35.1855659, 55.403756,
>>> 59.4369608, 43.7101728, 49.182863, 48.390394, 47.997542)), .Names =
>>> c("lon",
>>> "lat"), na.action = structure(c(5L, 6L, 31L, 55L, 59L, 61L, 67L,
>>> 68L), .Names = c("5", "6", "31", "55", "59", "61", "67", "68"
>>> ), class = "omit"), row.names = c(1L, 2L, 3L, 4L, 7L, 8L, 9L,
>>> 10L, 11L, 12L), class = "data.frame")
>>>
>>> I am trying to extract climatic data from GLDAS climatic data (1 degree
>>> x 1
>>> degree) GRIB files with the following code:
>>>
>>> # Convert to spatial points
>>> pts_dams <- SpatialPoints(lonlat,proj4string=CRS("+proj=longlat"))
>>>
>>> # Extract
>>>  filelist<-list.files(pattern=".grb$")
>>>
>>>
>> This part is a bit wrong:
>>
>>
>>>     data<-stack(filelist[1])
>>>     data at layers<-sapply(filelist, function(x) raster(x,band=12))
>>>
>>
>> Better would be
>>
>> rdata <-  raster::stack(lapply(filelist, function(x) raster(x,band=12)))
>>
>> For the rest we will need to see details about the data, so please do
>>
>> print(rdata)
>>
>> and send the resulting print-out.
>>
>>  A minor thing "data" is already a function, so it's advisable not to use
>> it as a name - if  only to reduce possible confusion.
>>
>> Also, it's not obvious that the process by which raster(file) goes though
>> (it passes down to rgdal::readGDAL) will result in a correct interpretation
>> as the data source was intended, since GRIB is a domain-specific format for
>> time-series and volume-slice series data and the required translation to
>> the GDAL and raster data models is not always straight-forward.
>>
>> It totally can work though, I do it routine with many sources but they
>> all need some oversight to ensure that upfront translation is complete and
>> appropriate.
>>
>> It may just require setting raster::extent and/or raster::projection, but
>> there's no way to know without exploration.
>>
>> Cheers, Mike.
>>
>>
>>
>>>     data_dams_size<-extract(data,pts_dams)
>>>     DF_data_dams_size<- as.data.frame(data_dams_size)
>>>
>>> Unfortunately, I get mostly NAs for the data, it seems that there's an
>>> issue with the CRS projections for the city coordinates. Is there a
>>> specific projection for city level coordinates? Or am I doing something
>>> completely wrong? Thank you!
>>>
>>> Sincerely,
>>>
>>> Milu
>>>
>>>         [[alternative HTML version deleted]]
>>>
>>> _______________________________________________
>>> R-sig-Geo mailing list
>>> R-sig-Geo at r-project.org
>>> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>>>
>> --
>> Dr. Michael Sumner
>> Software and Database Engineer
>> Australian Antarctic Division
>> 203 Channel Highway
>> Kingston Tasmania 7050 Australia
>>
>>
> --
Dr. Michael Sumner
Software and Database Engineer
Australian Antarctic Division
203 Channel Highway
Kingston Tasmania 7050 Australia

	[[alternative HTML version deleted]]



More information about the R-sig-Geo mailing list