[R-sig-Geo] Programmatically convert raster stack in data frame based on polygon extraction

Thiago V. dos Santos thi_veloso at yahoo.com.br
Fri Oct 30 00:14:01 CET 2015


Thanks a lot again, Loïc. It worked exactly as I had planned.
 Greetings,
 -- Thiago V. dos Santos

PhD student
Land and Atmospheric Science
University of Minnesota



On Thursday, October 29, 2015 4:15 PM, Loïc Dutrieux <loic.dutrieux at wur.nl> wrote:
Hi Thiago,

extract() and some data frame manipulation should do the trick. See
comments inline.

Cheers,
Loïc

On 10/29/2015 09:06 PM, Thiago V. dos Santos wrote:
> Hi all,
>
> I am trying to extract temperature values from a raster stack for about 400 municipalities in Brazil. My final goal is to create a data frame that is going to be used as a database for an interactive map server - probably using shiny and leaflet.
>
Cool project

> The final data frame would look like this:
>
>
>> head(df)
> Location         Var Cut Year Month Freq
> Campinas  temperature  10 2010   1    11
> Campinas  temperature  10 2010   2    19
> Campinas  temperature  10 2010   3    30
> Campinas  temperature  10 2010   4    29
> Campinas  temperature  10 2010   5    31
> Campinas  temperature  10 2010   6    30
>
>
> I have global raster stacks with daily data and I am counting, for each month in the raster, the number of days above a certain temperature threshold. Please see below:
>
>
> library(raster)
> library(zoo)
> library(maptools)
# additional packages for data frame manipulation
library(dplyr)
library(tidyr)
>
> # Create a rasterStack similar to my data - same dimensions and layer names
> r <- raster(ncol=360, nrow=180)
> s <- stack(lapply(1:730, function(x) setValues(r, runif(ncell(r),min=0,max=30))))
> idx <- seq(as.Date("2010/1/1"), by = "day", length.out = 730)
> s <- setZ(s, idx)
> s
>
> # Define functions for 10, 15, 20 and 25 degrees - thanks to Loïc's answer to my previous question
> # na.rm is passed by name; passed positionally it would be added to the sum
> fun1 <- function(x, na.rm) {
>   sum(x > 10, na.rm = na.rm)
> }
>
> fun2 <- function(x, na.rm) {
>   sum(x > 15, na.rm = na.rm)
> }
>
> fun3 <- function(x, na.rm) {
>   sum(x > 20, na.rm = na.rm)
> }
>
> fun4 <- function(x, na.rm) {
>   sum(x > 25, na.rm = na.rm)
> }
>
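The four functions above differ only in their threshold, so a small function factory could generate them instead. This is only a sketch in base R: `make_counter` and `counters` are hypothetical names, not part of raster or zoo. It passes `na.rm` by name, because a `TRUE` passed positionally to `sum()` would be counted as an extra value.

```r
# Hypothetical helper: returns a function that counts values above `threshold`.
make_counter <- function(threshold) {
  force(threshold)  # fix the threshold at the time the counter is created
  function(x, na.rm = TRUE) {
    # na.rm is passed by name so it is not summed as a data value
    sum(x > threshold, na.rm = na.rm)
  }
}

cuts <- c(10, 15, 20, 25)
counters <- lapply(cuts, make_counter)
names(counters) <- paste0("days.above.", cuts)

# Quick check on a plain vector with one missing day
temps <- c(5, 12, 18, NA, 27)
sapply(counters, function(f) f(temps))
```

Each element keeps the `(x, na.rm)` signature of the original functions, so something like `zApply(s, by = as.yearmon, fun = counters[["days.above.10"]])` should work in their place.
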
> # Count number of days above the threshold temperature
> days.above.10 <- zApply(s, by=as.yearmon, fun = fun1)
> days.above.15 <- zApply(s, by=as.yearmon, fun = fun2)
> days.above.20 <- zApply(s, by=as.yearmon, fun = fun3)
> days.above.25 <- zApply(s, by=as.yearmon, fun = fun4)
>
>
> Now, what I would like to do is to programmatically extract values for each location in my study area. The locations are defined as a shapefile with municipal contours of the Sao Paulo state in Brazil.
>
>
> In this example, however, just for reproducibility's sake, I will be using a world polygon. But keep in mind that in my actual data the polygons will be much smaller.
>
>
> # Import *sample* polygon data and subset only five "locations"
> data(wrld_simpl)
> locs <- subset(wrld_simpl, wrld_simpl@data$NAME %in% c("Argentina","Bolivia","Brazil","Paraguay","Uruguay"))
>
> # Plot
> plot(days.above.10,1)
> plot(locs,add=T)
>

# Extract values for all polygons
spdf <- raster::extract(days.above.10, locs, fun = median, na.rm = TRUE, 
sp = TRUE)

# There is also a df = TRUE option in extract, but I think it returns only
# the extracted raster values, without binding them to the
# SpatialPolygonsDataFrame attributes.

# Get dataframe out of spdf
df <- spdf@data

df1 <- select(df, NAME, Jan.2010:Dec.2010)
df2 <- gather(df1, period, freq, -NAME)
df3 <- separate(df2, period, into = c('month', 'year'))
# Can you add these columns "manually" or does it need to be automated?
df3$variable <- 'temperature'
df3$cut <- 10

# If you do the same for cut == 15, etc, you can then rbind() them
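
If the steps need to be automated over all cuts, the reshape-and-rbind idea could be sketched roughly as below. This is a base R illustration only: it replaces the extracted `spdf@data` tables with small made-up data frames, `to_long` is a hypothetical helper, and it assumes the period columns are named like `Jan.2010` with English month abbreviations.

```r
# Mock of spdf@data for two thresholds: one row per location, one column per month.
# The values are made up for illustration only.
wide <- list(
  "10" = data.frame(NAME = c("Argentina", "Bolivia"),
                    Jan.2010 = c(11, 29), Feb.2010 = c(19, 28)),
  "15" = data.frame(NAME = c("Argentina", "Bolivia"),
                    Jan.2010 = c(7, 20), Feb.2010 = c(9, 22))
)

# Hypothetical helper: reshape one wide table to long format and tag it with its cut.
to_long <- function(df, cut) {
  periods <- setdiff(names(df), "NAME")
  parts <- strsplit(periods, ".", fixed = TRUE)  # "Jan.2010" -> c("Jan", "2010")
  long <- data.frame(
    Location = rep(df$NAME, times = length(periods)),
    Var      = "temperature",
    Cut      = cut,
    Year     = rep(as.integer(sapply(parts, `[`, 2)), each = nrow(df)),
    Month    = rep(match(sapply(parts, `[`, 1), month.abb), each = nrow(df)),
    Freq     = unlist(df[periods], use.names = FALSE),
    stringsAsFactors = FALSE
  )
  long[order(long$Location, long$Year, long$Month), ]
}

# Apply the helper to every cut and stack the results
all_cuts <- do.call(rbind, Map(to_long, wide, as.numeric(names(wide))))
rownames(all_cuts) <- NULL
head(all_cuts)
```

With the real data, the `wide` list would hold the `@data` slot from one `extract()` call per threshold stack.
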


>
> I feel like half of the work is done, but I am just grappling with the conversion to data frames.
>
> Based on this self-contained example I provided, what would be the best strategy to come up with a data frame per location, like this?
>
>> head(Argentina.df)
> Location         Var Cut Year Month Freq
> Argentina temperature  10 2010     1  11
> Argentina temperature  10 2010     2  19
> Argentina temperature  10 2010     3  30
> Argentina temperature  10 2010     4  12
> Argentina temperature  10 2010     5  17
> Argentina temperature  10 2010     6  14
>
>
>> head(Bolivia.df)
> Location         Var Cut Year Month Freq
> Bolivia   temperature  10 2010     1  29
> Bolivia   temperature  10 2010     2  31
> Bolivia   temperature  10 2010     3  30
> Bolivia   temperature  10 2010     4  17
> Bolivia   temperature  10 2010     5  19
> Bolivia   temperature  10 2010     6  12
>
> and so on.
>
> Note that "cut" refers to the temperature thresholds defined in the functions above. Each cut should come from the equivalent raster stack: days.above.10, days.above.15 and so on.
>
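For one data frame per location, a single long table could be split with base R's `split()`. The sketch below uses a made-up table in the layout shown above; `long` and `by_location` are hypothetical names, and the result is kept as a named list rather than separate objects such as `Argentina.df`.

```r
# Made-up long table in the Location/Var/Cut/Year/Month/Freq layout.
long <- data.frame(
  Location = rep(c("Argentina", "Bolivia"), each = 3),
  Var = "temperature", Cut = 10, Year = 2010,
  Month = rep(1:3, times = 2),
  Freq = c(11, 19, 30, 29, 31, 30),
  stringsAsFactors = FALSE
)

# One data frame per location, stored in a named list
by_location <- split(long, long$Location)
names(by_location)
head(by_location[["Bolivia"]])
```

For an interactive map it may be simpler to keep the single long table and filter it by `Location` on demand.
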
> I much appreciate any input.
> Greetings,
>   -- Thiago V. dos Santos
>
> PhD student
> Land and Atmospheric Science
> University of Minnesota
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
