[R-sig-Geo] Programmatically convert raster stack in data frame based on polygon extraction
Loïc Dutrieux
loic.dutrieux at wur.nl
Thu Oct 29 22:13:46 CET 2015
Hi Thiago,
extract() and some dataframe manipulation should do the trick. See
comments in line.
Cheers,
Loïc
On 10/29/2015 09:06 PM, Thiago V. dos Santos wrote:
> Hi all,
>
> I am trying to extract temperature values from a raster stack for about 400 municipalities in Brazil. My final goal is to create a data frame that is going to be used as a database for an interactive map server - probably using shiny and leaflet.
>
Cool project
> The final data frame would look like this:
>
>
>> head(df)
> Location Var Cut Year Month Freq
> Campinas temperature 10 2010 1 11
> Campinas temperature 10 2010 2 19
> Campinas temperature 10 2010 3 30
> Campinas temperature 10 2010 4 29
> Campinas temperature 10 2010 5 31
> Campinas temperature 10 2010 6 30
>
>
> I have global raster stacks with daily data and I am counting, for each month in the raster, the number of days above certain temperature threshold. Please see below:
>
>
> library(raster)
> library(zoo)
> library(maptools)
# additional packages for dataframes manipulation
library(dplyr)
library(tidyr)
>
> # Create a rasterStack similar to my data - same dimensions and layer namesr <- raster(ncol=360, nrow=180)
> s <- stack(lapply(1:730, function(x) setValues(r, runif(ncell(r),min=0,max=30))))
> idx <- seq(as.Date("2010/1/1"), by = "day", length.out = 730)
> s <- setZ(s, idx)
> s
>
> # Define functions for 10, 15, 20 and 25 degrees - Thanks Loïc in my previous question
> fun1 <- function(x, na.rm) {
> sum(x > 10, na.rm)
> }
>
> fun2 <- function(x, na.rm) {
> sum(x > 15, na.rm)
> }
>
> fun3 <- function(x, na.rm) {
> sum(x > 20, na.rm)
> }
>
> fun4 <- function(x, na.rm) {
> sum(x > 25, na.rm)
> }
>
> # Count number of days above the threshold temperature
> days.above.10 <- zApply(s, by=as.yearmon, fun = fun1)
> days.above.15 <- zApply(s, by=as.yearmon, fun = fun2)
> days.above.20 <- zApply(s, by=as.yearmon, fun = fun3)
> days.above.25 <- zApply(s, by=as.yearmon, fun = fun4)
>
>
> Now, what I would like to do is to programmatically extract values for each location on my study area. The locations are defined as a shapefile with municipal contours of the Sao Paulo state in Brazil.
>
>
> In this example, however, just for reproducibility's sake, I will be using a world polygon. But keep in mind that in my actual data the polygons will be much smaller.
>
>
> # Import *sample* polygon data and subset only five "locations"
> data(wrld_simpl)
> locs <- subset(wrld_simpl, wrld_simpl at data$NAME %in% c("Argentina","Bolivia","Brazil","Paraguay","Uruguay"))
>
> # Plot
> plot(days.above.10,1)
> plot(locs,add=T)
>
# Extract values for all polygons
spdf <- raster::extract(days.above.10, locs, fun = median, na.rm = TRUE,
sp = TRUE)
# There is also a df = TRUE option in extract, but it returns only the
extracted raster values, without binding them with the
spatialpolygondataframe attributes. I think
# Get dataframe out of spdf
df <- spdf at data
df1 <- select(df, NAME, Jan.2010:Dec.2010)
df2 <- gather(df1, period, freq, -NAME)
df3 <- separate(df2, period, into = c('month', 'year'))
# Can you add these columns "manually" or does it need to be automated?
df3$variable <- 'temperature'
df3$cut <- 10
# If you do the same for cut == 15, etc, you can then rbind() them
>
> I feel like half of the work is done, but I am just grasping with the conversion to data frames.
>
> Based on this self-contained example I provided, what would be the best strategy to come out with a data frame per location, like this?
>
>> head(Argentina.df)
> Location Var Cut Year Month Freq
> Argentina temperature 10 2010 1 11
> Argentina temperature 10 2010 2 19
> Argentina temperature 10 2010 3 30
> Argentina temperature 10 2010 4 12
> Argentina temperature 10 2010 5 17
> Argentina temperature 10 2010 6 14
>
>
>> head(Bolivia.df)
> Location Var Cut Year Month Freq
> Bolivia temperature 10 2010 1 29
> Bolivia temperature 10 2010 2 31
> Bolivia temperature 10 2010 3 30
>
> Bolivia temperature 10 2010 4 17
> Bolivia temperature 10 2010 5 19
> Bolivia temperature 10 2010 6 12
>
> and so on.
>
> Note that "cut" refers to the temperature thresholds defined in the functions above. Each cut should come from the equivalent raster stack: days.above.10, days.above.15 and so on.
>
> I much appreciate any input.
> Greetings,
> -- Thiago V. dos Santos
>
> PhD student
> Land and Atmospheric Science
> University of Minnesota
>
> _______________________________________________
> R-sig-Geo mailing list
> R-sig-Geo at r-project.org
> https://stat.ethz.ch/mailman/listinfo/r-sig-geo
>
More information about the R-sig-Geo
mailing list