[R-sig-Geo] Generating uncertainty maps with Monte-Carlo simulation and large rasters
Mercedes Román
mercetadzio at gmail.com
Tue Oct 11 13:37:35 CEST 2016
Dear list members,
I am trying to generate uncertainty maps of available water capacity
predicted with a pedotransfer function via Monte-Carlo simulation. I have
the raster layers with the mean and SD of the input variables, so I can
sample from their distributions (I ignore the correlations among variables,
and consider their distributions as univariate in this exercise). All I
want is to apply Monte-Carlo simulation for each cell of the raster. I
found very useful some of the previous postings on this subject with the
mc2d package (
http://r-sig-geo.2731867.n2.nabble.com/parallel-raster-processing-with-calc-and-mc2d-monte-carlo-simulation-td7587901.html#a7587902).
Finally, I found that it was quicker to use the package lhs, generate a
latin hypercube sample from a uniform distribution, and then obtain the
sample for each of my variables with qnorm. I give an example of my
function bellow. The PTF is a bit long; you can ignore most of it and just
look at the beginning, where I sample from the distributions, and at the
end (the output layers with the prediction interval limits).
Applied to a small raster it is pretty quick. The problem is that I have to
apply the function to 300 raster stacks of 2805374 cells each, and that is
going to take a long time even with parallel computing using clusterR (a
couple of weeks with a computer like mine).
I imagine some of you have dealt with similar problems. Do you have any
advice for sampling from probability distributions in monte-carlo
simulations from big raster files? Or any advice for generating uncertainty
maps?
I believe much of the time is due to accessing the data (getValues). So I
tried to do an alternative function, by extracting the data of the raster
stack first into a dataframe, but that did not seem to save time either.
Thank you in advance for all your help.
Kind regards,
Mercedes Roman
require(raster)
require(sp)
require(rgdal)
library(lhs)
library(parallel)
library(doParallel)
library(snow)
### Generate example rasters
r <- raster(nrow=10, ncol=10)
set.seed(23850)
s1 <- setValues(r,runif(n = 100, min = 0, max=1000))
s2 <- setValues(r,runif(n = 100, min = 0, max=100))
s3 <- setValues(r,runif(n = 100, min = 0, max=1000))
s4 <- setValues(r,runif(n = 100, min = 0, max=20))
s5 <- setValues(r,runif(n = 100, min = 0, max=300))
s6 <- setValues(r,runif(n = 100, min = 0, max=20))
s7 <- setValues(r,runif(n = 100, min = 0, max=200))
s8 <- setValues(r,runif(n = 100, min = 0, max=20))
s9 <- setValues(r,runif(n = 100, min = 0, max=14))
s10 <- setValues(r,runif(n = 100, min = 0, max=1))
s<-stack(s1, s2,s3,s4,s5,s6,s7,s8,s9,s10)
plot(s)
# Write the PTF function, sample with package lhs
func_lhs <- function(x){
###How many simulations we want?
simulations <- 1000
sLHS <-randomLHS(n=simulations,k=5) ### Draws a Latin Hypercube Sample
from a set of uniform distribution
### Transform the values into normal distribution with the mean and sd
of each variable
### sand
sim_sand <- qnorm(sLHS[,1], mean = x[[1]], sd = x[[2]])
### constrained to be between 0 and 1000
sim_sand <- ifelse(sim_sand<0, 0, ifelse(sim_sand>1000, 1000, sim_sand))
### silt
sim_silt <- qnorm(sLHS[,2], mean = x[[3]], sd = x[[4]])
### constrained to be between 0 and 1000
sim_silt <- ifelse(sim_silt<0, 0, ifelse(sim_silt>1000, 1000, sim_silt))
### SOC
sim_soc <- qnorm(sLHS[,3], mean = x[[5]], sd = x[[6]])
### constrained to be between 0 and 1000
sim_soc <- ifelse(sim_soc<0, 0, ifelse(sim_soc>1000, 1000, sim_soc))
### CEC
sim_cec <- qnorm(sLHS[,4], mean = x[[7]], sd = x[[8]])
### constrained to be > 0
sim_cec <- ifelse(sim_cec<0, 0, sim_cec)
### pH
sim_ph <- qnorm(sLHS[,5], mean = x[[9]], sd = x[[10]])
### constrained to be betweeen 0 and 14
sim_ph <- ifelse(sim_ph<0, 0, ifelse(sim_ph>14, 14, sim_ph))
### Apply the PTF
### Define the functions
### Because from the simulations we cannot guarantee that the sum of
texture fractions is 1000, we calculate clay as the difference 1000 - (sand
+ silt)
sim_clay <- 1000 - (sim_sand + sim_silt)
### What if silt and sand sum > 1000?
sim_clay <- ifelse(sim_clay <0, 0, sim_clay)
### To calculate the first parameter, SMres, the residual water content
(cm³ cm-3), we need sand content, as %
### GSM has sand as g/kg - So the 2% by Toth et al is 20 g/kg here
SMres <- ifelse(is.na(sim_sand), NA, ifelse(sim_sand >= 20, 0.041,
0.179))
### Now for SMs, the saturated water content (cm³ cm-3)
### SMs = 0.5056 - 0.1437 * (1/(OC+1)) + 0.0004152 * Si
SMsat <- 0.5056 - (0.1437 * (1/((0.1* sim_soc)+1))) + (0.0004152 *
sim_silt * 0.1)
#### the parameter alpha
#### log10(alpha) = -1.3050 - 0.0006123 * Si - 0.009810 * Cl + 0.07611
* (1/(OC+1)) - 0.0004508 * Si * Cl + 0.03472 * Cl * (1/(OC+1)) - 0.01226 *
Si * (1/(OC+1))
logAlpha <- -1.3050 - (0.0006123 * sim_silt * 0.1) - (0.009810 *
sim_clay * 0.1) + (0.07611 * (1/((0.1*sim_soc)+1))) - (0.0004508 * sim_silt
* sim_clay *0.01) + (0.03472 * sim_clay * 0.1 * (1/((sim_soc*0.1)+1))) -
(0.01226 * sim_silt * 0.1 * (1/((sim_soc*0.1)+1)))
alpha <- 10^logAlpha
### the parameter n
### log10(n-1) = 0.01516 - 0.005775 * (1/(OC+1)) - 0.24885 * log10(CEC)
- 0.01918 * Cl - 0.0005052 * Si - 0.007544 * pH2
### - 0.02159 * Cl * (1/(OC+1)) + 0.01556 * Cl * log10(CEC) + 0.01477 *
(1/(OC+1)) * pH2 + 0.0001121 * Si * Cl - 0.33198 * (1/(OC+1)) * log10(CEC)
logN_1 <- 0.01516 - (0.005775 * (1/((sim_soc * 0.1)+1))) - (0.24885 *
log10(sim_cec)) - (0.01918 * sim_clay * 0.1) - (0.0005052 * sim_silt * 0.1)
- (0.007544 * (sim_ph^2)) - (0.02159 * 0.1 * sim_clay * (1/((0.1 *
sim_soc)+1))) + (0.01556 * 0.1 * sim_clay * log10(sim_cec)) + (0.01477 *
(1/((0.1 * sim_soc)+1)) * (sim_ph^2)) + (0.0001121 * 0.01 * sim_silt *
sim_clay) - (0.33198 * (1/((0.1*sim_soc)+1)) * log10(sim_cec))
n <- (10^logN_1)+1
### define the soil matric potential (cm of water column)
#h <- 101.9716 ### soil matric potential in cm of water column
----- 10 kPa equivalent to 101.9716 cmH2O
### We fit the Mualem-van Genuchten equation for theta at field capacity
smFC <- SMres +((SMsat-SMres)/((1+((alpha*101.9716)^n))^(1-(1/n))))
smFC <- ifelse(smFC < 0, 0, smFC)
##### Soil moisture at permanent wilting point
##### thWP = 0.09878 + 0.002127* Cl - 0.0008366 * Si - 0.07670
*(1/(OC+1)) + 0.00003853 * Si * Cl + 0.002330 * Cl * (1/(OC+1)) + 0.0009498
* Si * (1/(OC+1))
smWP <- 0.09878 + (0.002127*sim_clay*0.1) - (0.0008366*sim_silt*0.1) -
(0.07670 *(1/((sim_soc*0.1)+1))) + (0.00003853 * sim_silt*sim_clay*0.01) +
(0.002330 * sim_clay*0.1 * (1/((sim_soc*0.1)+1))) + (0.0009498 *
sim_silt*0.1 * (1/((sim_soc*0.1)+1)))
smWP <- ifelse(smWP < 0, 0, smWP)
### Calculate the AWC as difference of both
AWC <- smFC - smWP
### what if smWP > smFC?
AWC <- ifelse(AWC<0,0,AWC)
### Calculate the PI
AWC_lower <-quantile(AWC, probs =.05, na.rm=TRUE)
AWC_upper <-quantile(AWC, probs =.95, na.rm=TRUE)
smFC_lower <-quantile(smFC, probs =.05, na.rm=TRUE)
smFC_upper <-quantile(smFC, probs =.95, na.rm=TRUE)
smWP_lower <-quantile(smWP, probs =.05, na.rm=TRUE)
smWP_upper <-quantile(smWP, probs =.95, na.rm=TRUE)
all_results <- c(AWC_lower, AWC_upper, smFC_lower, smFC_upper,
smWP_lower, smWP_upper ) ### Returns a nlayers raster stack. Ordered as
smFC, smWP, and AWC
return(all_results)
}
### Apply function with clusterR
ff <- function(x) calc(x, func_lhs)
beginCluster(n = 8, type='SOCK')
little_test <- clusterR(s, ff, export='func_lhs')
endCluster()
[[alternative HTML version deleted]]
More information about the R-sig-Geo
mailing list