[R-sig-Geo] Generating uncertainty maps with Monte-Carlo simulation and large rasters

Mercedes Román mercetadzio at gmail.com
Tue Oct 11 13:37:35 CEST 2016


Dear list members,



I am trying to generate uncertainty maps of available water capacity
predicted with a pedotransfer function via Monte-Carlo simulation. I have
the raster layers with the mean and SD of the input variables, so I can
sample from their distributions (I ignore the correlations among variables,
and consider their distributions as univariate in this exercise). All I
want is to apply Monte-Carlo simulation for each cell of the raster. I
found very useful some of the previous postings on this subject with the
mc2d package (
http://r-sig-geo.2731867.n2.nabble.com/parallel-raster-processing-with-calc-and-mc2d-monte-carlo-simulation-td7587901.html#a7587902).
Finally, I found that it was quicker to use the package lhs, generate a
latin hypercube sample from a uniform distribution, and then obtain the
sample for each of my variables with qnorm. I give an example of my
function bellow. The PTF is a bit long; you can ignore most of it and just
look at the beginning, where I sample from the distributions, and at the
end (the output layers with the prediction interval limits).



Applied to a small raster it is pretty quick. The problem is that I have to
apply the function to 300 raster stacks of 2805374 cells each, and that is
going to take a long time even with parallel computing using clusterR (a
couple of weeks with a computer like mine).



I imagine some of you have dealt with similar problems. Do you have any
advice for sampling from probability distributions in monte-carlo
simulations from big raster files? Or any advice for generating uncertainty
maps?



I believe much of the time is due to accessing the data (getValues). So I
tried to do an alternative function, by extracting the data of the raster
stack first into a dataframe, but that did not seem to save time either.



Thank you in advance for all your help.



Kind regards,



Mercedes Roman







require(raster)

require(sp)

require(rgdal)

library(lhs)

library(parallel)

library(doParallel)

library(snow)





### Generate example rasters

r <- raster(nrow=10, ncol=10)

set.seed(23850)



s1 <- setValues(r,runif(n = 100, min = 0, max=1000))

s2 <- setValues(r,runif(n = 100, min = 0, max=100))

s3 <- setValues(r,runif(n = 100, min = 0, max=1000))

s4 <- setValues(r,runif(n = 100, min = 0, max=20))

s5 <- setValues(r,runif(n = 100, min = 0, max=300))

s6 <- setValues(r,runif(n = 100, min = 0, max=20))

s7 <- setValues(r,runif(n = 100, min = 0, max=200))

s8 <- setValues(r,runif(n = 100, min = 0, max=20))

s9 <- setValues(r,runif(n = 100, min = 0, max=14))

s10 <- setValues(r,runif(n = 100, min = 0, max=1))



s<-stack(s1, s2,s3,s4,s5,s6,s7,s8,s9,s10)

plot(s)



# Write the PTF function, sample with package lhs

func_lhs <- function(x){



    ###How many simulations we want?

    simulations <- 1000

    sLHS <-randomLHS(n=simulations,k=5) ### Draws a Latin Hypercube Sample
from a set of uniform distribution



    ### Transform the values into normal distribution with the mean and sd
of each variable

    ### sand

    sim_sand <- qnorm(sLHS[,1], mean = x[[1]], sd = x[[2]])

    ### constrained to be between 0 and 1000

    sim_sand <- ifelse(sim_sand<0, 0, ifelse(sim_sand>1000, 1000, sim_sand))

    ### silt

    sim_silt <- qnorm(sLHS[,2], mean = x[[3]], sd = x[[4]])

    ### constrained to be between 0 and 1000

    sim_silt <- ifelse(sim_silt<0, 0, ifelse(sim_silt>1000, 1000, sim_silt))

    ### SOC

    sim_soc <- qnorm(sLHS[,3], mean = x[[5]], sd = x[[6]])

    ### constrained to be between 0 and 1000

    sim_soc <- ifelse(sim_soc<0, 0, ifelse(sim_soc>1000, 1000, sim_soc))

    ### CEC

    sim_cec <- qnorm(sLHS[,4], mean = x[[7]], sd = x[[8]])

    ### constrained to be > 0

    sim_cec <- ifelse(sim_cec<0, 0, sim_cec)

    ### pH

    sim_ph <- qnorm(sLHS[,5], mean = x[[9]], sd = x[[10]])

    ### constrained to be betweeen 0 and 14

    sim_ph <- ifelse(sim_ph<0, 0, ifelse(sim_ph>14, 14, sim_ph))



    ### Apply the PTF

    ### Define the functions



    ### Because from the simulations we cannot guarantee that the sum of
texture fractions is 1000, we calculate clay as the difference 1000 - (sand
+ silt)

    sim_clay <- 1000 - (sim_sand + sim_silt)

    ### What if silt and sand sum > 1000?

    sim_clay <- ifelse(sim_clay <0, 0, sim_clay)



    ### To calculate the first parameter, SMres, the residual water content
(cm³ cm-3), we need sand content, as %

    ### GSM has sand as g/kg - So the 2% by Toth et al is 20 g/kg here

    SMres <- ifelse(is.na(sim_sand), NA, ifelse(sim_sand >= 20, 0.041,
0.179))



    ### Now for SMs, the saturated water content (cm³ cm-3)

    ### SMs = 0.5056 - 0.1437 * (1/(OC+1)) + 0.0004152 * Si

    SMsat <- 0.5056 - (0.1437 * (1/((0.1* sim_soc)+1))) + (0.0004152 *
sim_silt * 0.1)



    #### the parameter alpha

    #### log10(alpha) = -1.3050 - 0.0006123 * Si - 0.009810 * Cl + 0.07611
* (1/(OC+1)) - 0.0004508 * Si * Cl + 0.03472 * Cl * (1/(OC+1)) - 0.01226 *
Si * (1/(OC+1))

    logAlpha <- -1.3050 - (0.0006123 * sim_silt * 0.1) - (0.009810 *
sim_clay * 0.1) + (0.07611 * (1/((0.1*sim_soc)+1))) - (0.0004508 * sim_silt
* sim_clay *0.01) + (0.03472 * sim_clay * 0.1 * (1/((sim_soc*0.1)+1))) -
(0.01226 * sim_silt * 0.1 * (1/((sim_soc*0.1)+1)))

    alpha <-  10^logAlpha



    ### the parameter n

    ### log10(n-1) = 0.01516 - 0.005775 * (1/(OC+1)) - 0.24885 * log10(CEC)
- 0.01918 * Cl - 0.0005052 * Si - 0.007544 * pH2

    ### - 0.02159 * Cl * (1/(OC+1)) + 0.01556 * Cl * log10(CEC) + 0.01477 *
(1/(OC+1)) * pH2 + 0.0001121 * Si * Cl - 0.33198 * (1/(OC+1)) * log10(CEC)

    logN_1 <- 0.01516 - (0.005775 * (1/((sim_soc * 0.1)+1))) - (0.24885 *
log10(sim_cec)) - (0.01918 * sim_clay * 0.1) - (0.0005052 * sim_silt * 0.1)
- (0.007544 * (sim_ph^2)) - (0.02159 * 0.1 * sim_clay * (1/((0.1 *
sim_soc)+1))) + (0.01556 * 0.1 * sim_clay * log10(sim_cec)) + (0.01477 *
(1/((0.1 * sim_soc)+1)) * (sim_ph^2)) + (0.0001121 * 0.01 * sim_silt *
sim_clay) - (0.33198 * (1/((0.1*sim_soc)+1)) * log10(sim_cec))

    n <- (10^logN_1)+1



    ### define the soil matric potential (cm of water column)

    #h <- 101.9716       ### soil matric potential in cm of water column
----- 10 kPa equivalent to 101.9716 cmH2O



    ### We fit the Mualem-van Genuchten equation for theta at field capacity

    smFC <- SMres +((SMsat-SMres)/((1+((alpha*101.9716)^n))^(1-(1/n))))

    smFC <- ifelse(smFC < 0, 0, smFC)



    ##### Soil moisture at permanent wilting point

    ##### thWP =  0.09878 + 0.002127* Cl - 0.0008366 * Si - 0.07670
*(1/(OC+1)) + 0.00003853 * Si * Cl + 0.002330 * Cl * (1/(OC+1)) + 0.0009498
* Si * (1/(OC+1))

    smWP <- 0.09878 + (0.002127*sim_clay*0.1) - (0.0008366*sim_silt*0.1) -
(0.07670 *(1/((sim_soc*0.1)+1))) + (0.00003853 * sim_silt*sim_clay*0.01) +
(0.002330 * sim_clay*0.1 * (1/((sim_soc*0.1)+1))) + (0.0009498 *
sim_silt*0.1 * (1/((sim_soc*0.1)+1)))

    smWP <- ifelse(smWP < 0, 0, smWP)



    ### Calculate the AWC as difference of both

    AWC <- smFC - smWP

    ### what if smWP > smFC?

    AWC <- ifelse(AWC<0,0,AWC)



    ### Calculate the PI

    AWC_lower <-quantile(AWC, probs =.05, na.rm=TRUE)

    AWC_upper <-quantile(AWC, probs =.95, na.rm=TRUE)



    smFC_lower <-quantile(smFC, probs =.05, na.rm=TRUE)

    smFC_upper <-quantile(smFC, probs =.95, na.rm=TRUE)



    smWP_lower <-quantile(smWP, probs =.05, na.rm=TRUE)

    smWP_upper <-quantile(smWP, probs =.95, na.rm=TRUE)



    all_results <- c(AWC_lower, AWC_upper, smFC_lower, smFC_upper,
smWP_lower, smWP_upper )  ### Returns a nlayers raster stack. Ordered as
smFC, smWP, and AWC



    return(all_results)

}





### Apply function with clusterR

ff <- function(x) calc(x, func_lhs)



beginCluster(n = 8, type='SOCK')

little_test <- clusterR(s, ff, export='func_lhs')

endCluster()

	[[alternative HTML version deleted]]



More information about the R-sig-Geo mailing list