[R] : Quantile and rowMean from multiple files in a folder

Thu Apr 17 18:23:17 CEST 2014

Hi AK,
Thanks very much for the updated code.
My simulated results are even more consistent with observations after apply the updated version of the code.

Cheers,
Atem.

On Wednesday, April 16, 2014 11:31 PM, Zilefac Elvis <zilefacelvis at yahoo.com> wrote:
Hi AK,
Thanks very much.
Atem.

On Wednesday, April 16, 2014 9:32 PM, arun <smartpink111 at yahoo.com> wrote:
Hi,
Use this code after `lst2`.
lapply(seq_along(lst2), function(i) {
    lstN <- lapply(lst2[[i]], function(x) {
        datN <- as.data.frame(matrix(NA, nrow = 101, ncol = length(names1), dimnames = list(NULL, 
            names1)))
        x1 <- x[, -1]
        qt <- numcolwise(function(y) quantile(y, seq(0, 1, by = 0.01), na.rm = TRUE))(x1)
        datN[, match(names(x1), names(datN))] <- qt
        datN
    })
    arr1 <- array(unlist(lstN), dim = c(dim(lstN[[1]]), length(lstN)), dimnames = list(NULL, 
        names1))
    res <- rowMeans(arr1, dims = 2, na.rm = TRUE)
    colnames(res) <- gsub(" ", "_", colnames(res))
    res1 <- data.frame(Percentiles = paste0(seq(0, 100, by = 1), "%"), res, stringsAsFactors = FALSE)
    write.csv(res1, paste0(paste(getwd(), "final", paste(names(lst1)[[i]], "Quantile", 
        sep = "_"), sep = "/"), ".csv"), row.names = FALSE, quote = FALSE)
})

ReadOut1 <- lapply(list.files(recursive = TRUE)[grep("Quantile", list.files(recursive = TRUE))], 
    function(x) read.csv(x, header = TRUE, stringsAsFactors = FALSE))
sapply(ReadOut1, function(x) dim(x)) 

lstNew <- simplify2array(ReadOut1)
nrow(lstNew)
#[1] 258
dir.create("Indices")
lapply(2:nrow(lstNew), function(i) {
    dat1 <- data.frame(Percentiles = lstNew[1], do.call(cbind, 
        lstNew[i, ]), stringsAsFactors = FALSE)
    colnames(dat1) <- c("Percentiles", paste(names(lst2), rep(rownames(lstNew)[i], 
        length(lst2)), sep = "_"))
    write.csv(dat1, paste0(paste(getwd(), "Indices", gsub(" ", "_", rownames(lstNew)[i]), 
        sep = "/"), ".csv"), row.names = FALSE, quote = FALSE)
})

## Output2:
ReadOut2 <- lapply(list.files(recursive = TRUE)[grep("Indices", list.files(recursive = TRUE))], 
    function(x) read.csv(x, header = TRUE, stringsAsFactors = FALSE))
names(ReadOut2) <- gsub(".*\\/(.*)\\.csv","\\1",list.files(recursive = TRUE)[grep("Indices", list.files(recursive = TRUE))])
ReadOut2$pint_DJF[1:3,1:3]
#  Percentiles G100_pint_DJF G101_pint_DJF
#1          0%      0.982001      1.020892
#2          1%      1.005563      1.039288
#3          2%      1.029126      1.057685
any(is.na(ReadOut2$pint_DJF))
 [1] FALSE 
A.K.

On Wednesday, April 16, 2014 12:34 PM, Zilefac Elvis <zilefacelvis at yahoo.com> wrote:
Hi AK,
I tried the updated "Quantilecode.txt". It works well but when I open the files in "Indices", I find some columns filled with NAs. This should not be the case given that I am working with simulations and there are no missing values in the process. The ##not correct section yielded no NAs. Check for example, pint_..._DJF in "Indices".

Let be be sure we are in the same page. I removed the ##not correct section of the code, ran the code from beginning to end; Q1 and then Q2. My results are found in the "Indices" folder.

Thanks,
Atem.