[R] : Quantile and rowMean from multiple files in a folder

Thu Apr 17 07:31:50 CEST 2014

Hi AK,
Thanks very much.
Atem.

On Wednesday, April 16, 2014 9:32 PM, arun <smartpink111 at yahoo.com> wrote:
Hi,
Use this code after `lst2`.
lapply(seq_along(lst2), function(i) {
    lstN <- lapply(lst2[[i]], function(x) {
        datN <- as.data.frame(matrix(NA, nrow = 101, ncol = length(names1), dimnames = list(NULL, 
            names1)))
        x1 <- x[, -1]
        qt <- numcolwise(function(y) quantile(y, seq(0, 1, by = 0.01), na.rm = TRUE))(x1)
        datN[, match(names(x1), names(datN))] <- qt
        datN
    })
    arr1 <- array(unlist(lstN), dim = c(dim(lstN[[1]]), length(lstN)), dimnames = list(NULL, 
        names1))
    res <- rowMeans(arr1, dims = 2, na.rm = TRUE)
    colnames(res) <- gsub(" ", "_", colnames(res))
    res1 <- data.frame(Percentiles = paste0(seq(0, 100, by = 1), "%"), res, stringsAsFactors = FALSE)
    write.csv(res1, paste0(paste(getwd(), "final", paste(names(lst1)[[i]], "Quantile", 
        sep = "_"), sep = "/"), ".csv"), row.names = FALSE, quote = FALSE)
})

ReadOut1 <- lapply(list.files(recursive = TRUE)[grep("Quantile", list.files(recursive = TRUE))], 
    function(x) read.csv(x, header = TRUE, stringsAsFactors = FALSE))
sapply(ReadOut1, function(x) dim(x)) 

lstNew <- simplify2array(ReadOut1)
nrow(lstNew)
#[1] 258
dir.create("Indices")
lapply(2:nrow(lstNew), function(i) {
    dat1 <- data.frame(Percentiles = lstNew[1], do.call(cbind, 
        lstNew[i, ]), stringsAsFactors = FALSE)
    colnames(dat1) <- c("Percentiles", paste(names(lst2), rep(rownames(lstNew)[i], 
        length(lst2)), sep = "_"))
    write.csv(dat1, paste0(paste(getwd(), "Indices", gsub(" ", "_", rownames(lstNew)[i]), 
        sep = "/"), ".csv"), row.names = FALSE, quote = FALSE)
})

## Output2:
ReadOut2 <- lapply(list.files(recursive = TRUE)[grep("Indices", list.files(recursive = TRUE))], 
    function(x) read.csv(x, header = TRUE, stringsAsFactors = FALSE))
names(ReadOut2) <- gsub(".*\\/(.*)\\.csv","\\1",list.files(recursive = TRUE)[grep("Indices", list.files(recursive = TRUE))])
ReadOut2$pint_DJF[1:3,1:3]
#  Percentiles G100_pint_DJF G101_pint_DJF
#1          0%      0.982001      1.020892
#2          1%      1.005563      1.039288
#3          2%      1.029126      1.057685
any(is.na(ReadOut2$pint_DJF))
 [1] FALSE 
A.K.

On Wednesday, April 16, 2014 12:34 PM, Zilefac Elvis <zilefacelvis at yahoo.com> wrote:
Hi AK,
I tried the updated "Quantilecode.txt". It works well but when I open the files in "Indices", I find some columns filled with NAs. This should not be the case given that I am working with simulations and there are no missing values in the process. The ##not correct section yielded no NAs. Check for example, pint_..._DJF in "Indices".

Let be be sure we are in the same page. I removed the ##not correct section of the code, ran the code from beginning to end; Q1 and then Q2. My results are found in the "Indices" folder.

Thanks,
Atem.