[R] : Quantile and rowMean from multiple files in a folder
Zilefac Elvis
zilefacelvis at yahoo.com
Thu Apr 17 07:31:50 CEST 2014
Hi AK,
Thanks very much.
Atem.
On Wednesday, April 16, 2014 9:32 PM, arun <smartpink111 at yahoo.com> wrote:
Hi,
Use this code after `lst2`.
library(plyr)  # provides numcolwise()

lapply(seq_along(lst2), function(i) {
  # For each data frame in lst2[[i]]: 101 percentiles (0% to 100%) per column
  lstN <- lapply(lst2[[i]], function(x) {
    # Template holding every column in names1; columns absent from x stay NA
    datN <- as.data.frame(matrix(NA, nrow = 101, ncol = length(names1),
                                 dimnames = list(NULL, names1)))
    x1 <- x[, -1]  # drop the first column
    qt <- numcolwise(function(y) quantile(y, seq(0, 1, by = 0.01), na.rm = TRUE))(x1)
    datN[, match(names(x1), names(datN))] <- qt
    datN
  })
  # Stack the tables into a 3-D array and average element-wise across them
  arr1 <- array(unlist(lstN), dim = c(dim(lstN[[1]]), length(lstN)),
                dimnames = list(NULL, names1))
  res <- rowMeans(arr1, dims = 2, na.rm = TRUE)
  colnames(res) <- gsub(" ", "_", colnames(res))
  res1 <- data.frame(Percentiles = paste0(seq(0, 100, by = 1), "%"), res,
                     stringsAsFactors = FALSE)
  # One CSV per element of lst2, written to the "final" folder
  write.csv(res1, paste0(paste(getwd(), "final", paste(names(lst1)[[i]], "Quantile", sep = "_"),
                               sep = "/"), ".csv"), row.names = FALSE, quote = FALSE)
})
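For anyone reading along without the data, the core of the step above (percentiles per column, then an element-wise mean across data frames) can be sketched in base R. This is only an illustration under assumptions: the `sims` list, its column names, and the random values are made up, and `sapply()` stands in for plyr's `numcolwise()`.

```r
set.seed(1)

# Two hypothetical data frames sharing the same two numeric columns
sims <- list(
  data.frame(pint_DJF = runif(50), pint_JJA = runif(50)),
  data.frame(pint_DJF = runif(50), pint_JJA = runif(50))
)
probs <- seq(0, 1, by = 0.01)

# One 101 x 2 matrix of percentiles per data frame
qlist <- lapply(sims, function(d) sapply(d, quantile, probs = probs, na.rm = TRUE))

# Stack into a 101 x 2 x 2 array, then average over the third dimension
arr <- array(unlist(qlist), dim = c(dim(qlist[[1]]), length(qlist)),
             dimnames = list(rownames(qlist[[1]]), colnames(qlist[[1]]), NULL))
res <- rowMeans(arr, dims = 2, na.rm = TRUE)

dim(res)
```

With `dims = 2`, `rowMeans()` treats the first two dimensions as the "rows", so the result keeps the percentile-by-column layout and each cell is the mean over the stacked data frames.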
ReadOut1 <- lapply(list.files(recursive = TRUE)[grep("Quantile", list.files(recursive = TRUE))],
                   function(x) read.csv(x, header = TRUE, stringsAsFactors = FALSE))
sapply(ReadOut1, function(x) dim(x))
lstNew <- simplify2array(ReadOut1)
nrow(lstNew)
#[1] 258
dir.create("Indices")
lapply(2:nrow(lstNew), function(i) {
  dat1 <- data.frame(Percentiles = lstNew[1], do.call(cbind, lstNew[i, ]),
                     stringsAsFactors = FALSE)
  colnames(dat1) <- c("Percentiles", paste(names(lst2), rep(rownames(lstNew)[i], length(lst2)),
                                           sep = "_"))
  write.csv(dat1, paste0(paste(getwd(), "Indices", gsub(" ", "_", rownames(lstNew)[i]),
                               sep = "/"), ".csv"), row.names = FALSE, quote = FALSE)
})
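The reshaping above relies on what `simplify2array()` does with a list of equally shaped data frames: it returns a matrix of list cells with one row per column name and one column per file. A small sketch of that behaviour, assuming two invented data frames `f1` and `f2` and hypothetical list names `G100` and `G101`:

```r
# Two made-up "Quantile" tables with identical layout
f1 <- data.frame(Percentiles = c("0%", "50%", "100%"),
                 pint_DJF = c(1, 2, 3), pint_JJA = c(4, 5, 6),
                 stringsAsFactors = FALSE)
f2 <- data.frame(Percentiles = c("0%", "50%", "100%"),
                 pint_DJF = c(10, 20, 30), pint_JJA = c(40, 50, 60),
                 stringsAsFactors = FALSE)

# Rows are the data-frame columns, columns are the files
m <- simplify2array(list(G100 = f1, G101 = f2))
dim(m)
rownames(m)

# Collect one index across all files: one column per file
a_all <- do.call(cbind, m["pint_DJF", ])
a_all
```

Each row of `m` then corresponds to one index, which is why the loop above writes one CSV per row.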
## Output2:
files2 <- list.files(recursive = TRUE)
files2 <- files2[grep("Indices", files2)]
ReadOut2 <- lapply(files2, function(x) read.csv(x, header = TRUE, stringsAsFactors = FALSE))
names(ReadOut2) <- gsub(".*\\/(.*)\\.csv", "\\1", files2)
ReadOut2$pint_DJF[1:3,1:3]
#  Percentiles G100_pint_DJF G101_pint_DJF
#1          0%      0.982001      1.020892
#2          1%      1.005563      1.039288
#3          2%      1.029126      1.057685
any(is.na(ReadOut2$pint_DJF))
#[1] FALSE
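If NA columns show up again, one quick way to locate them is to list, for each file, the columns that contain any NA. A minimal sketch, where `out` is a made-up stand-in for `ReadOut2` (the values and the `pint_JJA` entry are invented for illustration):

```r
# Hypothetical stand-in for ReadOut2: a named list of data frames
out <- list(
  pint_DJF = data.frame(Percentiles = c("0%", "1%"),
                        G100_pint_DJF = c(0.98, 1.01),
                        G101_pint_DJF = c(1.02, 1.04),
                        stringsAsFactors = FALSE),
  pint_JJA = data.frame(Percentiles = c("0%", "1%"),
                        G100_pint_JJA = c(NA, NA),
                        G101_pint_JJA = c(0.5, 0.6),
                        stringsAsFactors = FALSE)
)

# Names of the columns holding at least one NA, per file
na_cols <- lapply(out, function(d) names(d)[colSums(is.na(d)) > 0])

# Keep only the files that actually have such columns
na_cols <- na_cols[lengths(na_cols) > 0]
na_cols
```

An empty result would mean no file contains NA values.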
A.K.
On Wednesday, April 16, 2014 12:34 PM, Zilefac Elvis <zilefacelvis at yahoo.com> wrote:
Hi AK,
I tried the updated "Quantilecode.txt". It works well but when I open the files in "Indices", I find some columns filled with NAs. This should not be the case given that I am working with simulations and there are no missing values in the process. The ##not correct section yielded no NAs. Check for example, pint_..._DJF in "Indices".
Let me be sure we are on the same page. I removed the ##not correct section of the code and ran the code from beginning to end: Q1 and then Q2. My results are in the "Indices" folder.
Thanks,
Atem.