[R] Increasing computation time per column using lapply
Henning Redestig
redestig at mpimp-golm.mpg.de
Mon Oct 18 15:51:19 CEST 2004
Hi,
I would be very glad for help with this problem. I am using this code:
temp <- function(x, bins, tot) {
    ## split one column into bins and test each bin against the reference
    return(as.numeric(lapply(split(x, bins), wtest, tot)))
}
wtest <- function(x, y) {
    ## Wilcoxon rank sum test p-value for one bin against the reference
    return(wilcox.test(x, y)$p.value)
}
rs <- function(x, bins) {
    binCount <- length(split(x[, 1], bins))
    tot <- as.numeric(x)    ## the whole matrix as one reference vector
    ## one column of p-values per column of x
    result <- matrix(apply(x, 2, temp, bins, tot),
                     nrow = binCount, byrow = FALSE)
    rownames(result) <- names(split(x[, 1], bins))
    colnames(result) <- colnames(x)
    return(result)
}
where x is a matrix and bins is a grouping vector that splits every column of x into groups,
>rs(x, bins)
takes ~100 s to execute when x has 22000 rows and 2 columns and bins splits each column into 226 groups of similar length. That is acceptable, but if I instead increase to 3 columns it takes ~300 s, and with 50 columns it takes more than 13 h to execute. I cannot understand why the execution time does not increase linearly with the number of columns. Memory usage is fine and the machine never needs to swap.
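For reference, here is a scaled-down toy example of how I call rs (the data here are random and much smaller than my real 22000-row, 226-bin matrix, just to make the call reproducible):

## scaled-down toy inputs (random data; the real matrix has 22000 rows)
set.seed(1)
n <- 2200
x <- matrix(rnorm(n * 5), nrow = n,
            dimnames = list(NULL, paste("col", 1:5, sep = "")))
bins <- factor(rep(1:23, length.out = n))    ## 23 groups of similar size
res <- rs(x[, 1:2], bins)
dim(res)    ## 23 rows (one per bin) by 2 columns of p-values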
I tried removing the temp function and iterating over the columns with a for-loop instead of apply, but that did not solve the problem.
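In case anyone wants to reproduce the scaling on the toy data above, this is an ad-hoc timing loop (just a wrapper around system.time, nothing more) showing how the cost grows with the number of columns:

## time rs for an increasing number of columns of the toy matrix
for (k in 2:5) {
    et <- system.time(rs(x[, 1:k], bins))[3]    ## third element = elapsed seconds
    cat(k, "columns:", et, "s\n")
}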
Thanks!
/Henning, redestig at mpimp-golm.mpg.de