[R] Slow function

Marc marc.moragues at gmail.com
Tue Jun 10 18:11:24 CEST 2008


Hi Jim,

This is genotype data for 170 samples. I selected subsets of SNPs 
optimized for different types of germplasm, so each matrix has 170 
rows and 1536, 384 or 96 columns of binary data (0, 1). I have 14 
such matrices in a list.

x <- list()
for (i in 1:14) {
  set.seed(i)
  x[[i]] <- matrix(sample(rep(c(1, 0), 1000000), 1536 * 170),
                   nrow = 170, ncol = 1536)
}
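
Because every entry is 0 or 1, a simple matching dissimilarity between samples can be computed from two matrix products instead of a row-by-row loop. The sketch below is only an illustration under that assumption: `simple_matching_dist` is a hypothetical helper, it is not the `distance()` function from analogue, and whether it is an acceptable stand-in for the "mixed" metric has not been checked here.

```r
# Hypothetical helper: simple matching dissimilarity for a 0/1 matrix
# (rows = samples, columns = SNPs). Not part of analogue or ade4.
simple_matching_dist <- function(x) {
  p <- ncol(x)
  # tcrossprod(x) counts shared 1s, tcrossprod(1 - x) counts shared 0s
  matches <- tcrossprod(x) + tcrossprod(1 - x)
  as.dist(1 - matches / p)
}

# Small simulated example (sizes chosen only to keep the demo quick)
set.seed(1)
g <- matrix(sample(c(0, 1), 20 * 50, replace = TRUE), nrow = 20, ncol = 50)
d_fast <- simple_matching_dist(g)

# Cross-check against base R: for 0/1 data the Manhattan distance divided
# by the number of columns is the proportion of mismatching SNPs
d_base <- dist(g, method = "manhattan") / ncol(g)
all.equal(as.vector(d_fast), as.vector(d_base))
```

The cross-check against `dist()` is there so the shortcut can be verified on a small matrix before trusting it on the full 170 x 1536 data.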

Thanks,
Marc.

jim holtman wrote:
> I have no idea what your data looks like, so I used random numbers 
> and ran only nr=1; I stopped it after about a minute.  Here is 
> what Rprof showed:
>  
> /cygdrive/c/perf: perl c:/perf/bin/readRprof.pl Rprof.out 1
>   0  75.8 root
>   1.   75.7 sapply
>   2. .   75.7 lapply
>   3. . .   75.7 FUN
>   4. . . .   75.6 as.dist
>   5. . . . .   75.6 distance
>   6. . . . . .   75.6 distance.default
>   7. . . . . . .   75.4 apply
>   8. . . . . . . .   73.8 FUN
>   9. . . . . . . . .   73.8 switch
>  10. . . . . . . . . .   73.8 apply
>  11. . . . . . . . . . .   63.4 FUN
>  12. . . . . . . . . . . .    6.6 !
>  12. . . . . . . . . . . .    2.8 -
>  12. . . . . . . . . . . .    2.5 any
>  12. . . . . . . . . . . .    2.2 /
>  12. . . . . . . . . . . .    1.7 sum
>  12. . . . . . . . . . . .    1.6 *
>  11. . . . . . . . . . .    2.3 aperm
>  11. . . . . . . . . . .    1.0 unlist
>   8. . . . . . . .    1.5 join
>  
> This says almost all the time is in the 'distance' function.  Try 
> running your data with 'nr' very small and see what happens.
>
> On Tue, Jun 10, 2008 at 4:49 AM, Marc <marc.moragues at gmail.com> wrote:
>
>     Hi,
>
>     I have the following function that I want to apply to a list of 14
>     matrices (1536 x 170) of binary data:
>
>     DRes <- function(x, nr = 10000, metric = "mixed", ...) {
>      require(analogue)
>      require(ade4)
>      m <- numeric(nr)  # preallocate instead of growing m
>      for (i in seq_len(nr)) {
>       set.seed(i)
>       # random half of the columns vs. the remaining half
>       x1 <- x[, sample(dimnames(x)[[2]], length(x[1,])/2)]
>       x2 <- x[, !dimnames(x)[[2]] %in% dimnames(x1)[[2]]]
>       d1 <- as.dist(distance(as.data.frame(x1), method = metric))
>       d2 <- as.dist(distance(as.data.frame(x2), method = metric))
>       m[i] <- mantel.rtest(d1, d2, ...)$obs
>      }
>      # summarise once, after the loop
>      res <- list(mean = mean(m), std = sd(m))
>      return(res)
>     }
>     bias.dres <- sapply(bias, DRes)
>
>     I ran this code and after 3 hours it is still running. I am on
>     Windows XP and this is my sessionInfo():
>     > sessionInfo()
>     R version 2.7.0 Patched (2008-05-02 r45580)
>     i386-pc-mingw32
>
>     locale:
>     LC_COLLATE=English_United Kingdom.1252;LC_CTYPE=English_United
>     Kingdom.1252;LC_MONETARY=English_United
>     Kingdom.1252;LC_NUMERIC=C;LC_TIME=English_United Kingdom.1252
>
>     attached base packages:
>     [1] stats     graphics  grDevices utils     datasets  methods   base
>
>     other attached packages:
>     [1] analogue_0.5-1 vegan_1.11-4   ade4_1.4-7
>
>     Any help will be very much appreciated.
>     Marc.
>
>     ______________________________________________
>     R-help at r-project.org mailing list
>     https://stat.ethz.ch/mailman/listinfo/r-help
>     PLEASE do read the posting guide
>     http://www.R-project.org/posting-guide.html
>     and provide commented, minimal, self-contained, reproducible code.
>
>
>
>
> -- 
> Jim Holtman
> Cincinnati, OH
> +1 513 646 9390
>
> What is the problem you are trying to solve?
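
The call tree in Jim's reply came from a Perl post-processor, but base R's `summaryRprof()` gives a comparable table. A minimal sketch of how such a profile can be collected and read, assuming a stand-in workload: `slow_pairwise` is a hypothetical function written only for this demo and is not the `distance()` code from analogue.

```r
# Hypothetical stand-in for an apply-within-apply distance computation;
# it only exists to give the profiler something to measure.
slow_pairwise <- function(x) {
  apply(x, 1, function(a) apply(x, 1, function(b) sum(a != b) / length(a)))
}

set.seed(1)
g <- matrix(sample(c(0, 1), 170 * 384, replace = TRUE), nrow = 170)

prof_file <- tempfile(fileext = ".out")
Rprof(prof_file, interval = 0.01)   # start sampling the call stack
for (k in 1:5) {                    # repeat so the sampler collects enough ticks
  d <- slow_pairwise(g)
}
Rprof(NULL)                         # stop profiling

# by.total lists functions by time spent in them and in their callees
prof <- summaryRprof(prof_file)
head(prof$by.total)
```

Running the real function with a very small `nr`, as Jim suggests, inside the same `Rprof()` / `Rprof(NULL)` pair should show whether `distance` dominates on the actual data too.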


