[R] multicore and mclapply problem in calculation server

Juan Antonio Balbuena j.a.balbuena at uv.es
Wed Dec 18 09:28:41 CET 2013


   Hello,
   I am using the multicore package for parallel computing on an Altix
   UltraViolet 1000 server with 64 CPUs and 960 GB of RAM. Access is managed
   through an SGE queue system. This is the first time I am using parallel
   computing and my experience with supercomputers is quite limited, so any
   help will be much appreciated.
   My experiment consists of a number of runs (N.runs), each involving a number
   of permutations (N.perm). (An excerpt of the code is included below.) The
   permutations are very time consuming, so I am using mclapply to distribute
   the job among a given number of CPUs (usually 12 to 24). The problem is that
   the system administrators notice that the number of threads keeps increasing
   as the program runs, to the point that it compromises the functioning of the
   whole system and they have to abort the job.
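   To make the structure concrete, here is a minimal, self-contained sketch of
   the runs/permutations layout described above. It is not my actual code:
   toy.stat, one.run, n.runs, n.perm and n.cores are hypothetical names, the
   sizes are tiny, and it assumes the parallel package (which ships with R
   since 2.14 and provides the same mclapply) on a Unix-like system.

```r
library(parallel)  # provides mclapply on Unix-like systems

# Hypothetical stand-in for the expensive per-permutation statistic
toy.stat <- function(i) {
  v <- (i * 1:100) %% 7          # deterministic, cheap placeholder work
  c(mean(v), max(v))
}

# One run: distribute n.perm permutations over n.cores forked workers
one.run <- function(n.perm, n.cores = 2) {
  rows <- mclapply(seq_len(n.perm), toy.stat, mc.cores = n.cores)
  do.call(rbind, rows)           # one row per permutation
}

n.runs <- 3
n.perm <- 10
# Each run calls mclapply once, so workers are forked anew in every run
results <- replicate(n.runs, one.run(n.perm), simplify = FALSE)
```

   In this layout every call to mclapply forks its own set of workers and
   waits for them to finish, so the forking is repeated once per run.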
   I have tried to specify RAM limits (using ulimit) and the number of CPUs to
   be used in the bash file sent to the queue, but it doesn't help.
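   As an illustration of limiting the CPU count from the R side as well, here
   is a sketch that derives the number of cores from the queue's slot
   allocation. It assumes the job is submitted to a parallel environment so
   that SGE exports the NSLOTS variable; the helper name cores.from.queue and
   the fallback of 1 are hypothetical choices, not something the queue system
   prescribes.

```r
# Hypothetical helper: number of cores granted by the queue, or a fallback
cores.from.queue <- function(fallback = 1L) {
  # NSLOTS is assumed to be set by SGE; unset or non-numeric values give NA
  slots <- suppressWarnings(as.integer(Sys.getenv("NSLOTS", unset = NA)))
  if (is.na(slots) || slots < 1L) fallback else slots
}

n.cores <- cores.from.queue()
# n.cores would then be passed on, e.g. mclapply(x, wrapper, mc.cores = n.cores)
```

   The point of the sketch is only that mc.cores and the queue request come
   from the same number instead of being hard-coded in two places.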
   An excerpt of the code I am using follows:
   #LOAD LIBRARIES NEEDED:
   library(ape)
   library(phytools)
   library(phangorn)
   library(multicore)
   #
   # ... SOME 100 LINES HERE DEVOTED TO DEFINING FUNCTIONS, OMITTED FOR BREVITY
   #
   # MAIN BODY = 1 RUN; IT PUTS THE FUNCTIONS TOGETHER
   body <- function (N.perm) {
     HP <- HPrandomizer(NH,NP,N.assoc)
     linH = readLines(conH, n= N.perm)
     linP = readLines(conP, n= N.perm)
     stat.matrix <- matrix((rep(NA, 6*N.perm)), ,6)
     #
     # THIS FUNCTION IS SUPPOSED TO BE PARALLELIZED (SEE BELOW)
     wrapper <- function (x) {
       treeH <- read.tree(text=linH[x])
       treeP <- read.tree(text=linP[x])
       mrcaL <- MRCALink.simul (treeH, treeP, HP)
       stat.matrix[x,] <- three.stat(mrcaL)
       }
     # NOTE THAT length(linH) IS SUPPOSED TO BE = N.perm
     x <- seq_along(linH)
     # USE OF MCLAPPLY
     stat.matrix <- do.call(rbind, mclapply(x, wrapper, mc.cores= 6))
     Pstat <- apply(stat.matrix, 2, rank)[1,]/length(linH)
     write(c(stat.matrix[1,], Pstat),
           file = "/scratch/ba/balbuena/30H30P40.txt",
           sep = "\t", append = TRUE, ncolumns = 12)
   }
   ptm <- proc.time()  # THE PROGRAM STARTS HERE
   NH= 30
   NP= 30
   N.assoc= 40
   N.runs = 1000
   N.perm = 999
   # READ TEXT DATA FROM EXTERNAL FILES
   conH = file("/scratch/ba/balbuena/1MH_30.tre", open="rt")
   conP = file("/scratch/ba/balbuena/1MP_30.tre", open="rt")
   replicate (N.runs, body(N.perm))  # LOOPING body NUMBER OF RUNS
   close(conH)
   close(conP)
   proc.time() - ptm
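   One detail of the excerpt above that may be relevant: as far as I
   understand R's scoping, the assignment to stat.matrix inside wrapper only
   modifies a copy local to that function, and what mclapply collects is the
   value the function returns. A toy sketch of that behaviour, with made-up
   names (m, fill.row) and plain lapply standing in for mclapply:

```r
m <- matrix(NA_real_, nrow = 3, ncol = 2)

# Assigns into m inside the function; the value of the assignment
# (the right-hand side) is what the function returns
fill.row <- function(i) {
  m[i, ] <- c(i, i^2)
}

res <- do.call(rbind, lapply(1:3, fill.row))

# The outer m is untouched; the rows only exist in the collected result
outer.unchanged <- all(is.na(m))
```

   So in this toy case the preallocated matrix plays no role, and the result
   comes entirely from the returned rows.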
   Thank you very much for your attention.
   Juan A. Balbuena

   --

   Dr. Juan A. Balbuena
   Marine Zoology Unit
   Cavanilles Institute of Biodiversity and Evolutionary Biology
   University of Valencia
   [1]http://www.uv.es/~balbuena
   P.O. Box 22085
   [2]http://www.uv.es/cavanilles/zoomarin/index.htm
   46071 Valencia, Spain
   [3]http://cetus.uv.es/mullpardb/index.html
   e-mail: [4]j.a.balbuena at uv.es    tel. +34 963 543 658    fax +34 963 543 733
   ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
   NOTE! For shipments by EXPRESS COURIER use the following street address:
   C/ Catedrático José Beltrán 2, 46980 Paterna (Valencia), Spain.
   ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

References

   1. http://www.uv.es/%7Ebalbuena
   2. http://www.uv.es/cavanilles/zoomarin/index.htm
   3. http://cetus.uv.es/mullpardb/index.html
   4. mailto:j.a.balbuena at uv.es

