[BioC] QuasR on Linux Cluster

Mon Oct 21 15:09:49 CEST 2013

Thank you Michael,
Sorry, I cannot find the QuasR log file at the moment. In any case, the last
step completed was writing the .sam file; QuasR did not go on to convert the
.sam file to a .bam file.
Attached is some further information on the running job before it died.
It refers to the case where cl <- makeCluster(1).

I ran your test and got:
> library(parallel)
> cl<- makeCluster(detectCores())
> info<- parLapply(cl, seq_along(cl), function(i) Sys.info())
> info
[[1]]
                             sysname                              release
                             "Linux"                 "2.6.18-348.3.1.el5"
                             version                             nodename
"#1 SMP Tue Mar 5 13:19:32 EST 2013"                         "ccwsge0053"
                             machine                                login
                            "x86_64"                            "unknown"
                                user                       effective_user
                          "uborello"                           "uborello"

The output is the same for all 32 nodes.

Then I ran:
> library(parallel)
> type <- if (exists("mcfork", mode="function")) "FORK" else "PSOCK"
> type
[1] "PSOCK"
> cores <- getOption("mc.cores", detectCores())
> cl <- makeCluster(cores, type=type)
> cl
socket cluster with 32 nodes on host 'localhost'
> results <- parLapply(cl, 1:100, sqrt)
> sum(unlist(results))
[1] 671.4629
> stopCluster(cl)

I don't know whether this helps.
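
As a possible next step, here is a minimal sketch (not yet tried on this
cluster) that repeats the node check but sends the worker's stdout/stderr to a
file and catches a lost connection instead of halting. The file name
"worker_output.log" is only a placeholder, and whether the log holds anything
useful when the scheduler kills the worker is an assumption.

```r
library(parallel)

# Placeholder file name (assumption): any writable path would do.
log_file <- "worker_output.log"

# 'outfile' redirects stdout/stderr of the socket workers to a file,
# so messages printed by a worker before it dies are not lost.
cl <- makeCluster(1, outfile = log_file)

# Same node check as above, but an error (e.g. "error reading from
# connection") is reported as a message instead of stopping the script.
info <- tryCatch(
  parLapply(cl, seq_along(cl), function(i) Sys.info()[["nodename"]]),
  error = function(e) {
    message("Worker call failed: ", conditionMessage(e))
    NULL
  }
)

stopCluster(cl)
```

The same cluster object could then be passed to qAlign() through clObj, as in
my original call, and the log file inspected if the job dies again.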

Any suggestions?

Ugo

> From: Michael Stadler <michael.stadler at fmi.ch>
> Date: Mon, 21 Oct 2013 11:30:27 +0200
> To: <bioconductor at r-project.org>
> Subject: Re: [BioC] QuasR on Linux Cluster
> 
> Hi Ugo,
> 
> On 18.10.2013 13:56, Ugo Borello wrote:
>> Hi all,
>> I am trying to use QuasR on a Linux cluster: one machine, multiple cores.
>> 
>> I run:
>> library(QuasR)
>> library(BSgenome.Mmusculus.UCSC.mm10)
>> 
>> cl <- makeCluster(1)
>> 
>> sampleFile <- "sampleFile.txt"
>> 
>> genomeName <- "BSgenome.Mmusculus.UCSC.mm10"
>> 
>> proj <- qAlign(sampleFile, genome= genomeName, splicedAlignment=TRUE,
>> clObj=cl)
>> 
>> And I get
>>> proj <- qAlign(sampleFile, genome= genomeName, splicedAlignment=TRUE,
>> clObj=cl)
>> alignment files missing - need to:
>>     create 1 genomic alignment(s)
>> Testing the compute nodes...OK
>> Loading QuasR on the compute nodes...OK
>> Available cores:
>> nodeNames
>> ccwsge0155
>>          1
>> Performing genomic alignments for 1 samples. See progress in the log file:
>> /scratch/4401022.1.huge/QuasR_log_41394115a102.txt
>> Error in unserialize(node$con) : error reading from connection
>> Calls: qAlign ... FUN -> recvData -> recvData.SOCKnode -> unserialize
>> Execution halted
> 
> The error that you get is not created within QuasR; my guess is that it
> comes from the "parallel" package, indicating that something goes wrong
> when using your cluster object "cl".
> 
> I would suggest testing whether your cluster object works fine. It would
> help to know if the error message appears immediately after you call
> qAlign(), or if it takes some time to process. Also, it would be great
> to see the content of the QuasR log file.
> 
> Here is a simple test you could try to check your cluster object/connection:
> parLapply(cl, seq_along(cl), function(i) Sys.info())
> 
> As a result, you should get Sys.info() output from each of the cluster
> nodes.
> 
> 
>> 
>> I also tried to modify the multicore option:
>> 
>> cl <- makeCluster(detectCores())
>> 
>> And my job is killed because it uses more memory (max vmem = 17.118G) than
>> allowed (16G).
> With splicedAlignment=TRUE, QuasR will run spliceMap for aligning your
> reads, which may require several GB of memory per node in your cluster
> object. You can avoid the memory overflow by reducing the number of
> nodes in your cluster object, e.g. by:
> 
> cl <- makeCluster(4)
> 
> which should run through on your machine with 16GB of memory.
> 
> Best,
> Michael