[BioC] QuasR on Linux Cluster
Ugo Borello
ugo.borello at inserm.fr
Mon Oct 21 15:09:49 CEST 2013
Thank you Michael,
Sorry, I cannot find the QuasR log file at the moment. In any case, the last
step completed was writing the .sam file; QuasR did not go on to convert the
.sam file to a .bam file.
Attached is some more information on the running job before it died.
It refers to the case where cl <- makeCluster(1).
I ran your test and got:
> library(parallel)
> cl<- makeCluster(detectCores())
> info<- parLapply(cl, seq_along(cl), function(i) Sys.info())
> info
[[1]]
                             sysname                              release
                             "Linux"                 "2.6.18-348.3.1.el5"
                             version                             nodename
"#1 SMP Tue Mar 5 13:19:32 EST 2013"                         "ccwsge0053"
                             machine                                login
                            "x86_64"                            "unknown"
                                user                       effective_user
                          "uborello"                           "uborello"
The output is the same for all 32 nodes.
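Rather than comparing 32 list entries by eye, the check can be summarised. This is only a minimal sketch: it uses 2 workers for illustration (not the 32 from my session), and the variable name nodenames is my own.

```r
library(parallel)

# Small worker count for illustration only; the session above used detectCores().
cl <- makeCluster(2)

# Same test as above: ask every worker for its Sys.info().
info <- parLapply(cl, seq_along(cl), function(i) Sys.info())
stopCluster(cl)

# Pull out the host name each worker reported and count the distinct values.
nodenames <- vapply(info, function(x) x[["nodename"]], character(1))
print(table(nodenames))
```

A single row in the table means all workers run on the same machine, which is what I expect for a cluster created on 'localhost'.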
Then I ran:
> library(parallel)
> type <- if (exists("mcfork", mode="function")) "FORK" else "PSOCK"
> type
[1] "PSOCK"
> cores <- getOption("mc.cores", detectCores())
> cl <- makeCluster(cores, type=type)
> cl
socket cluster with 32 nodes on host 'localhost'
> results <- parLapply(cl, 1:100, sqrt)
> sum(unlist(results))
[1] 671.4629
> stopCluster(cl)
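The sqrt test finishes quickly and needs almost no memory, so it may not reproduce the failure. If the worker process is dying during the alignment (an assumption on my side, not something I have confirmed), its own output could be captured with the outfile argument of makeCluster(). A minimal sketch, where the log path and the toy task are placeholders standing in for the real qAlign(..., clObj=cl) call:

```r
library(parallel)

# Hypothetical log location; any writable path would do.
worker_log <- file.path(tempdir(), "worker_output.txt")

# outfile redirects the workers' stdout/stderr, which is discarded by default.
cl <- makeCluster(1, outfile = worker_log)

res <- tryCatch(
    # Placeholder task; the real job would call qAlign(..., clObj = cl) here.
    parLapply(cl, 1:4, function(i) {
        message("worker task ", i)
        sqrt(i)
    }),
    error = function(e) {
        # Report the failure instead of halting, so the log can still be read.
        message("cluster call failed: ", conditionMessage(e))
        NULL
    },
    finally = stopCluster(cl)
)

# Whatever the worker printed before finishing (or dying) ends up in this file.
cat(readLines(worker_log), sep = "\n")
```

If the worker is killed from outside, the master would still see the "error reading from connection" message, but the log file should at least show how far the worker got.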
I don't know if this could help.
Any suggestions?
Ugo
> From: Michael Stadler <michael.stadler at fmi.ch>
> Date: Mon, 21 Oct 2013 11:30:27 +0200
> To: <bioconductor at r-project.org>
> Subject: Re: [BioC] QuasR on Linux Cluster
>
> Hi Ugo,
>
> On 18.10.2013 13:56, Ugo Borello wrote:
>> Hi all,
>> I am trying to use QuasR on a Linux cluster: 1 machine, multiple cores.
>>
>> I run:
>> library(QuasR)
>> library(BSgenome.Mmusculus.UCSC.mm10)
>>
>> cl <- makeCluster(1)
>>
>> sampleFile <- "sampleFile.txt"
>>
>> genomeName <- "BSgenome.Mmusculus.UCSC.mm10"
>>
>> proj <- qAlign(sampleFile, genome= genomeName, splicedAlignment=TRUE,
>> clObj=cl)
>>
>> And I get
>>> proj <- qAlign(sampleFile, genome= genomeName, splicedAlignment=TRUE,
>> clObj=cl)
>> alignment files missing - need to:
>> create 1 genomic alignment(s)
>> Testing the compute nodes...OK
>> Loading QuasR on the compute nodes...OK
>> Available cores:
>> nodeNames
>> ccwsge0155
>> 1
>> Performing genomic alignments for 1 samples. See progress in the log file:
>> /scratch/4401022.1.huge/QuasR_log_41394115a102.txt
>> Error in unserialize(node$con) : error reading from connection
>> Calls: qAlign ... FUN -> recvData -> recvData.SOCKnode -> unserialize
>> Execution halted
>
> The error that you get is not created within QuasR; my guess is that it
> comes from the "parallel" package, indicating that something goes wrong
> when using your cluster object "cl".
>
> I would suggest testing whether your cluster object works fine. It would
> help to know if the error message appears immediately after you call
> qAlign(), or if it takes some time to process. Also, it would be great
> to see the content of the QuasR log file.
>
> Here is a simple test you could try to check your cluster object/connection:
> parLapply(cl, seq_along(cl), function(i) Sys.info())
>
> As a result, you should get Sys.info() output from each of the cluster
> nodes.
>
>
>>
>> I also tried to modify the multicore option
>>
>> cl <- makeCluster(detectCores())
>>
>> And my job is killed because it uses more memory (Max vmem = 17.118G) than
>> allowed (16G)
> With splicedAlignment=TRUE, QuasR will run spliceMap for aligning your
> reads, which may require several GB of memory per node in your cluster
> object. You can avoid the memory overflow by reducing the number of
> nodes in your cluster object, e.g. by:
>
> cl <- makeCluster(4)
>
> which should run through on your machine with 16GB of memory.
>
> Best,
> Michael
>
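Following Michael's suggestion above, the number of nodes could also be derived from the memory limit instead of being hard-coded. This is only a sketch: the 4 GB per node is an assumed figure (Michael only says "several GB"), and the names mem_limit_gb, mem_per_node_gb and n_nodes are my own.

```r
library(parallel)

mem_limit_gb    <- 16  # memory allowed for the job, as reported in the thread
mem_per_node_gb <- 4   # ASSUMPTION: rough per-node need, not a measured value

# detectCores() may return NA on some platforms; fall back to one core.
cores <- detectCores()
if (is.na(cores)) cores <- 1L

# Use as many nodes as fit in the memory budget, never more than the cores
# available and never fewer than one.
n_nodes <- max(1L, min(cores, as.integer(floor(mem_limit_gb / mem_per_node_gb))))

cl <- makeCluster(n_nodes)
print(length(cl))  # number of workers actually started
stopCluster(cl)
```

With these numbers the result is at most 4 nodes, matching makeCluster(4); the per-node figure would need to be adjusted once the real memory use of a single alignment is known.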