[R] writeLines + foreach/doMC
Mario Valle
mvalle at cscs.ch
Mon Jul 4 14:55:20 CEST 2011
I have read that in parallel processing, I/O to a shared file should be
done by a single process.
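
A minimal sketch of the single-writer idea, not the original poster's code: the workers only build the output lines, and the parent process alone writes them. It uses mclapply from the base parallel package in place of foreach/doMC so that it runs without extra packages. make_fields() is a hypothetical stand-in for gettable1(), and the ids and column names are made up for illustration.

```r
library(parallel)  # ships with R; mclapply forks worker processes

# Hypothetical stand-in for gettable1(): returns the fields for one sequence id.
make_fields <- function(id) c(id, nchar(id), toupper(id))

ids <- c("seq_a", "seq_bb", "seq_ccc", "seq_dddd")  # toy data (assumption)

# Forking is not available on Windows, so fall back to one core there.
n_cores <- if (.Platform$OS.type == "windows") 1L else 2L

# Workers only compute; each call returns one tab-separated line as a string.
lines <- unlist(mclapply(
  ids,
  function(id) paste(make_fields(id), collapse = "\t"),
  mc.cores = n_cores
))

# Only the parent process touches the file, so lines cannot interleave.
out_file <- tempfile(fileext = ".txt")
con <- file(out_file, open = "wt")
writeLines(paste("ID", "Length", "Upper", sep = "\t"), con)  # header
writeLines(lines, con)
close(con)
```

If the full set of lines does not fit in memory, the same pattern could be applied to one chunk of ids at a time, appending each chunk's lines from the parent process.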
Suggestion: write a separate file from each worker process, then combine
the results with cat or similar.
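
The suggestion above could be sketched as follows, again with mclapply from the base parallel package standing in for foreach/doMC. Each worker opens its own part file, and the parts are concatenated afterwards with file.append() (running cat on the part files from the shell would do the same job). make_fields(), the ids, and the part-file names are assumptions made for illustration.

```r
library(parallel)  # ships with R; mclapply forks worker processes

# Hypothetical stand-in for gettable1(): returns the fields for one sequence id.
make_fields <- function(id) c(id, nchar(id), toupper(id))

ids <- sprintf("seq_%02d", 1:10)  # toy data (assumption)

# Forking is not available on Windows, so fall back to one core there.
n_cores   <- if (.Platform$OS.type == "windows") 1L else 2L
n_workers <- 2L

# Split the ids into one chunk per worker.
chunks <- split(ids, rep_len(seq_len(n_workers), length(ids)))

part_dir <- tempfile("parts_")
dir.create(part_dir)

# Each worker writes to its own file; no connection is shared between processes.
part_files <- unlist(mclapply(seq_along(chunks), function(k) {
  part <- file.path(part_dir, sprintf("part_%03d.txt", k))
  con <- file(part, open = "wt")
  on.exit(close(con))
  for (id in chunks[[k]]) {
    writeLines(paste(make_fields(id), collapse = "\t"), con)
  }
  part  # return the path so the parent knows what to combine
}, mc.cores = n_cores))

# The parent writes the header once, then appends each part file in turn.
out_file <- tempfile(fileext = ".txt")
writeLines(paste("ID", "Length", "Upper", sep = "\t"), out_file)
for (p in part_files) file.append(out_file, p)
```

Note that the combined file follows the chunk order, not the original order of the ids; if the order matters, the chunks would need to be contiguous ranges or the result sorted afterwards.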
Hope it helps
mario
On 04-Jul-11 11:58, Ramzi TEMANNI wrote:
> Hi
> I'm processing sequencing data, trying to collapse the locations of each
> unique sequence and write the results to a file (as storing them in a table
> would require at least 10 GB of memory).
> So I wrote a function that, given a sequence id, provides the line to be
> stored:
> library(doMC)     # load library
> registerDoMC(12)  # set the number of CPUs
>
> fileConn <- file(paste(fq_file, "_SeqID.txt", sep = ""), open = "at")  # open connection
> writeLines(paste("ReadID", "Freq", "Seq", "LOC_UG", "Nb_UG_Seq", sep = "\t"), fileConn)  # write header
> foreach(i = 1:length(uniq.Seq)) %dopar%  # for each unique sequence
> {
>   writeLines(paste(gettable1(uniq.Seq[i]), collapse = " "), fileConn)  # write the results line
> }
> close(fileConn)
>
> The code executes without errors, but some of the output lines are weird.
> The header and many of the lines are OK:
> ReadID Freq Seq LOC_UG Nb_UG_Seq
> HWI-EA332_0036:5:16:9530:21025#ATGC/1 XXXXXXXXXXXXXXXXXXXX 2
> XXXXX_10130:489:+,XXXXX_10130:489:+ 2
> HWI-EA332_0036:5:117:6674:4940#ATGC/1 XXXXXXXXXXXXXXXXXXXX 1
> XXXXX:432:-,XXXXX:432:- 2
> HWI-EA332_0036:5:62:15592:7375#ATGC/1 XXXXXXXXXXXXXXXXXXXX 2
> XXXXX_22660:253:+,XXXXX_22660:253:+ 2
> HWI-EA332_0036:5:110:14349:8422#ATGC/1 XXXXXXXXXXXXXXXXXXXX 4
> XXXXX_13806:399:+,XXXXX_13806:399:+,XXXXX_27263:481:+,XXXXX_27263:481:+ 4
> Others look weird:
> HWI-EA332_0036:5:17:1400ReadID Freq Seq LOC_UG Nb_UG_Seq
> HWI-EA332_0036:5:61:7734:4201ReadID Freq Seq LOC_UG Nb_UG_Seq
> HWI-EA332_0036:5:117:5361:10666#ATGReadID Freq Seq LOC_UG
> Nb_UG_Seq
> HWI-EA332_0036:5:115:7421:20664#ATGC/1 GATCReadID Freq Seq
> LOC_UG Nb_UG_Seq
> HWI-EA332_0036:5:175:95:- 2
> HWI-EA332_0036:5JCVI_35536:444:+ 2
> XXXXXXXXX 1 XXXXX_22484:571:-,XXXXX_22484:571:- 2
>
> Is this because one process starts writing before another has finished?
> Is there a way to solve this problem?
> Any suggestions would be greatly appreciated.
> Thanks and have a nice day.
>
>
> Best,
> Ramzi TEMANNI
> http://www.linkedin.com/in/ramzitemanni
>
> [[alternative HTML version deleted]]
>
> ______________________________________________
> R-help at r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/r-help
> PLEASE do read the posting guide http://www.R-project.org/posting-guide.html
> and provide commented, minimal, self-contained, reproducible code.
--
Ing. Mario Valle
Data Analysis and Visualization Group | http://www.cscs.ch/~mvalle
Swiss National Supercomputing Centre (CSCS) | Tel: +41 (91) 610.82.60
v. Cantonale Galleria 2, 6928 Manno, Switzerland | Fax: +41 (91) 610.82.82