[R-sig-hpc] mclapply: problem writing into a textfile within a loop

Mauricio Zambrano-Bigiarini mauricio.zambrano at jrc.ec.europa.eu
Wed Jul 11 09:07:34 CEST 2012


On 11/07/12 04:53, Simon Urbanek wrote:
>
> On Jul 10, 2012, at 9:28 AM, Mauricio Zambrano-Bigiarini wrote:
>
>> Dear list,
>>
>> While using the mclapply function provided by the multicore package, I noticed that more lines than expected are written when I write the simulation outputs to a text file AFTER the call to 'mclapply'.
>>
>> On the other hand, when using a cluster from the 'parallel' package, I do not get any "additional" output.
>>
>> Below you can find a reproducible example:
>>
>> --------START--------
>> fn<- function(x) {
>>   n<- length(x)
>>   return(1 + (1/4000) * sum(x^2) - prod(cos(x/sqrt(seq_len(n)))))
>> }
>> fn1<- function(i, x) fn(x[i,])
>>
>> nr<- 50 ;  X<- matrix(rnorm(nr*50), ncol=50, nrow=nr)
>>
>> #######################
>> # multicore: mclapply #
>> fname<- paste("~/logfile_multicore.txt", sep="")
>> TextFile<- file(fname , "w+")
>>
>> for (iter in 1:3) {
>>    library(multicore)
>>    set.seed(100)
>>    unlist(multicore::mclapply(1:nr, FUN=fn1, x=X, mc.cores=6))
>>    for (i in 1:2) {
>>     writeLines(c("iter:", as.character(iter), "   ;    i:", as.character(i) ), TextFile, sep="  ")
>>     writeLines("", TextFile)
>>    } # FOR i end
>>   } # FOR iter end
>>   close(TextFile)
>>
>> #  output:
>> #iter:  1     ;    i:  1  # it should not be here
>> #iter:  1     ;    i:  2  # it should not be here
>> #iter:  1     ;    i:  1  # it should not be here
>> #iter:  1     ;    i:  2  # it should not be here
>> #iter:  1     ;    i:  1  # it should not be here
>> #iter:  1     ;    i:  2  # it should not be here
>> #iter:  2     ;    i:  1  # it should not be here
>> #iter:  2     ;    i:  2  # it should not be here
>> #iter:  1     ;    i:  1
>> #iter:  1     ;    i:  2
>> #iter:  2     ;    i:  1
>> #iter:  2     ;    i:  2
>> #iter:  3     ;    i:  1
>> #iter:  3     ;    i:  2
>>
>> ############
>> # parallel #
>> fname<- paste("~/logfile_parallel.txt", sep="")
>> TextFile<- file(fname , "w+")
>>
>> for (iter in 1:3) {
>>    cl<- parallel::makeCluster(6)
>>    set.seed(100)
>>    parApply(cl=cl,X,1,fn)
>>    stopCluster(cl)
>>    for (i in 1:2) {
>>      writeLines(c("iter:", as.character(iter), "   ;    i:", as.character(i) ), TextFile, sep="  ")
>>      writeLines("", TextFile)
>>    } # FOR i end
>>   } # FOR iter end
>>   close(TextFile)
>>
>> #  output:
>> #iter:  1     ;    i:  1
>> #iter:  1     ;    i:  2
>> #iter:  2     ;    i:  1
>> #iter:  2     ;    i:  2
>> #iter:  3     ;    i:  1
>> #iter:  3     ;    i:  2
>>
>> -----------END-----------------
>>
>>
>> The same results are obtained if the call to 'multicore::mclapply' is replaced by 'parallel::mclapply'.
>>
>>
>>> sessionInfo()
>> R version 2.15.0 (2012-03-30)
>> Platform: x86_64-redhat-linux-gnu (64-bit)
>>
>> locale:
>> [1] LC_CTYPE=en_GB.utf8       LC_NUMERIC=C
>> [3] LC_TIME=en_GB.utf8        LC_COLLATE=en_GB.utf8
>> [5] LC_MONETARY=en_GB.utf8    LC_MESSAGES=en_GB.utf8
>> [7] LC_PAPER=C                LC_NAME=C
>> [9] LC_ADDRESS=C              LC_TELEPHONE=C
>> [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C
>>
>> attached base packages:
>> [1] parallel  splines   stats     graphics  grDevices utils     datasets
>> [8] methods   base
>>
>> other attached packages:
>> [1] multicore_0.1-7
>>
>>
>>
>> Do you know if there is any way to avoid the additional lines being written after calling mclapply?
>>
>
> Simply add
> flush(TextFile)
> after the second writeLines(). Since you're forking the processes without flushing the buffers, the buffers get flushed as the processes exit, so each process writes its own copy of the unflushed output. Obviously, using makeCluster() doesn't have that effect since it creates new, independent processes.

I didn't know I had to flush the buffers when using forking.
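
For reference, the suggested change can be sketched as follows. This is only a minimal sketch under a few assumptions that are not in the original example: it uses parallel::mclapply instead of multicore, 2 cores instead of 6, and a temporary file (the name prefix "logfile_flush_" is hypothetical) instead of a file in the home directory. Forking is not available on Windows, so this is meant for Unix-like systems.

```r
library(parallel)  # provides mclapply(); relies on forking (Unix-like systems)

fn <- function(x) {
  n <- length(x)
  1 + (1/4000) * sum(x^2) - prod(cos(x / sqrt(seq_len(n))))
}
fn1 <- function(i, x) fn(x[i, ])

nr <- 50
X  <- matrix(rnorm(nr * 50), nrow = nr, ncol = 50)

# Assumption: log to a temporary file rather than the home directory
fname    <- tempfile("logfile_flush_", fileext = ".txt")
TextFile <- file(fname, "w+")

for (iter in 1:3) {
  res <- unlist(mclapply(1:nr, FUN = fn1, x = X, mc.cores = 2))
  for (i in 1:2) {
    writeLines(c("iter:", as.character(iter), "   ;    i:", as.character(i)),
               TextFile, sep = "  ")
    writeLines("", TextFile)
    # Empty the connection's buffer so that processes forked in the next
    # iteration do not inherit (and later write out) pending output
    flush(TextFile)
  }
}
close(TextFile)

logged <- readLines(fname)
print(logged)  # one line per (iter, i) combination
```

With the flush() in place, the buffer is already empty whenever mclapply forks, so the child processes have nothing to write to the file when they exit.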

Thank you very much Simon.

All the best,

Mauricio

-- 
====================================================
Water Resources Unit
Institute for Environment and Sustainability (IES)
Joint Research Centre (JRC), European Commission
webinfo    : http://floods.jrc.ec.europa.eu/
====================================================
DISCLAIMER:
"The views expressed are purely those of the writer
and may not in any circumstances be regarded as
stating an official position of the European Commission"
====================================================
Linux user #454569 -- Ubuntu user #17469
====================================================
"If you torture any data set long enough,
it will confess anything!" (Murray Lark)

>
> Cheers,
> Simon
>
>
>> Thanks in advance,
>>
>> Mauricio Zambrano-Bigiarini
>>
>> _______________________________________________
>> R-sig-hpc mailing list
>> R-sig-hpc at r-project.org
>> https://stat.ethz.ch/mailman/listinfo/r-sig-hpc
>>
>>
>
>


