[R-sig-hpc] mclapply: problem writing into a textfile within a loop
Simon Urbanek
simon.urbanek at r-project.org
Wed Jul 11 04:53:43 CEST 2012
On Jul 10, 2012, at 9:28 AM, Mauricio Zambrano-Bigiarini wrote:
> Dear list,
>
> While using the mclapply function provided by the multicore package, I notice that more lines are written than expected when writing the outputs of the simulations into a textfile, AFTER the call to 'mclapply'.
>
> On the other hand, when using makeCluster() and parApply() from the 'parallel' package I do not get any "additional" output.
>
> Below you can find a reproducible example:
>
> --------START--------
> fn <- function(x) {
> n <- length(x)
> return(1 + (1/4000) * sum(x^2) - prod(cos(x/sqrt(seq(1:n)))))
> }
> fn1 <- function(i, x) fn(x[i,])
>
> nr <- 50 ; X <- matrix(rnorm(1000), ncol=50, nrow=nr)
>
> #######################
> # multicore: mclapply #
> fname <- paste("~/logfile_multicore.txt", sep="")
> TextFile <- file(fname , "w+")
>
> for (iter in 1:3) {
> library(multicore)
> set.seed(100)
> unlist(multicore::mclapply(1:nr, FUN=fn1, x=X, mc.cores=6))
> for (i in 1:2) {
> writeLines(c("iter:", as.character(iter), " ; i:", as.character(i) ), TextFile, sep=" ")
> writeLines("", TextFile)
> } # FOR i end
> } # FOR iter end
> close(TextFile)
>
> # output:
> #iter: 1 ; i: 1 # it should not be here
> #iter: 1 ; i: 2 # it should not be here
> #iter: 1 ; i: 1 # it should not be here
> #iter: 1 ; i: 2 # it should not be here
> #iter: 1 ; i: 1 # it should not be here
> #iter: 1 ; i: 2 # it should not be here
> #iter: 2 ; i: 1 # it should not be here
> #iter: 2 ; i: 2 # it should not be here
> #iter: 1 ; i: 1
> #iter: 1 ; i: 2
> #iter: 2 ; i: 1
> #iter: 2 ; i: 2
> #iter: 3 ; i: 1
> #iter: 3 ; i: 2
>
> ############
> # parallel #
> fname <- paste("~/logfile_parallel.txt", sep="")
> TextFile <- file(fname , "w+")
>
> for (iter in 1:3) {
> cl <- parallel::makeCluster(6)
> set.seed(100)
> parApply(cl=cl,X,1,fn)
> stopCluster(cl)
> for (i in 1:2) {
> writeLines(c("iter:", as.character(iter), " ; i:", as.character(i) ), TextFile, sep=" ")
> writeLines("", TextFile)
> } # FOR i end
> } # FOR iter end
> close(TextFile)
>
> # output:
> #iter: 1 ; i: 1
> #iter: 1 ; i: 2
> #iter: 2 ; i: 1
> #iter: 2 ; i: 2
> #iter: 3 ; i: 1
> #iter: 3 ; i: 2
>
> -----------END-----------------
>
>
> The same results are obtained if the call to 'multicore::mclapply' is replaced by 'parallel::mclapply'.
>
>
> > sessionInfo()
> R version 2.15.0 (2012-03-30)
> Platform: x86_64-redhat-linux-gnu (64-bit)
>
> locale:
> [1] LC_CTYPE=en_GB.utf8 LC_NUMERIC=C
> [3] LC_TIME=en_GB.utf8 LC_COLLATE=en_GB.utf8
> [5] LC_MONETARY=en_GB.utf8 LC_MESSAGES=en_GB.utf8
> [7] LC_PAPER=C LC_NAME=C
> [9] LC_ADDRESS=C LC_TELEPHONE=C
> [11] LC_MEASUREMENT=en_GB.utf8 LC_IDENTIFICATION=C
>
> attached base packages:
> [1] parallel splines stats graphics grDevices utils datasets
> [8] methods base
>
> other attached packages:
> [1] multicore_0.1-7
>
>
>
> Do you know if there is any way to avoid writing the additional lines after calling mclapply?
>
Simply add
flush(TextFile)
after the second writeLines(). You are forking the processes without flushing the buffers, so each child process inherits a copy of the unflushed output and writes it to the file when it exits, which produces one extra copy per process. Using makeCluster() doesn't have that effect since it creates new, independent processes.
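
A minimal sketch of where the flush() call goes in the loop from the example. It assumes parallel::mclapply instead of the multicore package, a temporary file instead of ~/logfile_multicore.txt, and mc.cores=2; those choices are only for illustration and are not part of the original script. Forking is not available on Windows, so this is meant for Unix-alikes.

```r
library(parallel)  # provides mclapply(); relies on fork(), so Unix-alikes only

fn <- function(x) {
  n <- length(x)
  1 + (1/4000) * sum(x^2) - prod(cos(x / sqrt(seq_len(n))))
}
fn1 <- function(i, x) fn(x[i, ])

nr <- 50
X  <- matrix(rnorm(nr * 50), nrow = nr, ncol = 50)

# Assumption: a temporary log file, so the sketch does not touch the home directory
fname    <- tempfile(fileext = ".txt")
TextFile <- file(fname, "w+")

for (iter in 1:3) {
  res <- unlist(mclapply(1:nr, FUN = fn1, x = X, mc.cores = 2))
  for (i in 1:2) {
    writeLines(c("iter:", as.character(iter), " ; i:", as.character(i)),
               TextFile, sep = " ")
    writeLines("", TextFile)
    # Empty the connection's buffer now, so the children forked by the
    # next mclapply() call have no pending output to inherit
    flush(TextFile)
  }
}
close(TextFile)

# Read the log back to count the lines that were actually written
log_lines <- readLines(fname)
```

With the buffer emptied after every line, log_lines should hold exactly the six lines written by the parent process (two per iteration).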
Cheers,
Simon
> Thanks in advance,
>
> Mauricio Zambrano-Bigiarini
>
> --
> ====================================================
> Water Resources Unit
> Institute for Environment and Sustainability (IES)
> Joint Research Centre (JRC), European Commission
> webinfo : http://floods.jrc.ec.europa.eu/
> ====================================================
>