[R] interaction between clusterMap(), read.csv() and try() - try does not catch error

Strunk, Jacob (DNR) Jacob.Strunk at dnr.wa.gov
Mon Aug 8 22:06:51 CEST 2016


Ok - got it, I can handle that. Thank you Luke!


Jacob L Strunk
_______________________________________
From: luke-tierney at uiowa.edu [luke-tierney at uiowa.edu]
Sent: Monday, August 08, 2016 12:17 PM
To: Strunk, Jacob (DNR)
Cc: r-help at r-project.org
Subject: Re: [R] interaction between clusterMap(), read.csv() and try() - try does not catch error

try is working fine. The problem is that your remote function is
returning the try-error result, which the parallel infrastructure is
interpreting as an error on the remote node, since the remote calling
infrastructure is using try as well. This could be implemented more
robustly, but it would probably be better in any case for your code to
use tryCatch and have the error function return something easier to
work with, like NULL.
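
A minimal sketch of that suggestion, in case it helps. The helper name
safe_read_csv and the temporary files are made up for illustration and
stand in for the files from the original post; the cluster is created
with default arguments. Because the worker function returns NULL rather
than a try-error object, the parallel code should have nothing to
mistake for a remote failure.

```r
library(parallel)

# Hypothetical helper (not from the original post): return NULL when
# read.csv() signals an error, instead of a "try-error" object.
safe_read_csv <- function(path) {
  tryCatch(utils::read.csv(path), error = function(e) NULL)
}

# Temporary files stand in for the empty and the non-empty csv file.
bad_csv  <- tempfile(fileext = ".csv")
good_csv <- tempfile(fileext = ".csv")
file.create(bad_csv)
write.csv(data.frame(x = 2), good_csv, row.names = FALSE)

clus <- makeCluster(1)
res <- clusterMap(clus, safe_read_csv, c(bad_csv, good_csv),
                  SIMPLIFY = FALSE)
stopCluster(clus)

# NULL entries mark the files that could not be read.
failed <- vapply(res, is.null, logical(1))
```

Afterwards res[!failed] holds only the data frames that were read
successfully, and names(res)[failed] lists the files that were not.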

Best,

luke

On Mon, 8 Aug 2016, Strunk, Jacob (DNR) wrote:

> Hello, I am attempting to process a list of csv files in parallel, some of which may be empty and fail with read.csv. I tend to use clusterMap as my go-to parallel function but have run into an interesting behavior: try(read.csv(x)) does not catch the read error caused by an empty csv file when it is called inside clusterMap. I have not tested this with other functions (e.g. read.table, mean, etc.). The parLapply function does, it appears, correctly catch the errors. Any suggestions on how I should code with clusterMap such that try is guaranteed to catch the error?
>
>
> I am working on Windows Server 2012
> I have the latest version of R and parallel
> I am executing the code from within the RStudio IDE, version 0.99.896
>
> Here is a demonstration of the failure
>
> R code used in demonstration:
> #prepare csv files - an empty file and a file with data
> close(file("c:/temp/badcsv.csv",open="w"))
> write.table(data.frame(x=2),"c:/temp/goodcsv.csv")
>
> #prepare a parallel cluster
> clus0=makeCluster(1, rscript_args = "--no-site-file")
>
> #read good / bad files in parallel with parLapply - which succeeds: try Does catch err
> x1=parLapply(clus0,c("c:/temp/badcsv.csv","c:/temp/goodcsv.csv"),function(...)try(read.csv(...)))
> print(x1)
>
> #read good / bad files in parallel with clusterMap - which fails: try does Not catch error
> x0=clusterMap(clus0,function(...)try(read.csv(...)),c("c:/temp/badcsv.csv","c:/temp/goodcsv.csv"),SIMPLIFY=F)
> print(x0)
>
> R output:
>
>> #prepare csv files - an empty file and a file with data
>> close(file("c:/temp/badcsv.csv",open="w"))
>> write.table(data.frame(x=2),"c:/temp/goodcsv.csv")
>>
>> #prepare a parallel cluster
>> clus0=makeCluster(1, rscript_args = "--no-site-file")
>>
>> #read good / bad files in parallel with parLapply - which succeeds: try Does catch err
>> x1=parLapply(clus0,c("c:/temp/badcsv.csv","c:/temp/goodcsv.csv"),function(...)try(read.csv(...)))
>> print(x1)
> [[1]]
> [1] "Error in read.table(file = file, header = header, sep = sep, quote = quote,  : \n  no lines available in input\n"
> attr(,"class")
> [1] "try-error"
> attr(,"condition")
> <simpleError in read.table(file = file, header = header, sep = sep, quote = quote,     dec = dec, fill = fill, comment.char = comment.char, ...): no lines available in input>
>
> [[2]]
>    x
> 1 1 2
>
>>
>> #read good / bad files in parallel with clusterMap - which fails: try does Not catch error
>> x0=clusterMap(clus0,function(...)try(read.csv(...)),c("c:/temp/badcsv.csv","c:/temp/goodcsv.csv"),SIMPLIFY=F)
> Error in checkForRemoteErrors(val) :
>  one node produced an error: Error in read.table(file = file, header = header, sep = sep, quote = quote,  :
>  no lines available in input
>> print(x0)
> Error in print(x0) : object 'x0' not found
>>
>
>
> Thanks for any help,
> Jacob
>
>
>

--
Luke Tierney
Ralph E. Wareham Professor of Mathematical Sciences
University of Iowa                  Phone:             319-335-3386
Department of Statistics and        Fax:               319-335-3017
    Actuarial Science
241 Schaeffer Hall                  email:   luke-tierney at uiowa.edu
Iowa City, IA 52242                 WWW:  http://www.stat.uiowa.edu


