[Rd] mccollect with NULL in R 3.6

Tomas Kalibera tom@@@k@||ber@ @end|ng |rom gm@||@com
Fri May 3 16:34:24 CEST 2019


On 5/3/19 3:04 PM, Gergely Daróczi wrote:
> On Thu, May 2, 2019 at 7:24 PM Tomas Kalibera <tomas.kalibera using gmail.com> wrote:
>> On 5/1/19 12:25 AM, Gergely Daróczi wrote:
>>> Dear All,
>>>
>>> I'm running into issues with calling mccollect on a list containing NULL
>>> using R 3.6 (this used to work in 3.5.3):
>>>
>>> jobs <- lapply(
>>>       list(NULL, 'foobar'),
>>>       function(x) mcparallel(identity(x)))
>>> mccollect(jobs, wait = FALSE, timeout = 0)
>>> #> Error in names(res) <- pnames[match(s, pids)] :
>>> #>   'names' attribute [2] must be the same length as the vector [1]
>>>
>>> Note, setting a "name" for the jobs does not help, but the above works with
>>> "wait=TRUE", and also if I change the order of NULL and "foobar", although
>>> in that case, the second value (NULL) is ommitted.  It also works with
>>> mclapply fine.
>>>
>>> Any ideas/suggestion on how to get mccollect work with the above example?.
>> NULL is not a valid job identification. Perhaps mccollect() could give a
>> clearer error message, but I don't see, given its documentation, what
>> else than throwing an error it should do. What is the problem you were
>> trying to solve?
> Thank you very much for looking into this!
>
> What was interesting to me is that it used to work before 3.6 -- I
> have a script iterating over a list of data frames to train models,
> but it started to fail with the 3.6 release.
>
> The "NULL is not a valid job identification" problem doesn't seem to
> stand for my production job, as each list element has a proper name,
> but I think I can reproduce this with this minimal example as well:
>
> library(parallel)
> jobs <- lapply(1:2, function(x) {
>      mcparallel(if (x == 1) NULL else x, name = as.character(x))
> })
> mccollect(jobs, wait = FALSE, timeout = 2)
> #> Error in names(res) <- pnames[match(s, pids)] :
> #>   'names' attribute [1] must be the same length as the vector [0]
>
> So the jobs have proper name, but the NULL return value is causing
> problems. Note, that it only causes problems when the NULL value is
> the first, eg switching 1 and 2 works, also running this on 1:3 and
> returning NULL on 2 etc.
Thanks for the clarification, I got confused by that you wrote about 
"calling mccollect on a list containing NULL", so I was answering a 
different question. Now I see you actually meant collecting results from 
jobs that return NULL, and even though that writing jobs that return 
NULL is incorrect, I see that there is really a bug that causes the R error.
> Now, I'm aware that 7 months ago this was added to the docs at
> https://github.com/wch/r-source/commit/f0d15be765dcf92e2349429428d49cd5b212abb4
> that NULL should not be returned, so it seems to be a user error on my
> end, but it seems to fail only when NULL is the first returned element
> in mccollect, and working OK eg when NULL is the 2nd or other element
> (although with side effects, eg missed elements).
NULL has been reserved even well before that documentation update, and 
if it somewhat worked as a result, it had been a coincidence. The 
documentation change was only to make it more obvious, but it already 
said "‘mccollect’ will return ‘NULL’ for a terminating job that has sent 
its results already after which the job is no longer available."
> So maybe failing with an explicit error message whenever mccollect
> hits a NULL for consistency might help here instead of skipping
> "delivered.result <- delivered.result + 1L" when the returned value is
> not raw at https://github.com/wch/r-source/commit/f0d15be765dcf92e2349429428d49cd5b212abb4#diff-e634fbaed323aac88667e7826865b160R72
> ? Or even better (at least for my use case), maybe allowing to return
> NULL and throwing the warning on line 108 in that case.

Thanks for the report, I've fixed mccollect() in R-devel to return NULL 
from the job. To be ported to R-pathed unless problems appear.

Best
Tomas

>
> Thanks for considering this,
> Gergely

>
>> Best
>> Tomas
>>
>>> Thanks,
>>> Gergely
>>>
>>>        [[alternative HTML version deleted]]
>>>
>>> ______________________________________________
>>> R-devel using r-project.org mailing list
>>> https://stat.ethz.ch/mailman/listinfo/r-devel
>>



More information about the R-devel mailing list