[R] Reproducibility Between Local and Remote Computer with R

stephen sefick ssefick using gmail.com
Sun Aug 9 15:42:38 CEST 2020


Hi Kevin,

I think Abby has already pointed toward what I suspect is the underlying
problem: environment setup.

Some possible solutions:
The renv and packrat packages let you pin package versions, which helps
with reproducibility. Anaconda might solve both the R version and the
package version problem, if it is installed on your HPC. Docker could work
as well (maybe the best option, if installed). There are other workarounds,
but I would have to know how your particular HPC/compute environment is set
up to comment further.
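
As a rough sketch of the renv route (assuming renv is installed on both
machines and the project directory, including its renv.lock file, is copied
to the remote machine), the workflow looks something like this. renv records
the R version in the lockfile but does not install R itself, so the R
version still has to be matched separately.

```r
# Sketch only; assumes the renv package is installed on both machines.

# On the local machine, from inside the project directory:
renv::init()      # set up a project-local library (first time only)
renv::snapshot()  # record the package versions in use in renv.lock

# On the remote machine, after copying the project with its renv.lock:
renv::restore()   # install the package versions recorded in renv.lock
```

If the batch job is started with Rscript from the project directory, the
project's .Rprofile should activate renv automatically, unless R is started
with --vanilla or --no-init-file.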

Brass tacks:
I think you need to ensure that all your versions (R itself and the add-on
packages) are the same on both machines.
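
One way to check this is to print the same summary on both machines and
compare the output. The helper below is a minimal sketch: env_fingerprint is
a made-up name, and the default package list (glmnet, lars) is simply taken
from the packages mentioned in this thread.

```r
# Collect the details that commonly differ between two R installations.
# `env_fingerprint` is a hypothetical helper, not part of any package.
env_fingerprint <- function(pkgs = c("glmnet", "lars")) {
  # Look up each package's installed version; NA if it is not installed.
  versions <- vapply(
    pkgs,
    function(p) {
      tryCatch(
        as.character(utils::packageVersion(p)),
        error = function(e) NA_character_
      )
    },
    character(1)
  )
  list(
    r_version = R.version.string,
    platform  = R.version$platform,
    rng_kind  = RNGkind(),  # generator, normal.kind, sample.kind
    packages  = versions
  )
}

# Run this on both the local and the remote machine and compare.
str(env_fingerprint())
```

Any difference in r_version, rng_kind, or packages between the two machines
is a candidate explanation for differing results.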

Fwiw,

Stephen

On Sun, Aug 9, 2020, 08:26 Kevin Egan <kevinegan31 using gmail.com> wrote:

> Hi Stephen,
>
> I believe I am using renv, but on my remote computer I am running batch
> files.
>
> Thanks,
>
> Kevin
>
> On 8 Aug 2020, at 18:18, stephen sefick <ssefick using gmail.com> wrote:
>
> Caveat, I have only skimmed this email thread, so please forgive me if I
> have missed something.
>
> Are you able to use renv, packrat, Docker, or Anaconda? Your compute
> environments are very different.
> Kindest regards,
>
> Stephen Sefick
>
> On Sat, Aug 8, 2020, 19:05 Abby Spurdle <spurdle.a using gmail.com> wrote:
>
>> Hi Kevin,
>>
>> Intuitively, the first step would be to ensure that all versions of R,
>> and all the R packages, are the same.
>>
>> However, you mention HPC.
>> And the glmnet package imports the foreach package, which appears
>> (after a quick glance) to support multi-core and parallel computing.
>>
>> If your code uses parallel computing (?), you may need to look at how
>> random numbers, and related results, are handled...
>>
>>
>> On Sun, Aug 9, 2020 at 1:14 AM Kevin Egan <kevinegan31 using gmail.com> wrote:
>> >
>> > I posted this question:
>> >
>> > I am currently using R, RStudio, and a remote computer (using an R
>> > script) to run the same code. I start by using set.seed(123) in all
>> > three versions of the code, then using glmnet to assess a matrix.
>> > Ultimately, I am having trouble reproducing the results between my
>> > local and the remote computer's results. I am using R version 4.0.2
>> > locally, and R version 3.6.0 remote.
>> >
>> > After running several tests, I'm wondering if there is a difference
>> > between the two versions in R which may lead to slightly different
>> > coefficients. If anyone has any insight I would appreciate it.
>> >
>> > Thanks.
>> >
>> > and found that there were slight differences between using rnorm with
>> > R-4.0.2 and R-3.6.0 but did not find any differences for runif between
>> > both systems. In my original code, I am using rnorm and was wondering
>> > if this may be the reason I am finding slight differences in
>> > coefficients for glmnet and lars testing between using my local
>> > computer (R-4.0.2) and my remote computer (R-3.6.0). I am running my
>> > code locally on a MacOSX and remote on what I believe is an HPC.
>> >
>> > Thanks.
>> >         [[alternative HTML version deleted]]
>> >
>> > ______________________________________________
>> > R-help using r-project.org mailing list -- To UNSUBSCRIBE and more, see
>> > https://stat.ethz.ch/mailman/listinfo/r-help
>> > PLEASE do read the posting guide
>> > http://www.R-project.org/posting-guide.html
>> > and provide commented, minimal, self-contained, reproducible code.
