[Bioc-devel] check error in SeqVarTools

Stephanie Gogarten @dmorr|@ @end|ng |rom uw@edu
Tue Apr 30 20:54:44 CEST 2019


If a user gave that exact set of random numbers as input, then a missing
value would be the correct result; it's just the unit test that makes an
assumption that all values are not missing. So the problem is in the test,
not the code itself.

set.seed() is probably the right thing to do here, but as Herve points out,
actually running a different set of random numbers every time is better at
identifying bugs that only crop up in very particular circumstances - in
this case it is not a bug, but in another case it might be.

I would welcome advice on the best way to write unit tests for fairly
complicated statistical tests where I don't know beforehand exactly what
the output should be, only that the code should run without errors and
produce some output. I had initially thought that checking the output is
non-missing would be a good solution, but that clearly doesn't work because
it turns out that in very rare cases, the output should actually be a
missing value. (Generating a random set of genotypes enough times will
sometimes produce a monomorphic variant, or similar edge case where testing
for association is invalid.)

Should I push a fix to the unit test today, or wait to change anything
until after release?

thanks,
Stephanie

On Tue, Apr 30, 2019 at 10:00 AM Pages, Herve <hpages using fredhutch.org> wrote:

> Hi Stephanie,
>
> In addition to Lori's advice to make sure that your code works whatever
> random numbers get generated (otherwise the current issue could also
> affect your users, not just your unit tests), it's recommended to call
> set.seed() before you start generating the random numbers in your test.
> This will ensure that the sequence of random numbers that gets generated
> is the same every time the tests are run, and on all platforms.
>
> The irony is that you probably wouldn't have been able to catch this
> problem if you had been using set.seed() already. But still,
> reproducible/deterministic unit tests are a must.
>
> Hope this helps,
>
> H.
>
>
> On 4/30/19 07:44, Shepherd, Lori wrote:
> > How are you generating random numbers?  How is the generation ending up
> with a missing number?  This seems like it might be an edge case that
> should be caught in your code?
> >
> >
> >
> > Lori Shepherd
> >
> > Bioconductor Core Team
> >
> > Roswell Park Cancer Institute
> >
> > Department of Biostatistics & Bioinformatics
> >
> > Elm & Carlton Streets
> >
> > Buffalo, New York 14263
> >
> > ________________________________
> > From: Bioc-devel <bioc-devel-bounces using r-project.org> on behalf of
> Stephanie Gogarten <sdmorris using uw.edu>
> > Sent: Monday, April 29, 2019 4:41:30 PM
> > To: bioc-devel using r-project.org
> > Subject: [Bioc-devel] check error in SeqVarTools
> >
> > Hi,
> >
> > I see that SeqVarTools has an error in check on malbec2, after passing
> > checks for a week with no changes to the code. This is a result of a unit
> > test which generates random numbers; once in a while a particular set of
> > random numbers ends up generating a missing value in a calculation which
> > causes the unit test to fail. This particular failure is very poorly
> timed
> > since it happens to be the date of the deadline for packages to pass
> > checks. I fully expect tomorrow's build to pass again, so I hope
> > SeqVarTools can remain in the release.
> >
> > thanks,
> > Stephanie
> >
> >          [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > Bioc-devel using r-project.org mailing list
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_bioc-2Ddevel&d=DwICAg&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=ER-QOL412sevDpv7_bTLmeLMLx61JY5L4ppkM3OI9Ao&s=riDnpgNorQlwq4BZX1CREFCOPDKqmgAuOS3zij2jlMo&e=
> >
> >
> > This email message may contain legally privileged and/or confidential
> information.  If you are not the intended recipient(s), or the employee or
> agent responsible for the delivery of this message to the intended
> recipient(s), you are hereby notified that any disclosure, copying,
> distribution, or use of this email message is prohibited.  If you have
> received this message in error, please notify the sender immediately by
> e-mail and delete this email message from your computer. Thank you.
> >       [[alternative HTML version deleted]]
> >
> > _______________________________________________
> > Bioc-devel using r-project.org mailing list
> >
> https://urldefense.proofpoint.com/v2/url?u=https-3A__stat.ethz.ch_mailman_listinfo_bioc-2Ddevel&d=DwICAg&c=eRAMFD45gAfqt84VtBcfhQ&r=BK7q3XeAvimeWdGbWY_wJYbW0WYiZvSXAJJKaaPhzWA&m=ER-QOL412sevDpv7_bTLmeLMLx61JY5L4ppkM3OI9Ao&s=riDnpgNorQlwq4BZX1CREFCOPDKqmgAuOS3zij2jlMo&e=
>
> --
> Hervé Pagès
>
> Program in Computational Biology
> Division of Public Health Sciences
> Fred Hutchinson Cancer Research Center
> 1100 Fairview Ave. N, M1-B514
> P.O. Box 19024
> Seattle, WA 98109-1024
>
> E-mail: hpages using fredhutch.org
> Phone:  (206) 667-5791
> Fax:    (206) 667-1319
>
>

	[[alternative HTML version deleted]]



More information about the Bioc-devel mailing list