[Bioc-devel] About the size limitation of the package

Vincent Carey @tvjc @end|ng |rom ch@nn|ng@h@rv@rd@edu
Thu May 27 00:53:13 CEST 2021


On Wed, May 26, 2021 at 5:55 PM Stuart Lee <lee.s using wehi.edu.au> wrote:

> Hi You and Lori,
>
> Are fitted models in scope for ExperimentHub? I thought it was more for
> data. Maybe there should be a ModelHub for developers to include trained
> models from papers in their packages?
>
> @You: if that model has been fitted in R take a look at
> https://github.com/tidymodels/butcher for some ways of reducing its size.
>

Thanks for these suggestions, Stuart!  butcher certainly seems relevant.  I
tried it out on the adabag output in You's package and was able to achieve
some nice reductions:

> tr1 = x$trees[[1]]
> obj_size(tr1)
340,192 B
> obj_size(axe_data(axe_call(axe_fitted(tr1))))
171,896 B

So this in conjunction with xz compression could make this a moot point
for @You Zhou.
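
For anyone who wants to check the effect of xz on their own object, here is
a minimal base-R sketch. It is only an illustration: the lm() fit on mtcars
is a hypothetical stand-in for a real fitted model (not the adabag model
from the package), and the file names are temporary files. How much xz
saves compared with the default gzip depends on the object.

```r
# Hypothetical stand-in for a fitted model: a linear model on a built-in data set.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Serialize the same object with two compression schemes.
gz_file <- tempfile(fileext = ".rds")
xz_file <- tempfile(fileext = ".rds")
saveRDS(fit, gz_file, compress = "gzip")
saveRDS(fit, xz_file, compress = "xz")

# Compare on-disk sizes in bytes; which one is smaller depends on the object.
sizes <- c(gzip = file.size(gz_file), xz = file.size(xz_file))
print(sizes)

# Confirm that the xz-compressed file restores a usable model.
restored <- readRDS(xz_file)
```

For .rda files already shipped in a package's data/ directory,
tools::resaveRdaFiles() with compress = "xz" is the usual way to recompress
them in place.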

As for the ModelHub, two thoughts.  First, I'd be more inclined at this
stage to partner with a system like kipoi.org, with fitted models archived
there and retrieved by API as needed by bioc packages.  I wonder if there
are any good examples of this by now.

Second, although I don't feel we have capacity in core to introduce a new
Hub at just this point, I think we'd be able to help a motivated
community-based team to produce one -- if the kipoi suggestion isn't
viable -- utilizing some Azure resources that have been contributed by
Microsoft Genomics.  Interested parties should write to the list.

I don't see a bioc slack channel devoted to AI/ML and maybe there would be
good traffic on one.  This "task area" could be added to biocchallenges, or
could be a topic for a developer forum meeting.


> Thanks
> Stuart
> ________________________________
> From: Bioc-devel <bioc-devel-bounces using r-project.org> on behalf of Kern,
> Lori <Lori.Shepherd using RoswellPark.org>
> Sent: Wednesday, 26 May 2021 10:01 PM
> To: You Zhou <youzhoulearning using gmail.com>; bioc-devel using r-project.org <
> bioc-devel using r-project.org>
> Subject: Re: [Bioc-devel] About the size limitation of the package
>
> Please consider using Experiment Hub to host the large data file. More
> information can be found here:
>
> https://bioconductor.org/packages/devel/bioc/vignettes/AnnotationHub/inst/doc/CreateAHubPackage.html
>
> Cheers,
>
>
>
> Lori Shepherd
>
> Bioconductor Core Team
>
> Roswell Park Comprehensive Cancer Center
>
> Department of Biostatistics & Bioinformatics
>
> Elm & Carlton Streets
>
> Buffalo, New York 14263
>
> ________________________________
> From: Bioc-devel <bioc-devel-bounces using r-project.org> on behalf of You Zhou
> <youzhoulearning using gmail.com>
> Sent: Wednesday, May 26, 2021 5:09 AM
> To: bioc-devel using r-project.org <bioc-devel using r-project.org>
> Subject: [Bioc-devel] About the size limitation of the package
>
> Dear Bioc team,
>
>
> I am preparing a package "m6Aboost" and plan to submit it to
> Bioconductor. The package uses a trained machine learning model to
> identify the correct m6A signals from the miCLIP2 data set (more detail
> about this machine learning model can be found in our paper
> https://www.biorxiv.org/content/10.1101/2020.12.20.423675v1).
>
>
>
> I have run into a problem: the machine learning model is 10 MB in size,
> which exceeds the 5 MB limit. Since this model is crucial for the package, I
> was wondering whether I can ignore the warning message about the size
> limitation. Thank you : )
>
> Best regards,
> You Zhou
>
>         [[alternative HTML version deleted]]
>
>
> _______________________________________________
> Bioc-devel using r-project.org mailing list
> https://stat.ethz.ch/mailman/listinfo/bioc-devel
>
