[Bioc-devel] About the size limitation of the package

You Zhou <youzhoulearning using gmail.com>
Thu May 27 10:25:31 CEST 2021


Hi Vincent,

Ah that sounds really cool. I will also try it out.

Kind regards,
You Zhou

From: Vincent Carey <stvjc using channing.harvard.edu>
Date: Thursday, 27 May 2021 00:53
To: Stuart Lee <lee.s using wehi.edu.au>
Cc: "Kern, Lori" <Lori.Shepherd using roswellpark.org>, You Zhou <youzhoulearning using gmail.com>, "bioc-devel using r-project.org" <bioc-devel using r-project.org>
Subject: Re: [Bioc-devel] About the size limitation of the package



On Wed, May 26, 2021 at 5:55 PM Stuart Lee <lee.s using wehi.edu.au> wrote:
Hi You and Lori,

Are fitted models in scope for ExperimentHub? I thought it was more for data. Maybe there should be a ModelHub for developers to include trained models from papers in their packages?

@You: if that model has been fitted in R, take a look at https://github.com/tidymodels/butcher for some ways of reducing its size.

Thanks for these suggestions, Stuart!  butcher certainly seems relevant.  I tried it out on
the adabag output in You's package and was able to achieve some nice reductions:

> tr1 = x$trees[[1]]
> obj_size(tr1)
340,192 B
> obj_size(axe_data(axe_call(axe_fitted(tr1))))
171,896 B

So this in conjunction with xz compression could make this a moot point for @You Zhou.
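
A minimal sketch of how the two ideas might be combined, assuming a stand-in lm fit on mtcars in place of the adabag model (the names fit, slim, gz_file and xz_file are illustrative and not taken from You's package). It trims the fit with butcher when that package is installed, then writes it with gzip and with xz compression via saveRDS() so the file sizes can be compared:

```r
# Stand-in model; the real case would use the trained adabag object.
fit <- lm(mpg ~ wt + hp, data = mtcars)

# Drop components not needed for prediction, if butcher is available.
slim <- fit
if (requireNamespace("butcher", quietly = TRUE)) {
  slim <- butcher::axe_env(butcher::axe_call(fit))
}

# Serialize the trimmed model with two compression schemes.
gz_file <- tempfile(fileext = ".rds")
xz_file <- tempfile(fileext = ".rds")
saveRDS(slim, gz_file, compress = "gzip")
saveRDS(slim, xz_file, compress = "xz")

# On-disk sizes in bytes; which scheme wins depends on the object.
sizes <- c(gzip = file.size(gz_file), xz = file.size(xz_file))
print(sizes)

# Check that the restored model still predicts like the original.
restored <- readRDS(xz_file)
stopifnot(isTRUE(all.equal(
  unname(predict(restored, newdata = mtcars)),
  unname(predict(fit, newdata = mtcars))
)))
```

For .rda files already shipped in a package's data/ directory, tools::resaveRdaFiles() can apply xz compression in place; whether that alone brings a 10 MB model under the limit would need to be checked on the actual object.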

As for the ModelHub, two thoughts.  First, I'd be more inclined at this stage to partner with a
system like kipoi.org, with fitted models archived there and retrieved by API as needed by bioc
packages.  I wonder if there are any good examples of this by now.

Second, although I don't feel we have capacity in core to introduce a new Hub at just this point,
I think we'd be able to help a motivated community-based team to produce one -- if the kipoi suggestion isn't
viable -- utilizing some Azure resources that have been contributed by Microsoft Genomics.  Interested
parties should write to the list.

I don't see a bioc slack channel devoted to AI/ML and maybe there would be good traffic on one.
This "task area" could be added to biocchallenges, or could be a topic for a developer forum meeting.


Thanks
Stuart
________________________________
From: Bioc-devel <bioc-devel-bounces using r-project.org> on behalf of Kern, Lori <Lori.Shepherd using RoswellPark.org>
Sent: Wednesday, 26 May 2021 10:01 PM
To: You Zhou <youzhoulearning using gmail.com>; bioc-devel using r-project.org <bioc-devel using r-project.org>
Subject: Re: [Bioc-devel] About the size limitation of the package

Please consider using ExperimentHub to host the large data file. More information can be found here:
https://bioconductor.org/packages/devel/bioc/vignettes/AnnotationHub/inst/doc/CreateAHubPackage.html

Cheers,

Lori Shepherd
Bioconductor Core Team
Roswell Park Comprehensive Cancer Center
Department of Biostatistics & Bioinformatics
Elm & Carlton Streets
Buffalo, New York 14263

________________________________
From: Bioc-devel <bioc-devel-bounces using r-project.org> on behalf of You Zhou <youzhoulearning using gmail.com>
Sent: Wednesday, May 26, 2021 5:09 AM
To: bioc-devel using r-project.org <bioc-devel using r-project.org>
Subject: [Bioc-devel] About the size limitation of the package

Dear Bioc team,


I am preparing a package, "m6Aboost", and plan to submit it to Bioconductor. The package uses a trained machine learning model to identify the correct m6A signals in miCLIP2 data (more details about the model can be found in our paper: https://www.biorxiv.org/content/10.1101/2020.12.20.423675v1).



I have run into a problem: the machine learning model is 10 MB in size, which exceeds the 5 MB limit. Since this model is crucial for the package, I was wondering whether I can ignore the warning message about the size limitation. Thank you : )

Best regards,
You Zhou

_______________________________________________
Bioc-devel using r-project.org mailing list
https://stat.ethz.ch/mailman/listinfo/bioc-devel
