[Bioc-devel] Including large files for the package

Kern, Lori (Roswell Park)
Thu Aug 31 18:01:13 CEST 2023


Hello,

Regarding Hub use:
What sort of information does the metadata contain? That would determine whether ExperimentHub or AnnotationHub is more appropriate. Is the file accessed directly from the http://ilincs.org/ portal with a URL link, or does some processing/filtering occur? The hubs can access data stored on other websites/hosts as long as they are trusted sites (ilincs would fall in this category) and the data can be accessed directly with a URL link. The way the hubs work is that the data is stored elsewhere: either accessed directly from the original site or, if it is processed, on some hosting server (S3, Azure, etc.). The data would no longer live directly in the package; instead it would be downloaded through the hub interface when needed (and cached in the backend so the download is not repeated every time).
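
A minimal sketch of the download-once-and-reuse behaviour, written against BiocFileCache directly (the package suggested further down this thread) rather than against the hub interface. The helper name get_cached_file, the resource name, and the cache directory name "yourPackage" are placeholders for illustration only, not part of any existing package.

```r
library(BiocFileCache)

# Hypothetical helper: return a local path for a remote file, downloading it
# only when no resource with this name is in the cache yet.
get_cached_file <- function(url, rname,
                            cache_dir = tools::R_user_dir("yourPackage",
                                                          which = "cache")) {
    # Open (or create) the cache without prompting the user
    bfc <- BiocFileCache(cache_dir, ask = FALSE)

    # Look for an existing entry with exactly this resource name
    hits <- bfcquery(bfc, rname, field = "rname", exact = TRUE)

    if (nrow(hits) == 0L) {
        # Not cached yet: register the resource, which also fetches it
        rid <- names(bfcadd(bfc, rname = rname, fpath = url))
    } else {
        rid <- hits$rid[1]
    }

    # Local path of the cached copy
    unname(bfcrpath(bfc, rids = rid))
}
```

A package function could then call something like get_cached_file("https://example.org/metadata.tsv", "metadata") (the URL is a placeholder) and read the returned path; the second and later calls should reuse the cached copy instead of downloading again.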





Lori Shepherd - Kern
Bioconductor Core Team
Roswell Park Comprehensive Cancer Center
Department of Biostatistics & Bioinformatics
Elm & Carlton Streets
Buffalo, New York 14263

________________________________
From: Bioc-devel <bioc-devel-bounces using r-project.org> on behalf of Vincent Carey <stvjc using channing.harvard.edu>
Sent: Thursday, August 31, 2023 8:29 AM
To: Martin Grigorov <martin.grigorov using gmail.com>
Cc: bioc-devel using r-project.org <bioc-devel using r-project.org>
Subject: Re: [Bioc-devel] Including large files for the package

On Thu, Aug 31, 2023 at 7:28 AM Martin Grigorov <martin.grigorov using gmail.com>
wrote:

> Hello,
>
> Perhaps you could use https://bioconductor.r-universe.dev/BiocFileCache to
> download the big file on demand.
> The benefit is that the file would be stored in ~/.cache/R/yourPackage/
> (for Linux; something similar for Windows/Mac) and reused between sessions.
>

Thanks Martin.  I think that is a possible approach, but the proposals at
http://contributions.bioconductor.org/non-software.html?q=AnnotationHub#annotationexperiment-hub-packages
should also be considered.

Ali, if the documentation regarding *Hub contributions is unclear, please
file an issue or write back here with the difficulties so that we can
improve the material and the methods!

Thanks!



>
> Regards,
> Martin
>
> On Tue, Aug 29, 2023 at 5:15 AM Ali Sajid Imami <ali.sajid.imami using gmail.com>
> wrote:
>
> > Hi Bioconductor Team,
> >
> > I am a PhD Candidate in the Cognitive Disorders Research lab at the
> > University of Toledo. I am responsible for a number of R packages and our
> > intention is to submit them to Bioconductor over the next several months.
> > I had just submitted a package drugfindR
> > (https://github.com/CogDisResLab/drugfindR). This was immediately closed
> > as my repo had a single file over the 5MB limit.
> >
> > I wanted to ask whether you would reconsider or make an exception, or
> > else guide me in the right direction.
> >
> > This package serves as a way to quickly search through the LINCS data
> > stored at the ilincs.org portal. The file in question is one of three
> > metadata files that allow the package to function efficiently without
> > having to go through expensive network requests. It would really be
> > helpful if we could include the file as is. I do not expect more files
> > like that to be added to the package at all.
> >
> > Barring that, I have seen the suggestion of using AnnotationHub or
> > ExperimentHub. While I have gone through the documentation, I'm not
> > entirely sure how those services work. Are those services where we can
> > store the data itself, or are we expected to host the data elsewhere and
> > create lightweight "pointer" packages? Similarly, I'm not entirely sure
> > which Hub this would go to.
> >
> > Any advice or guidance will be appreciated.
> >
> >
> > Regards,
> > Dr. Ali Sajid Imami
> > LinkedIn <https://pk.linkedin.com/pub/ali-sajid-imami/50/956/2a6>
> >
> > _______________________________________________
> > Bioc-devel using r-project.org mailing list
> > https://stat.ethz.ch/mailman/listinfo/bioc-devel
> >
