[Bioc-devel] Removal of large items in git history - BiocCheck warning

Murphy, Alan E a.murphy using imperial.ac.uk
Tue Mar 9 10:09:35 CET 2021


Hi both,

Thank you for your suggestions. Yes, I am still having problems with the size of my git history in the EWCE package. To clarify, I have already tried the BFG cleaner to no avail even when I set the max limit to 1 MB (see my first email for details).

The issue is that a file in .git/objects/pack/ is still larger than the allotted 5MB; it appears to be 8.9MB in size. As mentioned, I have used the BFG cleaner, yet the pack file remains too large. If anyone has suggestions on how else I could reduce its size, that would be great.
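
A pack file can stay above 5MB even when no single blob exceeds the BFG threshold, because many smaller revisions of the same files add up. Whether that is what is happening in this repository is an assumption, not something the thread confirms. A minimal sketch for checking it is below; the function name history_size_by_path is hypothetical, and the sizes are git's on-disk sizes, which are only approximate for delta-compressed objects.

```shell
# Hypothetical helper (not part of git or BFG).
# Sum the on-disk size of every blob reachable from any ref, grouped by
# path, and print "total-bytes path" for the N heaviest paths.
# Usage: history_size_by_path [repo_dir] [count]
history_size_by_path() {
    repo="${1:-.}"
    count="${2:-10}"
    git -C "$repo" rev-list --objects --all |
        git -C "$repo" cat-file \
            --batch-check='%(objecttype) %(objectsize:disk) %(rest)' |
        awk '$1 == "blob" {
                 size = $2
                 sub(/^[^ ]+ [^ ]+ /, "")   # keep only the path, spaces included
                 total[$0] += size
             }
             END { for (p in total) print total[p], p }' |
        sort -k1,1nr |
        head -n "$count"
}
```

Running history_size_by_path . 20 from the package directory lists the twenty paths that account for the most space across the whole history, which may point at files worth removing by name rather than by size.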

@Nitesh Turaga <nturaga.bioc using gmail.com>, how would I go about checking for (and removing?) hidden files in the .git/objects/pack history?
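
One possible way to look inside the pack files directly is git verify-pack, which prints every packed object together with the space it occupies in the pack. The sketch below is only an illustration under that assumption; the function name largest_packed_objects is hypothetical, and it needs a git recent enough to support rev-parse --absolute-git-dir.

```shell
# Hypothetical helper (not part of git or BFG).
# Print the N largest objects stored in a repository's pack files as
# "size-in-pack object-id type", largest first.
# Usage: largest_packed_objects [repo_dir] [count]
largest_packed_objects() {
    repo="${1:-.}"
    count="${2:-10}"
    git_dir=$(git -C "$repo" rev-parse --absolute-git-dir) || return 1
    for idx in "$git_dir"/objects/pack/pack-*.idx; do
        [ -e "$idx" ] || continue      # no packs yet: run `git gc` first
        git -C "$repo" verify-pack -v "$idx"
    done |
        # Keep object lines only; verify-pack also prints summary lines.
        awk '$2 ~ /^(blob|tree|commit|tag)$/ { print $4, $1, $2 }' |
        sort -k1,1nr |
        head -n "$count"
}
```

An object id from the output can be matched to a file name with git rev-list --objects --all | grep <object-id>. If an id does not appear there, the object is no longer reachable from any ref and is only waiting to be pruned.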

Kind regards,
Alan.
________________________________
From: stefano <mangiolastefano using gmail.com>
Sent: 08 March 2021 22:18
To: Nitesh Turaga <nturaga.bioc using gmail.com>
Cc: Murphy, Alan E <a.murphy using imperial.ac.uk>; bioc-devel using r-project.org <bioc-devel using r-project.org>
Subject: Re: [Bioc-devel] Removal of large items in git history - BiocCheck warning


Hello,

You can use bfg-repo-cleaner.

Have a read of this document, in particular the section "eliminate big files from repo":

https://docs.google.com/document/d/1jxg7KCMQq3kiCcvodQk9JgtU51LqczOwLit1gHiTP4Q/edit?usp=sharing


Best wishes.

Stefano



Stefano Mangiola | Postdoctoral fellow

Papenfuss Laboratory

The Walter and Eliza Hall Institute of Medical Research

+61 (0)466452544


On Tue, 9 Mar 2021 at 09:11, Nitesh Turaga <nturaga.bioc using gmail.com> wrote:
Hi Alan,

Did you manage to solve this?

There seem to be objects in your git repo that are bigger than the size limit Bioconductor requires for a software package. Please check hidden files as well.

One test you can do is to clone your package from GitHub and see how many MB are downloaded to the new location. This is a good test of whether anything larger than the limit is still in the history.
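
The clone test can be sketched roughly as follows. The function name fresh_clone_size_kb is hypothetical, and the figure it reports is the size of the cloned .git directory in kilobytes, not the exact number of bytes transferred.

```shell
# Hypothetical helper (not part of git or BiocCheck).
# Clone a repository into a throwaway directory, print the size of its
# .git directory in kilobytes, then delete the clone.
# Usage: fresh_clone_size_kb <url-or-path>
fresh_clone_size_kb() {
    src="$1"
    workdir=$(mktemp -d) || return 1
    # --no-local makes a clone from a local path use the normal git
    # transport, so only reachable objects are copied, as with a URL.
    if git clone --quiet --no-local "$src" "$workdir/clone"; then
        du -sk "$workdir/clone/.git" | cut -f1
        status=0
    else
        status=1
    fi
    rm -rf "$workdir"
    return "$status"
}
```

For example, fresh_clone_size_kb https://github.com/NathanSkene/EWCE reports what a new user would receive. If that number is well below the size of the local .git directory, the extra space is probably taken by objects that exist only in the local copy.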

Best,

Nitesh

On 3/4/21, 11:19 AM, "Bioc-devel on behalf of Murphy, Alan E" <bioc-devel-bounces using r-project.org on behalf of a.murphy using imperial.ac.uk> wrote:

    Hi all,

    I am working on the development of EWCE <https://github.com/NathanSkene/EWCE> for submission to Bioconductor. I removed some large objects from the package and moved them to a separate ExperimentHub package; however, after their removal, I got a BiocCheck large file warning.

    To deal with the data stored in the git history, I followed the instructions for the BFG cleaner with the max size set to 5MB. This appeared to work, and some things were removed, yet I still get the warning below:

    $warning[1] "The following files are over 5MB in size: '.git/objects/pack/pack-366a7ab7a2ba4e656f3a9f3f1408be7ab9f41303.pack'"

    If I try to rerun the BFG cleaner I get the following output:


    Warning : no large blobs matching criteria found in packfiles - does the repo need to be packed?

    I have tried two different ways of using the BFG cleaner, one from BFG <https://rtyley.github.io/bfg-repo-cleaner/> itself and one from Bioconductor <https://bioconductor.org/developers/how-to/git/remove-large-data/>. I have completed all the steps in both, including the prune step:


    git reflog expire --expire=now --all && git gc --prune=now --aggressive

    I have even tried reducing the max from 5MB to 1MB, but nothing seems to be left even at that size. Does anyone know of another way to solve this issue, or have any idea what I may be doing wrong?

    Kind regards,
    Alan.

    Alan Murphy
    Bioinformatician
    Neurogenomics lab
    UK Dementia Research Institute
    Imperial College London

    _______________________________________________
    Bioc-devel using r-project.org<mailto:Bioc-devel using r-project.org> mailing list
    https://stat.ethz.ch/mailman/listinfo/bioc-devel