[Bioc-devel] Big repo problem & consequences: all commits duplicated

O'CALLAGHAN Alan
Mon Oct 14 16:16:20 CEST 2019

Hi Jelena,

If you ran BFG and then git pull --allow-unrelated-histories, the pull merged the old, unfiltered history back in, so the large files are present in the repo again. Thanks to delta compression<https://en.wikipedia.org/wiki/Delta_encoding>, the duplicated commits themselves should not massively increase the repository size, but they are undesirable because they obscure the repo history.
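
To check whether large files really are back in the history, something along these lines may help. It is only a sketch run in a throwaway repository: the file names analysis.R and big_data.bin and the demo identity are made up for illustration. In a real repository only the last two commands would be relevant.

```shell
#!/bin/sh
# Practice run in a throwaway repository; nothing here touches a real project.
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.org"   # placeholder identity
git config user.name "Demo"

# One ordinary file and one hypothetical "too big" file (1 MiB of zeros).
echo "code" > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"
head -c 1048576 /dev/zero > big_data.bin
git add big_data.bin
git commit -q -m "Add large data file"

# Overall size of the object store (loose objects and packs).
git count-objects -v

# List every object reachable from any ref, keep the blobs,
# and report the largest one as "<size in bytes> <path>".
largest=$(git rev-list --objects --all |
    git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' |
    awk '$1 == "blob" {print $2, $3}' |
    sort -rn | head -n 1)
echo "largest blob: $largest"
```

If the files you thought you had removed show up at the top of that list, they are still reachable from some branch or tag.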

You will probably have to use BFG or git filter-branch again to remove the large files. You could then use git rebase --interactive to edit the commit history and drop any duplicated commits that remain. Be warned that this is very likely to produce a lot of merge conflicts. Many guides on rebasing are available online.
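
For the filtering step, a minimal sketch with git filter-branch might look like the following. It again runs in a throwaway repository, and big_data.bin is a hypothetical file name, so adapt the path before trying it on a copy of the real repo. The interactive rebase is not shown because it cannot sensibly be scripted.

```shell
#!/bin/sh
# Practice run in a throwaway repository; nothing here touches a real project.
set -e
export FILTER_BRANCH_SQUELCH_WARNING=1   # skip filter-branch's advisory pause (newer git)
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.org"   # placeholder identity
git config user.name "Demo"

# Commit 1: an ordinary file. Commit 2: a hypothetical "too big" file.
echo "code" > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"
head -c 1048576 /dev/zero > big_data.bin
git add big_data.bin
git commit -q -m "Add large data file"

# Rewrite every commit on every ref, dropping big_data.bin from the index.
# --prune-empty discards commits that become empty as a result.
git filter-branch --force --prune-empty \
    --index-filter 'git rm -q --cached --ignore-unmatch big_data.bin' \
    -- --all >/dev/null 2>&1

# filter-branch keeps the old history under refs/original/. Delete those
# backup refs and expire the reflog so the big objects can be collected.
git for-each-ref --format='%(refname)' refs/original/ |
    xargs -n 1 git update-ref -d
git reflog expire --expire=now --all
git gc -q --prune=now

# Count commits on the current branch that still touch the file.
remaining=$(git log --oneline HEAD -- big_data.bin | wc -l | tr -d ' ')
echo "commits still touching big_data.bin: $remaining"
```

Note that the repository only shrinks once the backup refs and reflog entries are gone and git gc has run; until then the old objects are still stored.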

To remove the files (and commits) from the remote(s) (GitHub/Bioconductor), you will then need to force push the "filtered" version of the repository. You must also make sure that every other user with push access resets all of their local copies to the "filtered" version (e.g., using git fetch origin && git reset --hard origin/master). If they don't, the large files will be added back in the next time they push to the remote(s).
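
The force push and the reset on the other users' side can be rehearsed locally. In this sketch the directories maintainer, collaborator and remote.git are made-up stand-ins for your clone, a colleague's clone and the GitHub/Bioconductor remote, and a plain git reset stands in for the BFG/filter-branch step.

```shell
#!/bin/sh
# Rehearsal with local stand-ins for the remote and for two users' clones.
set -e
work=$(mktemp -d)
cd "$work"

# The maintainer's repository, with one ordinary and one large file.
git init -q maintainer
cd maintainer
git symbolic-ref HEAD refs/heads/master   # pin the branch name used in this thread
git config user.email "demo@example.org"  # placeholder identity
git config user.name "Demo"
echo "code" > analysis.R
git add analysis.R
git commit -q -m "Add analysis script"
head -c 1048576 /dev/zero > big_data.bin
git add big_data.bin
git commit -q -m "Add large data file"
cd ..

# A bare repository plays the remote; a second clone plays another user.
git clone -q --bare maintainer remote.git
git clone -q remote.git collaborator
git -C maintainer remote add origin "$work/remote.git"

# Stand-in for the filtering step: drop the commit that added the large file.
git -C maintainer reset -q --hard HEAD~1

# Overwrite the remote branch with the rewritten history.
git -C maintainer push -q --force origin master

# What every other user has to do in each of their local copies.
git -C collaborator fetch -q origin
git -C collaborator reset -q --hard origin/master

maintainer_head=$(git -C maintainer rev-parse HEAD)
collaborator_head=$(git -C collaborator rev-parse HEAD)
echo "maintainer:   $maintainer_head"
echo "collaborator: $collaborator_head"
```

Bear in mind that git reset --hard throws away local commits and uncommitted changes, so anything unpushed should be saved elsewhere first.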

I would strongly advise that you make a copy of the repository before attempting any of this, so you can revert to the current version if needed.
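
Two simple ways of making such a copy are sketched below, using a throwaway repository called project (a made-up name): a plain copy of the whole directory, and a mirror clone that keeps every branch and tag.

```shell
#!/bin/sh
# Practice run in a throwaway directory; "project" is a hypothetical repo.
set -e
work=$(mktemp -d)
cd "$work"
git init -q project
git -C project config user.email "demo@example.org"   # placeholder identity
git -C project config user.name "Demo"
echo "code" > project/analysis.R
git -C project add analysis.R
git -C project commit -q -m "Add analysis script"

# Option 1: copy the directory, including the working tree and .git.
cp -R project project-backup

# Option 2: a bare mirror clone holding all refs.
git clone -q --mirror project project-backup.git

original_head=$(git -C project rev-parse HEAD)
backup_head=$(git -C project-backup.git rev-parse HEAD)
echo "original: $original_head"
echo "backup:   $backup_head"
```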

Best of luck,


On 11/10/2019 16:34, Cuklina Jelena wrote:

Dear all,

While trying to fix the WARNING of “too big files” for my repo proBatch, I used BFG, and now ALL my commits (or at least many dozens of them) are duplicated (some in up to 5-10 copies). I guess this is because of my pull-push efforts, which I naively “resolved” with git pull origin master --allow-unrelated-histories.

What can I do now? My repo is now massive, precisely because everything is present in many copies.

Best regards,
Jelena Čuklina.


