[Bioc-devel] BFG repo cleaner did not perfectly work
Nathan Sheffield
n@he|| @end|ng |rom d@t@b|o@org
Fri Jan 20 18:23:23 CET 2023
Hi Adam,
I think the recommended way to remove large, inadvertently committed files from a git repo is no longer BFG or `git filter-branch`, but a newer tool called `git filter-repo`. You might try it. You can read about it here: https://github.com/newren/git-filter-repo
I've found it easier to use, more effective, and faster than BFG or `git filter-branch`. For example, I have this in my notes:
First, use this script to identify large files:
```
git rev-list --objects --all \
| git cat-file --batch-check='%(objecttype) %(objectname) %(objectsize) %(rest)' \
| sed -n 's/^blob //p' \
| sort --numeric-sort --key=2 \
| cut -c 1-12,41- \
| $(command -v gnumfmt || echo numfmt) --field=2 --to=iec-i --suffix=B --padding=7 --round=nearest
```
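If you only care about blobs above a size cutoff (such as the 5M limit used with BFG), a variation on the script above might look like the sketch below. The function name `list_large_blobs` and its default threshold are my own assumptions for illustration, not part of git or any tool; it prints the size in bytes and the path of each matching blob, smallest first.

```shell
# Hypothetical helper (name and default are assumptions, not a git command).
# Lists blobs anywhere in history whose size is >= a byte threshold.
# Usage: list_large_blobs [threshold_bytes]   (run inside a git repository)
list_large_blobs() {
  threshold="${1:-5242880}"   # default: 5 MiB
  git rev-list --objects --all \
  | git cat-file --batch-check='%(objecttype) %(objectsize) %(rest)' \
  | awk -v min="$threshold" '$1 == "blob" && $2 >= min { print substr($0, 6) }' \
  | sort -n -k1,1
}
```

Running `list_large_blobs 1000000` in the repo should show every blob of roughly 1 MB or more that is still reachable from some ref, including files that were deleted in later commits.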
Then I use this to remove the files from history. As of 2020, `filter-repo` has replaced `filter-branch` and `bfg` as the recommended way to change history, but it's a separate tool that you'll have to install (with *e.g.* `pip3 install git-filter-repo`).
```
git filter-repo --path-glob '*.RData' --invert-paths
```
Hope that helps.
-Nathan
On Mon, Jan 16, 2023, at 11:48 AM, Park, Adam Keebum wrote:
> Dear community,
>
> This is a compact version of the issue I sent last week, to ask for general advice.
>
> * Running the recommended command below did not perfectly remove every such file.
>
> bfg --strip-blobs-bigger-than 5M repo.git
>
> * The BiocChecker still picks up a pack file and emits a warning (.git/objects/pack-xxx..xxx.pack).
>
> * However, the reference is not detected by tools like git filter-branch or bfg.
>
> I would appreciate any advice on digging into this problem.
>
> Sincerely,
> Adam.