297

Some time ago I added some information (files) that must be kept private. Removing the files from the project is not a problem, but I also need to remove them from the Git history.

I use Git and GitHub (private account).

Note: Something similar is shown in this thread, but my case involves an old file that was added to a feature branch; that branch was merged into a development branch and finally into master, and a lot of changes have been made since then. So it is not the same situation: what I need is to rewrite the history and hide those files for privacy.

Alexis Wilke
Marcos R. Guevara

7 Answers

321

I have found this answer and it helped:

git filter-branch --index-filter \
    'git rm -rf --cached --ignore-unmatch path_to_file' HEAD

Found it here https://myopswork.com/how-remove-files-completely-from-git-repository-history-47ed3e0c4c35
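To try this out without touching a real project, here is a minimal sketch that runs the same command on a throwaway repository. The file names (`app.txt`, `secret.txt`) and the temporary directory are made up for illustration; `FILTER_BRANCH_SQUELCH_WARNING` is the variable Git itself mentions in its filter-branch warning.

```shell
#!/bin/sh
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1   # silence filter-branch's deprecation notice

# Build a throwaway repository (hypothetical file names) so nothing real is touched.
demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo "hello" > app.txt
git add app.txt
git commit -q -m "add app"

echo "password=hunter2" > secret.txt
git add secret.txt
git commit -q -m "add secret by mistake"

echo "more" >> app.txt
git commit -q -am "later work"

# Rewrite every commit reachable from HEAD, dropping secret.txt from each index.
git filter-branch --index-filter \
    'git rm -rf --cached --ignore-unmatch secret.txt' HEAD

# An empty result means no remaining commit on this branch touches the file.
remaining=$(git log --oneline -- secret.txt)
echo "commits still touching secret.txt: ${remaining:-none}"
```

Every rewritten commit gets a new hash, which is why a force push is needed afterwards, as the comments below point out.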

bschlueter
Petro Franko
  • 33
    Warning: This creates a ton of commits and causes divergence. You probably have to force push after, but I was too scared. – sudo Aug 27 '19 at 05:54
  • 3
    Seconding what @sudo said but this did work for my fresh branch that I accidentally committed `.env` to. Quick and to the point solution. – Joe Scotto Apr 10 '20 at 20:16
  • 2
    Indeed, a simple force push works! I was also scared but backed everything up. – wutBruh Jul 04 '20 at 17:09
  • 8
    You can also specify a range of commits as the last argument. If the commit in question was recent, do `..HEAD` and save some time. – Victor Sergienko Nov 24 '20 at 00:53
  • 2
    After this, only `git push --force` worked for me. – Sebastian Schmal Jan 04 '21 at 16:43
  • You could see this command [in Git's manpage](https://git-scm.com/docs/git-filter-branch#_examples) also, which provides other examples, and some important notes as well. – MAChitgarha Feb 15 '21 at 17:24
  • 2
    Didn't work, and the commits i made are now on the revision history... tried many approaches... – marcolopes May 11 '21 at 22:12
  • @marcolopes I guess you did something wrong, you also have to consider doing `git push --force` – Petro Franko May 12 '21 at 09:54
  • 1
    I did everything just like the examples here and in other sources... and i did the git push --force. Tried many times. No success! File is still there and still in the revision history... :\ Now i have a huge history for that file. – marcolopes May 13 '21 at 20:40
  • 1
    One of those commands you don't understand but it works wonders ! – renatodamas Aug 31 '21 at 17:28
  • 1
    This did not remove the file from my repo, it remains as it is – alper Sep 23 '21 at 13:32
  • 9
    Current versions of Git say this about `filter-branch`: "WARNING: git-filter-branch has a glut of gotchas generating mangled history rewrites. Hit Ctrl-C before proceeding to abort, then use an alternative filtering tool such as 'git filter-repo' (https://github.com/newren/git-filter-repo/) instead. See the filter-branch manual page for more details; to squelch this warning, set FILTER_BRANCH_SQUELCH_WARNING=1." – Ryan Lundy Nov 28 '21 at 06:55
  • 1
    @sudo It did not add even a single commit for me and worked perfectly fine. What are you talking about? A ton of commits?! What am I missing? – aderchox Dec 14 '21 at 20:48
  • 1
    @aderchox it doesn't exactly "add" commits, it rewrites existing ones. Those commits get replaced by new ones, with a different hash number – ChoKaPeek Jan 10 '22 at 16:27
  • Even @sudo was scared, and there's a good reason for it. Had to merge a branch into `main`/`master` by following https://stackoverflow.com/a/4624383/929999 after this. I guess it does the job but as the warning says, do it at your own risk :) – Torxed May 09 '22 at 18:09
120

If you committed that file recently, or if that file has changed in only one or two commits, then I'd suggest using rebase and cherry-pick to remove that particular commit.

Otherwise, you'd have to rewrite the entire history.

git filter-branch --tree-filter 'rm -f <path_to_file>' HEAD

When you are satisfied with the changes and have made sure that everything looks fine, you need to update all remote branches:

git push origin --force --all

Note: This is a complex operation, and you must be aware of what you are doing. First try it on a demo repository to see how it works. You also need to let other developers know about it, so that they don't make any changes in the meantime.
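As a rough rehearsal of the whole sequence (rewrite, then force push), the sketch below uses a local bare repository as a stand-in for GitHub. All paths and file names are hypothetical, and the mirror clone taken before the rewrite is my own addition as a safety net, not part of the commands above.

```shell
#!/bin/sh
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1

work=$(mktemp -d)
git init -q --bare "$work/remote.git"      # stands in for the GitHub repository
git clone -q "$work/remote.git" "$work/clone"
cd "$work/clone"
git config user.email "demo@example.com"
git config user.name "Demo"

echo "hello" > app.txt
echo "password=hunter2" > secret.txt
git add app.txt secret.txt
git commit -q -m "initial commit with a secret"
echo "more" >> app.txt
git commit -q -am "later work"
git push -q origin HEAD

# Safety net (an assumption of this sketch): keep a full mirror before rewriting.
git clone -q --mirror "$work/remote.git" "$work/backup.git"

# Rewrite the current branch, deleting the file from every checked-out tree.
git filter-branch --tree-filter 'rm -f secret.txt' HEAD

# Publish the rewritten branches; a plain push would be rejected as non-fast-forward.
git push -q origin --force --all

# The remote should no longer list commits touching the file; the backup still does.
remote_hits=$(git --git-dir="$work/remote.git" log --all --oneline -- secret.txt)
backup_hits=$(git --git-dir="$work/backup.git" log --all --oneline -- secret.txt)
echo "remote: ${remote_hits:-none}"
echo "backup: ${backup_hits:-none}"
```

Note that `HEAD` only rewrites the current branch. If the file also lives on other branches, they have to be rewritten as well before `--all` has anything new to push for them.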

einpoklum
hspandher
  • After rewriting the entire history, what must be done to apply the changes to the repository (GitHub)? – Marcos R. Guevara May 03 '17 at 14:30
  • Thank you. I will hold off for now and try it on a demo repository first; I will update here with everything that was done. – Marcos R. Guevara May 03 '17 at 14:49
  • By mistake, I forgot to add `--all`. Now it says everything up-to-date whenever I rerun push with both the arguments. And the file is not removed from other branches. What should I do now? – Reeshabh Ranjan Jun 29 '19 at 11:14
  • 3
    Why does your suggestion use `--tree-filter` rather than `--index-filter` like in @PetroFranko's answer? – einpoklum Jun 24 '20 at 13:10
  • 1
    holy crap, it worked! I mean it was really really simple. I've done it the hard way before, but this was much easier. Tip: the path needs to be relative. – Antebios May 08 '21 at 03:07
  • Didn't work :\ File is still on local repo and after "git push" still on the git remote repository, and the revisions are all there! :\ – marcolopes May 11 '21 at 22:48
  • @einpoklum basically, `tree-filter` rebuilds everything to (and then from) a new (temporary) directory. `index-filter` does this differently (in memory I believe) and is considerably faster. See more here: https://stackoverflow.com/questions/36255221/what-is-the-difference-between-tree-filter-and-index-filter-in-the-git – timhc22 Feb 24 '22 at 02:53
  • It does not seem to be working for me. I ran exactly the command mentioned here, but I can still see the files in the remote repo. Am I missing anything? – Gaurav Parek Mar 28 '22 at 13:15
  • i still see it in history – chovy May 02 '22 at 11:19
82

git-filter-repo

Git itself recommends the third-party tool git-filter-repo (the recommendation is printed when you run git filter-branch). There is a long list of reasons why it is better than the alternatives (https://github.com/newren/git-filter-repo#why-filter-repo-instead-of-other-alternatives); in my experience it is very simple and very fast.

This command removes the file from all commits in all branches:

git filter-repo --invert-paths --path <path to the file or directory>

Multiple paths can be specified by using multiple --path parameters. You can find detailed documentation here: https://www.mankier.com/1/git-filter-repo
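A minimal sketch of how this might look end to end, assuming git-filter-repo is installed and on the PATH. The repository layout and file names are invented for the demo. It works on a fresh clone because filter-repo refuses to rewrite anything else without `--force`, and it saves the origin URL first because filter-repo removes that remote.

```shell
#!/bin/sh
set -eu

work=$(mktemp -d)

# Hypothetical source repository containing a file that should never have been committed.
git init -q "$work/src"
cd "$work/src"
git config user.email "demo@example.com"
git config user.name "Demo"
echo "hello" > app.txt
echo "password=hunter2" > secret.txt
git add app.txt secret.txt
git commit -q -m "initial commit with a secret"

# filter-repo expects a fresh clone; --no-local makes a local clone count as one.
git clone -q --no-local "$work/src" "$work/fresh"
cd "$work/fresh"
url=$(git remote get-url origin)

if command -v git-filter-repo >/dev/null 2>&1; then
    # Keep everything except secret.txt, in all commits on all branches.
    git filter-repo --invert-paths --path secret.txt
    # Restore the remote so a later force push has somewhere to go.
    git remote get-url origin >/dev/null 2>&1 || git remote add origin "$url"
    status="rewritten"
else
    echo "git-filter-repo not found; nothing was rewritten" >&2
    status="skipped"
fi
echo "status: $status"
```

After checking the result, the rewritten history would still need to be force-pushed to the real remote.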

Chris Ballance
Tibor Takács
43

Remove the file and rewrite history starting from the commit that added it (this creates new commit hashes from that commit onward):

There are two ways:

  1. Using git-filter-branch:

git filter-branch --force --index-filter 'git rm --cached --ignore-unmatch <path to the file or directory>' --prune-empty --tag-name-filter cat -- --all

  2. Using git-filter-repo:

pip3 install git-filter-repo
git filter-repo --path <path to the file or directory> --invert-paths

Now force-push the repository with git push origin --force --all and tell your collaborators to rebase.
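Several comments report that the file was "still there" after the rewrite. One possible reason is that filter-branch keeps the old commits reachable under `refs/original` and in the reflog. The sketch below follows the clean-up steps from the git-filter-branch manual on a throwaway repository (hypothetical file names) and then checks whether the blob itself still exists locally.

```shell
#!/bin/sh
set -eu
export FILTER_BRANCH_SQUELCH_WARNING=1

demo=$(mktemp -d)
cd "$demo"
git init -q .
git config user.email "demo@example.com"
git config user.name "Demo"

echo "hello" > app.txt
git add app.txt
git commit -q -m "add app"
echo "password=hunter2" > secret.txt
git add secret.txt
git commit -q -m "add secret by mistake"

# Remember the object ID of the sensitive file so we can look for it later.
blob=$(git rev-parse HEAD:secret.txt)

git filter-branch --force --index-filter \
    'git rm --cached --ignore-unmatch secret.txt' \
    --prune-empty --tag-name-filter cat -- --all

# At this point the old commits are still reachable via refs/original and the reflog.
git for-each-ref --format="%(refname)" refs/original/ | xargs -n 1 git update-ref -d
git reflog expire --expire=now --all
git gc --quiet --prune=now

if git cat-file -e "$blob" 2>/dev/null; then
    echo "blob is still present locally"
else
    echo "blob is gone from the local repository"
fi
```

This only cleans the local repository. The hosting side keeps its own copy until its branches are force-pushed and its unreachable objects are collected.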

alper
suhailvs
  • 1
    @alper you need to replace `PATH-TO-YOUR-FILE-WITH-SENSITIVE-DATA` with the file to remove eg: `README.md` if you want to remove it. – suhailvs Sep 04 '21 at 09:14
  • You need to use `-rf` in order to remove folders – alper Sep 04 '21 at 20:20
  • 6
    For `git filter-repo`: I am getting following message : `Aborting: Refusing to destructively overwrite repo history since this does not look like a fresh clone. (expected freshly packed repo) Please operate on a fresh clone instead. If you want to proceed anyway, use --force.`. If I force it I get following: `fatal: 'origin' does not appear to be a git repository fatal: Could not read from remote repository. ` – alper Sep 06 '21 at 10:46
  • `git filter-branch` worked for me! – Federico Peralta Sep 21 '21 at 04:14
  • 1
    `git filter-branch` approach worked for me on mac, while `filter-repo` approach was removing remote origin – Ilya Sheershoff Jan 25 '22 at 09:28
  • This worked, but I forgot to back up the file first, and now it's gone. :-( – kr37 Apr 30 '22 at 15:43
29

I read this GitHub article, which led me to the following command (similar to the accepted answer, but a bit more robust):

git filter-branch --force --index-filter "git rm --cached --ignore-unmatch PATH-TO-YOUR-FILE-WITH-SENSITIVE-DATA" --prune-empty --tag-name-filter cat -- --all
vancy-pants
10

BFG Repo-Cleaner is another viable alternative to git-filter-branch. Apparently, it is also faster...

c1au61o_HH
9
  • First of all, add it to your .gitignore file and don't forget to commit the file :-)

  • You can use this site: http://gitignore.io to generate the .gitignore for you and add the required path to your binary files/folder(s)

  • Once you have added the file to .gitignore, you can remove the "old" binary file with BFG.


# How to remove big files from the repository

You can use git filter-branch or BFG. https://rtyley.github.io/bfg-repo-cleaner/

### BFG Repo-Cleaner, an alternative to git-filter-branch

The BFG is a simpler, faster alternative to git-filter-branch for cleansing bad data out of your Git repository history:

  • Removing Crazy Big Files

  • Removing Passwords, Credentials & other Private data

Examples (from the official site)

In all these examples bfg is an alias for java -jar bfg.jar.

# Delete all files named 'id_rsa' or 'id_dsa' :
bfg --delete-files id_{dsa,rsa}  my-repo.git


Michael Mrozek
CodeWizard
  • Is it a third party cleaner? – alper Sep 03 '21 at 13:39
  • Is it secure to use? – alper Sep 04 '21 at 22:08
  • Indeed, a very "old" tool which is being used by the community for few years. The source is in GitHub so you and the community can browse it. – CodeWizard Sep 05 '21 at 13:59
  • I just find out that GitHub does not remove deleted commits in case when users request them to run garbage collector, (https://stackoverflow.com/questions/34582480/remove-commit-for-good/34594815#34594815). I am just get lost where when we use 3rd party tools like GitHub whatever committed, we will always need to ask them to remove it, which is not cool – alper Sep 05 '21 at 16:56