Git Prune

Overview

The git prune command is an internal housekeeping utility that cleans up unreachable Git objects. An object is unreachable when no branch, tag, or other reference leads to it, for example a commit left behind after a reset. git prune deletes such objects from the repository's object database (the .git/objects directory), not from the working directory. This keeps the repository smaller. In day-to-day use it is normally run for us as part of git gc.

Pre-Requisites

  • Basic Git Commands

Introduction to Git Prune

The git prune command deletes objects that are not reachable from any reference in the repository. Unreachable objects are objects that are inaccessible from any of the available references, such as branches, tags, and reflog entries. Its primary use is to clean the object database once such leftovers are no longer needed. Files in the working directory and commits that are still reachable are never touched.

Commands

Besides git prune itself, Git has several prune-related commands. They work on stale remote-tracking branches rather than on unreachable objects.

  1. git fetch --prune <remote>: fetches from the remote and, before fetching, deletes the remote-tracking refs (such as origin/feature) whose branches no longer exist on the remote.

  2. git remote prune origin: deletes the stale remote-tracking refs for origin, but does not fetch anything new.

  3. git config --global fetch.prune true: configures Git to prune stale remote-tracking refs automatically on every fetch.
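The following is a minimal sketch of how git fetch --prune behaves, assuming a throwaway bare repository on the local disk stands in for the remote. The directory names remote.git and local, the branch name feature, and the demo identity are made up for illustration.

```shell
set -e
work="$(mktemp -d)"            # scratch directory; all paths below are hypothetical
cd "$work"

# A bare repository on disk plays the role of the remote.
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/master

# Clone it and publish two branches: master and feature.
git clone -q remote.git local 2>/dev/null
cd local
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo User"
echo "hello" > hello.txt
git add hello.txt
git commit -q -m "Initial commit"
git push -q origin master master:feature
git fetch -q origin

# Delete the feature branch on the remote side only.
git -C ../remote.git branch -q -D feature

git branch -r                  # origin/feature is still listed (stale)
git fetch --prune origin       # fetches and drops the stale remote-tracking ref
git branch -r                  # origin/feature is gone

# Optional: make every later fetch in this repository prune automatically.
git config fetch.prune true
```

The sketch sets fetch.prune per repository; adding --global, as in the list above, applies it to all repositories of the current user.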

Options

The git prune command has a short list of options:

  1. -n, --dry-run: does not remove anything and only reports what would be removed. Because pruning can delete objects we still care about, a dry run is used first to review the result. Only if the listed objects are really unwanted is git prune run for real.

  2. -v, --verbose: reports every object that is removed.

  3. --progress: shows progress while the command runs.

  4. --expire <time>: only prunes loose objects that are older than the given time. Newer unreachable objects are kept.
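As a small sketch of these options, the commands below create an unreachable object on purpose and prune it. It assumes a fresh scratch repository; the blob content is arbitrary, and git hash-object -w is used only as a convenient way to write an object that nothing refers to.

```shell
set -e
repo="$(mktemp -d)"            # scratch repository, hypothetical location
cd "$repo"
git init -q

# Write a blob that no commit, index entry, or reflog refers to.
blob="$(echo "scratch data" | git hash-object -w --stdin)"

# Dry run restricted to old objects: the blob is brand new, so nothing is listed.
git prune --dry-run --verbose --expire=2.weeks.ago

# Dry run without --expire: the blob is listed, but not deleted.
git prune --dry-run --verbose
git cat-file -t "$blob"        # still readable

# Real run: the blob is removed from .git/objects.
git prune --verbose
if git cat-file -e "$blob" 2>/dev/null; then
  echo "blob still present"
else
  echo "blob pruned"
fi
```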

Example

Let us simulate a scenario where a commit becomes unreachable. The unreachable commit no longer appears in git log and is then removed with the git prune command.

Creating a New Repository and Initializing it

Let us first begin with creating a repository and adding a few files and some changes to them.

We first turn the folder into a Git repository with the git init command. Next, we create a text file with the echo command and add it to the staging area with the git add command. Lastly, we commit the file to the repository with the git commit command.
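A minimal sketch of this step, assuming a scratch directory and a made-up identity and commit message (the original screenshots are not available, so the exact file content is an assumption):

```shell
set -e
repo="$(mktemp -d)"            # hypothetical scratch location
cd "$repo"

git init -q
git symbolic-ref HEAD refs/heads/master   # keep the branch name used in this article
git config user.email "demo@example.com"
git config user.name "Demo User"

echo "Hello, Git" > hello.txt  # create the text file
git add hello.txt              # stage it
git commit -q -m "Add hello.txt"

git log --oneline              # one commit so far
```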

Modify the File hello.txt and Create a New Commit

Let us now make a few modifications to the file and commit these changes to the repository.

The git status command shows the files that have been modified in the working directory but not yet committed. We have modified the file by adding a line of text, and a new commit is then made to the repository.
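Sketched as commands, and assuming the same kind of scratch repository with illustrative file content and commit messages, the step could look like this:

```shell
set -e
repo="$(mktemp -d)"            # hypothetical scratch location
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo User"
echo "Hello, Git" > hello.txt
git add hello.txt
git commit -q -m "Add hello.txt"

# Modify the file and inspect the state before committing.
echo "A second line" >> hello.txt
git status --short             # hello.txt is reported as modified

# Stage and commit the modification.
git add hello.txt
git commit -q -m "Add a second line to hello.txt"
git status --short             # prints nothing: the working directory is clean
```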

Making commit unreachable from the current branch

Now, we will look into the commit history and make a commit unreachable. The git log command returns the history of the commits made on the repository. The git reset command moves the current branch back to the specified commit. With the --hard option it also resets the staging area and the working directory to that commit, so the changes from the later commits are discarded from the files.

We first make one more commit to the file and then print the log history of the repository. The git log command displays the history of commits, which helps us track the progress of the project.

The log displays the three commits. We then use the reset command with the --hard option to reset the repository back to the second commit. All the commits made after the specified commit become unreachable from the branch.

Thus, the third commit is now unreachable. Because --hard was used, its changes are gone from hello.txt, but the commit object itself is still present in the object database. When we run the git log command again, only the first two commits are listed and the third commit is no longer part of the master branch.
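The sequence can be sketched as follows, assuming a scratch repository with three illustrative commits. The variable name third is introduced here only to remember the hash of the commit that is about to become unreachable.

```shell
set -e
repo="$(mktemp -d)"            # hypothetical scratch location
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo User"

# Three commits, each appending one line to hello.txt.
for n in 1 2 3; do
  echo "line $n" >> hello.txt
  git add hello.txt
  git commit -q -m "Commit $n"
done
git log --oneline              # three commits

third="$(git rev-parse HEAD)"  # remember the third commit
git reset -q --hard HEAD~1     # move master back to the second commit

git log --oneline              # only two commits are listed now
cat hello.txt                  # the third line is gone from the file
git cat-file -t "$third"       # the commit object still exists
```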

Running git prune

Let us finally see how we can remove this commit using Git's internal housekeeping utility, the git prune command.

The git log command no longer has a record of the third commit. When we run the git checkout command with the hash of the unreachable commit, Git reports that we are in the 'detached HEAD' state, i.e. HEAD points directly at a commit that is not on the master branch or on any other branch. Any commits made in this state will not affect the master branch.

As we can see, HEAD is detached and the master branch is separate from it. We run git checkout master to switch back to the master branch.

A warning is displayed saying that the commit we are leaving behind is not connected to any branch.
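A hedged sketch of this detour, assuming the same three-commit scratch repository and the illustrative variable third for the hash of the unreachable commit:

```shell
set -e
repo="$(mktemp -d)"            # hypothetical scratch location
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo User"
for n in 1 2 3; do
  echo "line $n" >> hello.txt
  git add hello.txt
  git commit -q -m "Commit $n"
done
third="$(git rev-parse HEAD)"
git reset -q --hard HEAD~1     # the third commit is now unreachable from master

# Check out the unreachable commit by its hash: HEAD becomes detached.
git checkout "$third"
git symbolic-ref -q HEAD || echo "HEAD is detached"

# Switch back to master; Git warns about the commit being left behind.
git checkout master
git symbolic-ref --short HEAD  # master
```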

Next, let us do a dry run of the git prune command to see what it would remove. With --dry-run and --verbose, the command only prints what is set to be pruned. It doesn't prune anything.

git prune --dry-run --verbose

This command will most likely not print anything. Empty output means that prune would not delete anything. This happens because Git still holds a reference to the commit somewhere, in this case in the reflog. It is also the reason git prune is rarely run on its own and is normally invoked through git gc. This is a good illustration of how hard it is to fully lose data with Git.

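Note: The git reflog command can be used to review the sequence of actions taken. Reflog entries have their own expiration dates (by default 90 days, or 30 days for entries that are not reachable from the current branch tip), after which the objects they protected can be pruned.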

git reflog expire --expire=now --expire-unreachable=now --all

This command force-expires every entry in the reflog, including the most recent ones. It is risky because the reflog is the safety net for recovering lost commits, so it should not be used casually or routinely. With the reflog emptied, nothing refers to the third commit any more, and git prune will now list it in a dry run and remove it in a real run.
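Putting the walkthrough together, the sketch below shows the dry run before and after the reflog is expired, followed by the real prune. It assumes a scratch repository with three illustrative commits; the variable names third, before, and after are introduced only for this example.

```shell
set -e
repo="$(mktemp -d)"            # hypothetical scratch location
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo User"
for n in 1 2 3; do
  echo "line $n" >> hello.txt
  git add hello.txt
  git commit -q -m "Commit $n"
done
third="$(git rev-parse HEAD)"
git reset -q --hard HEAD~1     # the third commit leaves the master branch

# The reflog still refers to the third commit, so the dry run lists nothing.
before="$(git prune --dry-run --verbose)"
echo "before expiring the reflog: '$before'"

# Expire all reflog entries (destructive: removes the recovery safety net).
git reflog expire --expire=now --expire-unreachable=now --all

# Now the dry run lists the objects of the third commit.
after="$(git prune --dry-run --verbose)"
echo "$after"

# Real run: delete them, then confirm the commit is gone.
git prune --verbose
if git cat-file -e "$third" 2>/dev/null; then
  echo "third commit still present"
else
  echo "third commit pruned"
fi
```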

Does Git Remote Prune Origin Delete the Local Branch?

No, git remote prune origin only deletes remote-tracking refs whose branches no longer exist on the remote. Git stores local branches under refs/heads/ and remote-tracking branches for origin under refs/remotes/origin/. Only the refs under refs/remotes/origin/ are pruned, which safely leaves local work in refs/heads/ untouched.
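This can be checked with a small experiment. The sketch assumes a throwaway bare repository on the local disk as the remote; the names remote.git, local, and feature are made up for illustration.

```shell
set -e
work="$(mktemp -d)"            # scratch directory; all paths below are hypothetical
cd "$work"
git init -q --bare remote.git
git -C remote.git symbolic-ref HEAD refs/heads/master

git clone -q remote.git local 2>/dev/null
cd local
git symbolic-ref HEAD refs/heads/master
git config user.email "demo@example.com"
git config user.name "Demo User"
echo "hello" > hello.txt
git add hello.txt
git commit -q -m "Initial commit"
git push -q origin master

# Create a local branch and publish it.
git checkout -q -b feature
git push -q -u origin feature
git checkout -q master

# Someone deletes the branch on the remote.
git -C ../remote.git branch -q -D feature

git remote prune origin        # removes only refs/remotes/origin/feature
git branch -a                  # the local branch feature is still listed
```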

What’s the Difference Between Git Prune, Git Fetch --prune, and Git Remote Prune?

The git remote prune and git fetch --prune commands perform the same task of deleting refs to branches that no longer exist on the remote repository. This is highly desirable in team workflows where remote branches are deleted after they are merged into the master branch. The difference is that git fetch --prune also fetches the latest remote state, while git remote prune only removes the stale refs. The plain git prune command is entirely different from these two: it deletes unreachable objects, such as detached commits, from the local object database and does not touch remote-tracking refs.

Conclusion

  • The git prune command deletes unreachable Git objects from the repository's object database. It does not touch the working directory.
  • Objects become unreachable after operations such as resets, and cleaning them up keeps the repository small. In most cases this happens automatically through git gc.
  • Related commands include git fetch --prune <remote>, which removes stale remote-tracking branches while fetching; git remote prune origin, which removes them without fetching; and git config --global fetch.prune true, which configures Git to prune on every fetch.
  • The git prune command provides a few options, such as -n / --dry-run, -v / --verbose, --progress, and --expire.
  • Pruning an unreachable commit by hand consists of several steps. The first step is to run git log and git reflog to check the actions performed. Next, we do a dry run with git prune --dry-run --verbose to see what would be removed. If nothing is listed because the reflog still refers to the commit, the reflog entries are expired with git reflog expire. Only when the listed objects are really unwanted is git prune executed, with utmost care.