Git Submodule

Overview

Git submodules let you keep one Git repository inside another as a subdirectory. You can clone another repository into your project while keeping its commits separate from your own. A submodule is simply a reference to another repository at a specific commit, which gives a Git repository a way to incorporate external code and track the exact version it uses.

Pre-requisites

A version control system, also known as a VCS, is one of the most crucial tools every developer should have. Git is one of the most widely used version control systems among developers worldwide.

Git is open-source software, which means it may be adjusted to meet particular needs and is free to use. Additionally, it has a lot of great features that attempt to speed up the project process while also facilitating collaboration between teams and individual developers.

Git is simple to use and understand, but it's also a skill that's in high demand.

You must have:

  • Git installed
  • A GitHub account, or access to a Git repository hosted on another service

What is a Git Submodule?

A Git repository frequently depends on outside code, and there are several ways to include it. One is to copy and paste the external code directly into the main repository. The disadvantage of this approach is that later changes in the upstream repository are not picked up. Another is to use a language's package management system, such as NPM. The disadvantage of this approach is that the package must be installed and its version managed everywhere the code is deployed. Neither approach tracks edits and updates to the external repository itself.

Git submodules are records in a host Git repository that point to a particular commit in a different repository. Submodules are highly static and only keep track of specific commits. They are not updated automatically when the host repository is modified, and they do not track Git refs or branches. The first time a submodule is added to a repository, Git creates a .gitmodules file, which stores the mapping between each submodule's URL and its local directory. If the host repository has more than one submodule, the .gitmodules file contains one entry per submodule.

Use Cases for Git Submodules

Git submodules are typically used when a project grows in complexity and depends on code from another Git repository whose change history you want to keep separate from your own.

Suppose that I want to use a library's source code, but not in its current form. I might wish to alter it, but doing so makes it hard to bring in the library's later changes. For instance, if I copied the codebase into my project, I could not quickly incorporate new updates from the library's repository. In this situation, I would use a Git submodule so that my project depends on the library while my project's history stays independent of the library's.

Commands for Git Submodules

In Git, when you add a submodule, you just add information about the submodule; the submodule's actual code is not committed to the main repository. This information specifies the commit that the submodule points at. By doing this, if the repository for the submodule is modified, the code for the submodule won't be updated immediately. This is advantageous because it avoids unexpected behaviour in the case that your code may not work with the submodule's most recent commit.

Adding a Submodule

The git submodule add command adds a new submodule to an existing repository.

The command takes a URL pointing to a Git repository, for example the repository of a library called awesomelibrary. Git clones the submodule immediately. We can then examine the repository's state with git status.

When you run git status after this action, two entries appear in the Changes to be committed list: the .gitmodules file and the path to the submodule. When you commit these entries and push, the submodule reference is recorded in the origin as well.
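The steps above can be sketched as a small script. It is a minimal sketch, not a definitive workflow: the local awesomelibrary repository stands in for a real remote URL, the host directory name is hypothetical, and protocol.file.allow is set only because recent Git versions refuse to clone submodules from local paths by default.

```shell
set -eu
# Throwaway identity and workspace; nothing here touches a real repository.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# Local stand-in for the external library (assumption: normally a remote URL).
git init -q -b main awesomelibrary
echo "v1" > awesomelibrary/lib.txt
git -C awesomelibrary add lib.txt
git -C awesomelibrary commit -q -m "Initial library commit"

# The host repository that will embed the library.
git init -q -b main host
cd host
git commit -q --allow-empty -m "Initial host commit"

# Add the submodule; Git clones it into ./awesomelibrary right away.
git -c protocol.file.allow=always submodule add "$work/awesomelibrary" awesomelibrary

# Both .gitmodules and the submodule path are staged.
git status --short
cat .gitmodules

# Record the submodule in the host repository's history.
git commit -q -m "Add awesomelibrary submodule"
```

With a real remote, the add step would normally be just git submodule add followed by the URL, and a git push would follow the commit.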

Getting the Submodule's Code

By default, git submodule init copies the mapping from the .gitmodules file into the local ./.git/config file.

This can seem pointless and make the usefulness of git submodule init look doubtful. However, the command also accepts an explicit list of submodule names. This makes it possible to set up a workflow in which only the submodules needed for working on the repository are activated, which helps when a repository contains many submodules but only some of them have to be retrieved for the task at hand.
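As a rough illustration of selective initialisation, the sketch below builds a host repository with two submodules and then activates only one of them in a fresh clone. The names liba, libb, host and checkout are hypothetical, and the local paths and the protocol.file.allow setting are only there so that the demo runs without a network.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# Two stand-in libraries and a host repository that embeds both.
git init -q -b main host
git -C host commit -q --allow-empty -m "Initial host commit"
for name in liba libb; do
  git init -q -b main "$name"
  echo "$name" > "$name/lib.txt"
  git -C "$name" add lib.txt
  git -C "$name" commit -q -m "Initial $name commit"
  git -C host -c protocol.file.allow=always submodule add "$work/$name" "$name"
done
git -C host commit -q -m "Add two submodules"

# A plain clone leaves both submodule directories empty.
git clone -q host checkout
cd checkout

# Register only liba in .git/config, then fetch just that submodule.
git submodule init liba
git config --get submodule.liba.url
git -c protocol.file.allow=always submodule update liba
```

After this, liba is populated while libb stays an empty directory until it is initialised as well.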

Git Submodule Update

The git submodule update command brings the project's submodules to the state recorded by the parent repository.

It clones any submodules that are missing, fetches new commits where needed, and updates the directory tree. When the --init flag is added, there is no need to run git submodule init separately. The --recursive option tells Git to do the same for nested submodules, that is, submodules inside submodules.
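The following sketch shows --init and --recursive together on a nested layout. The repository names inner, outer, host and checkout are hypothetical, and the local paths with protocol.file.allow replace what would normally be remote URLs.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# inner is a submodule of outer, and outer is a submodule of host.
git init -q -b main inner
echo "inner" > inner/inner.txt
git -C inner add inner.txt
git -C inner commit -q -m "Initial inner commit"

git init -q -b main outer
echo "outer" > outer/outer.txt
git -C outer add outer.txt
git -C outer commit -q -m "Initial outer commit"
git -C outer -c protocol.file.allow=always submodule add "$work/inner" inner
git -C outer commit -q -m "Add inner submodule"

git init -q -b main host
git -C host commit -q --allow-empty -m "Initial host commit"
git -C host -c protocol.file.allow=always submodule add "$work/outer" outer
git -C host commit -q -m "Add outer submodule"

# A plain clone has an empty outer/ directory.
git clone -q host checkout
cd checkout

# One command initialises, clones and checks out every level.
git -c protocol.file.allow=always submodule update --init --recursive
```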

Git Submodule Status

Run the git submodule status command to check the submodules' status.

Each submodule's path is listed along with the SHA-1 in the command's output.

There are three different prefixes for the SHA-1 string.

  • An uninitialized submodule is identified by the - prefix.
  • The + prefix indicates that the commit checked out in the submodule differs from the commit recorded in the containing repository.
  • The U prefix signals merge conflicts.
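To see the - and + prefixes (and the blank prefix of a submodule that is in sync), a sketch like the following can be run. The repository names are illustrative and the library is a local stand-in for a remote repository.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# Library with two commits, embedded in a host repository.
git init -q -b main awesomelibrary
echo "v1" > awesomelibrary/lib.txt
git -C awesomelibrary add lib.txt
git -C awesomelibrary commit -q -m "First library commit"
echo "v2" > awesomelibrary/lib.txt
git -C awesomelibrary commit -q -am "Second library commit"

git init -q -b main host
git -C host commit -q --allow-empty -m "Initial host commit"
git -C host -c protocol.file.allow=always submodule add "$work/awesomelibrary" awesomelibrary
git -C host commit -q -m "Add awesomelibrary submodule"

git clone -q host checkout
cd checkout

# Not initialised yet: the first column is "-".
before=$(git submodule status | cut -c1)

# Initialised and on the recorded commit: the first column is a space.
git -c protocol.file.allow=always submodule update --init
synced=$(git submodule status | cut -c1)

# Checked out on a different commit than the one recorded: "+".
git -C awesomelibrary checkout -q HEAD~1
moved=$(git submodule status | cut -c1)

printf 'before=[%s] synced=[%s] moved=[%s]\n' "$before" "$synced" "$moved"
```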

Git Submodule Deinit

You can unregister a submodule by running the git submodule deinit command followed by the submodule's path.

The command deletes the section of the .git/config file that belongs to the submodule, together with the contents of the submodule directory.
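A minimal sketch of unregistering a submodule, again using a local stand-in library and hypothetical repository names:

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

git init -q -b main awesomelibrary
echo "v1" > awesomelibrary/lib.txt
git -C awesomelibrary add lib.txt
git -C awesomelibrary commit -q -m "Initial library commit"

git init -q -b main host
cd host
git commit -q --allow-empty -m "Initial host commit"
git -c protocol.file.allow=always submodule add "$work/awesomelibrary" awesomelibrary
git commit -q -m "Add awesomelibrary submodule"

# Before: the submodule is registered in .git/config and populated.
git config --get submodule.awesomelibrary.url
ls awesomelibrary

# Unregister it: the config section goes away and the directory is emptied.
git submodule deinit awesomelibrary
ls -A awesomelibrary
```

Note that the entry in .gitmodules and the recorded commit stay in place, so the submodule can be brought back later with git submodule update --init.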

How to Update Git Submodules?

A developer quickly becomes used to the push and pull interactions needed to obtain the most recent source code from the master branch or the most recent commit on the development branch. Git submodule updates, however, are a little more challenging because the commit that the parent repository refers to isn't always the most recent code on the submodule's branch.

Steps to Update Git Submodules

To keep your workspace's Git submodules up to date with the latest server commits, follow these steps:

  • If you haven't already, clone the remote repository.
  • Run the git submodule update --remote command.
  • Add the updated submodule reference to the Git index with git add.
  • Run git commit.
  • Push back to origin.

Update Git Submodules Example

On your local computer, the example amounts to running git submodule update --remote in the cloned repository, staging the submodule path with git add, committing, and pushing.
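The five steps can be sketched end to end as follows. This is an illustrative setup rather than a real project: origin.git is a local bare repository playing the role of the server, awesomelibrary is a local stand-in for the upstream library, and seed and checkout are hypothetical directory names.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# Setup: a library, a parent project that embeds it, and a bare "origin".
git init -q -b main awesomelibrary
echo "v1" > awesomelibrary/lib.txt
git -C awesomelibrary add lib.txt
git -C awesomelibrary commit -q -m "Library v1"
git init -q -b main seed
git -C seed commit -q --allow-empty -m "Initial commit"
git -C seed -c protocol.file.allow=always submodule add "$work/awesomelibrary" awesomelibrary
git -C seed commit -q -m "Add awesomelibrary submodule"
git clone -q --bare seed origin.git

# Upstream moves on after the parent recorded v1.
echo "v2" > awesomelibrary/lib.txt
git -C awesomelibrary commit -q -am "Library v2"

# 1. Clone the remote repository together with its submodules.
git -c protocol.file.allow=always clone -q --recurse-submodules origin.git checkout
cd checkout
# 2. Move the submodule to the newest commit on its remote branch.
git -c protocol.file.allow=always submodule update --remote
# 3. Stage the changed submodule reference.
git add awesomelibrary
# 4. Commit it in the parent repository.
git commit -q -m "Update awesomelibrary to the latest upstream commit"
# 5. Push back to origin.
git push -q origin main
```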

Working with Repositories that Contain Submodules

Cloning a submodule-containing repository: Use the --recurse-submodules flag (or its older alias --recursive) with git clone to clone a repository together with all of its submodules.

Downloading multiple submodules at once: A repository could contain a lot of submodules, so downloading them one after another might take some time. Because of this, the clone and submodule update commands support the --jobs argument to fetch several submodules in parallel.
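A small sketch of a recursive clone that fetches two submodules in parallel. The names liba, libb, host and checkout are hypothetical, and local paths are used in place of remote URLs.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# A host repository with two submodules.
git init -q -b main host
git -C host commit -q --allow-empty -m "Initial host commit"
for name in liba libb; do
  git init -q -b main "$name"
  echo "$name" > "$name/lib.txt"
  git -C "$name" add lib.txt
  git -C "$name" commit -q -m "Initial $name commit"
  git -C host -c protocol.file.allow=always submodule add "$work/$name" "$name"
done
git -C host commit -q -m "Add two submodules"

# Clone the host and all submodules, fetching up to two submodules at a time.
git -c protocol.file.allow=always clone -q --recurse-submodules --jobs 2 host checkout
ls checkout/liba checkout/libb
```

In an existing clone, the equivalent would be git submodule update --init --recursive --jobs 2.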

Pulling with submodules: Once the submodules are configured, you can update the repository as usual using fetch/pull. Pass the --recurse-submodules parameter to git pull to pull everything, including the submodules. To additionally move the submodules to the latest commits on their remotes, run git submodule update --remote.
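The sketch below shows a pull that also brings the submodule's working tree up to date after the parent repository has moved its submodule reference. The repositories are local stand-ins with hypothetical names.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

git init -q -b main awesomelibrary
echo "v1" > awesomelibrary/lib.txt
git -C awesomelibrary add lib.txt
git -C awesomelibrary commit -q -m "Library v1"

git init -q -b main host
git -C host commit -q --allow-empty -m "Initial host commit"
git -C host -c protocol.file.allow=always submodule add "$work/awesomelibrary" awesomelibrary
git -C host commit -q -m "Add awesomelibrary submodule"

# A collaborator's clone, made while the submodule still points at v1.
git -c protocol.file.allow=always clone -q --recurse-submodules host checkout

# Upstream releases v2 and the host repository records the new commit.
echo "v2" > awesomelibrary/lib.txt
git -C awesomelibrary commit -q -am "Library v2"
git -C host -c protocol.file.allow=always submodule update --remote
git -C host add awesomelibrary
git -C host commit -q -m "Bump awesomelibrary to v2"

# One pull updates the parent and checks out the new submodule commit.
cd checkout
git -c protocol.file.allow=always pull -q --recurse-submodules
cat awesomelibrary/lib.txt
```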

Executing a command on every submodule: The git submodule foreach command runs any shell command in each submodule, and its --recursive argument extends execution to nested subprojects. For example, git submodule foreach 'git reset --hard' resets every submodule.
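A minimal sketch of the reset example, assuming a local stand-in library and a hypothetical host repository:

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

git init -q -b main awesomelibrary
echo "v1" > awesomelibrary/lib.txt
git -C awesomelibrary add lib.txt
git -C awesomelibrary commit -q -m "Initial library commit"

git init -q -b main host
cd host
git commit -q --allow-empty -m "Initial host commit"
git -c protocol.file.allow=always submodule add "$work/awesomelibrary" awesomelibrary
git commit -q -m "Add awesomelibrary submodule"

# Make an uncommitted local edit inside the submodule.
echo "local experiment" >> awesomelibrary/lib.txt

# Run "git reset --hard" in every submodule, including nested ones.
git submodule foreach --recursive 'git reset --quiet --hard'
cat awesomelibrary/lib.txt
```

Be aware that a hard reset discards uncommitted work in every submodule, so it is best tried on a scratch copy first.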

Creating Repositories with Submodules

Adding a Submodule to a Git Repository and Tracking a Branch

When adding a submodule, the -b argument of the git submodule add command specifies which branch should be tracked. The git submodule init command then creates the local configuration for the submodules if it does not already exist.
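As a sketch, the branch can be given at add time and then read back from .gitmodules. The branch name main and the repository names are assumptions made for this demo.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

git init -q -b main awesomelibrary
echo "v1" > awesomelibrary/lib.txt
git -C awesomelibrary add lib.txt
git -C awesomelibrary commit -q -m "Initial library commit"

git init -q -b main host
cd host
git commit -q --allow-empty -m "Initial host commit"

# -b records the branch that "git submodule update --remote" should follow.
git -c protocol.file.allow=always submodule add -b main "$work/awesomelibrary" awesomelibrary
git commit -q -m "Add awesomelibrary submodule tracking main"

# The tracked branch is stored alongside the path and URL.
git config -f .gitmodules submodule.awesomelibrary.branch
```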

Adding a Submodule and Tracking Commits

Submodules are still added with the git submodule add command, but as an alternative to tracking a branch, you can decide exactly which submodule commit should be used. In this case, the parent Git repository records the configured commit for each submodule, and a submodule update checks out that exact revision from the submodule's Git repository. You typically do this after pulling a change in the parent repository that moves the recorded revision: fetch the most recent changes, then update the submodule to check out the revision the parent repository now refers to.

The same update is useful when you want to return to the commit that the parent repository is tracking after experimenting with other branches or tags in the submodule. To change the commit that is tracked, check out the desired commit inside the submodule and commit the change in the parent repository.
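The sketch below pins a submodule to a tagged commit, wanders off to another branch, and then returns to the pinned commit. The tag v1 and the repository names are hypothetical.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

# Library with a tagged first commit and a newer second commit.
git init -q -b main awesomelibrary
echo "v1" > awesomelibrary/lib.txt
git -C awesomelibrary add lib.txt
git -C awesomelibrary commit -q -m "Library v1"
git -C awesomelibrary tag v1
echo "v2" > awesomelibrary/lib.txt
git -C awesomelibrary commit -q -am "Library v2"

git init -q -b main host
cd host
git commit -q --allow-empty -m "Initial host commit"
git -c protocol.file.allow=always submodule add "$work/awesomelibrary" awesomelibrary
git commit -q -m "Add awesomelibrary submodule"

# Pin the submodule: check out the wanted commit, then commit in the parent.
git -C awesomelibrary checkout -q v1
git add awesomelibrary
git commit -q -m "Pin awesomelibrary to v1"

# Experiment on another branch inside the submodule...
git -C awesomelibrary checkout -q main

# ...then return to the commit the parent repository tracks.
git submodule update
git -C awesomelibrary rev-parse HEAD
```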

Updating Which Commit You are Tracking

The primary repository establishes the relevant state for the submodules. If you commit to your main repository, this commit also defines the status of the submodule.

The submodule repository, which is nested within the main repository, keeps track of its own content, while the main repository only refers to a commit of the nested submodule repository. With the git submodule update command, you can set the submodule's Git repository to that commit.

In other words, git submodule update sets the submodules to the commits recorded by the main repository. This means that any time you pull new changes into a submodule, you must create a new commit in your main Git repository to record the change to the nested submodule.

To update a submodule to the most recent commit on its master branch, check out master inside the submodule and pull, then return to the main repository, stage the submodule path, and commit.

Example:
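A sketch of that sequence is shown here, assuming the library's default branch is called master and using a local stand-in for the remote library; the directory names are illustrative.

```shell
set -eu
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com
work=$(mktemp -d)
cd "$work"

git init -q -b master awesomelibrary
echo "v1" > awesomelibrary/lib.txt
git -C awesomelibrary add lib.txt
git -C awesomelibrary commit -q -m "Library v1"

git init -q -b main host
git -C host commit -q --allow-empty -m "Initial host commit"
git -C host -c protocol.file.allow=always submodule add "$work/awesomelibrary" awesomelibrary
git -C host commit -q -m "Add awesomelibrary submodule"

# Upstream gains a new commit on master.
echo "v2" > awesomelibrary/lib.txt
git -C awesomelibrary commit -q -am "Library v2"

# Inside the submodule: switch to master and pull the latest commit.
cd host/awesomelibrary
git checkout -q master
git pull -q --ff-only

# Back in the main repository: record the new submodule commit.
cd ..
git add awesomelibrary
git commit -q -m "Update awesomelibrary to latest master"
git submodule status
```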

Conclusion

  • Git submodules are a powerful way to use Git itself for managing external dependencies.
  • Git submodules are an advanced feature with a learning curve for teammates, so weigh the advantages and disadvantages before adopting them.
  • If you have collaborators, make sure everyone who works with the submodule downloads and updates its contents so they have the most recent version. This is crucial because updates one person makes to the submodule's repository do not appear for collaborators until they update.
  • The main repository records only the commit each submodule points to, not the changes made inside it. To commit changes to a submodule, you must switch to the submodule's directory.