GitSubmoduleTutorial

From Git SCM Wiki
Jump to: navigation, search

OBSOLETE CONTENT

This wiki has been archived and the content is no longer updated. Please visit git-scm.com/doc for up-to-date documentation.

Git Submodule Tutorial

Submodule support has been available in Git since version 1.5.3. This tutorial explains how to create and publish a repository with four submodules using the git-submodule(1) command.

Submodules maintain their own identity; the submodule support just stores the submodule repository location and commit ID, so other developers who clone the superproject can easily clone all the submodules at the same revision.

For the purposes of the tutorial, the public repositories will be published under your home directory in ~/subtut/public. Let's create the four public submodule repositories first:

$ mkdir -p ~/subtut/private
$ mkdir -p ~/subtut/public
$ cd ~/subtut/private
$ for mod in a b c d; do
    mkdir $mod
    cd $mod
    git init
    echo "module $mod" > $mod.txt
    git add $mod.txt
    git commit -m "Initial commit, public module $mod"
    git clone --bare . ~/subtut/public/$mod.git
    git remote add origin ~/subtut/public/$mod.git
    git config branch.master.remote origin
    git config branch.master.merge refs/heads/master
    cd ..
done

Now create the public superproject; we won't actually add the submodules yet.

$ cd ~/subtut/private
$ mkdir super
$ cd super
$ git init
$ echo hi > super.txt
$ git add super.txt
$ git commit -m "Initial commit of empty superproject"
$ git clone --bare . ~/subtut/public/super.git
$ git remote add origin ~/subtut/public/super.git
$ git config branch.master.remote origin
$ git config branch.master.merge refs/heads/master

Check out the superproject somewhere private and add all the submodules (note: it's important to give an absolute path for submodules on the local filesystem).

$ cd ~/subtut/private
$ cd super
$ for mod in a b c d; do git submodule add ~/subtut/public/$mod.git $mod; done
$ ls -a
.  ..  .git  .gitmodules  a  b  c  d  super.txt

The "git submodule add" command does a couple of things:

  • It clones the submodule under the current directory and by default checks out the master branch.
  • It adds the submodule's clone path to the ".gitmodules" file and adds this file to the index, ready to be committed.
  • It adds the submodule's current commit ID to the index, ready to be committed.
$ cat .gitmodules
[submodule "a"]
        path = a
        url = /home/moses/subtut/public/a.git
[submodule "b"]
        path = b
        url = /home/moses/subtut/public/b.git
        ...
$ git status
# On branch master
# Changes to be committed:
#   (use "git reset HEAD <file>..." to unstage)
#
#       new file:   .gitmodules
#       new file:   a
#       new file:   b
#       new file:   c
#       new file:   d
#

Let's take a quick poke around one of the submodule checkouts.

$ cd a
$ ls -a
.  ..  .git  a.txt
$ git branch
* master

It looks just like a regular checkout:

$ cd ..
$ cat .git/config
        ...
        [remote "origin"]
        url = /home/moses/subtut/public/super.git
        fetch = +refs/heads/*:refs/remotes/origin/*
        ...

Commit the superproject and publish it:

$ git commit -m "Add submodules a, b, c, d."
Created commit fc7c350: Add submodules a, b, c, d.
 5 files changed, 16 insertions(+), 0 deletions(-)
 create mode 100644 .gitmodules
 create mode 160000 a
 create mode 160000 b
 create mode 160000 c
 create mode 160000 d
$ git push
$ git submodule init

Now look at it from the perspective of another developer:

$ mkdir ~/subtut/private2
$ cd !$
$ git clone ~/subtut/public/super.git
$ cd super
$ ls -a
.  ..  .git  .gitmodules  a  b  c  d  super.txt

The submodule directories are there, but they're empty:

$ ls -a a
.  ..
$ git submodule status
-d266b9873ad50488163457f025db7cdd9683d88b a
-e81d457da15309b4fef4249aba9b50187999670d b
-c1536a972b9affea0f16e0680ba87332dc059146 c
-d96249ff5d57de5de093e6baff9e0aafa5276a74 d

Pulling down the submodules is a two-step process. First run "git submodule init" to add the submodule repository URLs to .git/config:

$ git submodule init
$ git config -l
...
submodule.a.url=/home/moses/subtut/public/a.git

Now use "git submodule update" to clone the repositories and check out the commits specified in the superproject.

$ git submodule update

The submodule directories have been filled:

$ cd a
$ ls -a
.  ..  .git  a.txt

One major difference between "submodule update" and "submodule add" is that "update" checks out a specific commit, rather than the tip of a branch. It's like checking out a tag: the head is detached, so you're not working on a branch.

$ git branch
* (no branch)
  master

If you want to make a change within a submodule, you should first check out a branch, make your changes, publish the change within the submodule, and then update the superproject to reference the new commit:

$ git branch
* (no branch)
  master
$ git checkout master
$ echo "adding a line again" >> a.txt
$ git commit -a -m "Updated the submodule from within the superproject."
$ git push
$ cd ..
$ git add a        # There is a gotcha here.  Read about it below.
$ git commit -m "Updated submodule a."
$ git show
...
diff --git a/a b/a
index d266b98..261dfac 160000
--- a/a
+++ b/a
@@ -1 +1 @@
-Subproject commit d266b9873ad50488163457f025db7cdd9683d88b
+Subproject commit 261dfac35cb99d380eb966e102c1197139f7fa24
$ git submodule summary HEAD^
* a d266b98...261dfac (1):
  > Updated the submodule from within the superproject.

$ git push

Switch back to the other private checkout; the new change should be visible.

$ cd ~/subtut/private/super
$ git pull
$ git submodule update
$ cat a/a.txt
module a
adding a line again

Gotchas

If you use a forward slash (/) after the submodule name when adding changes to a submodule and updating the container repository to use the latest submodule changes that you have pulled from the remote source:

$ git add submodule/

git will think you want to delete the submodule and want to add all the files in the submodule directory. Please DONT use a forward slash after the submodule name. You must type it like this:

$ git add submodule

Always publish the submodule change before publishing the change to the superproject that references it. If you forget to publish the submodule change, others won't be able to clone the repository:

$ echo i added another line to this file >> a.txt
$ git commit -a -m "doing it wrong this time"
$ cd ..
$ git add a
$ git commit -m "Updated submodule a again."
$ git push
$ cd ~/subtut/private2/super/
$ git pull
$ git submodule update
error: pathspec '261dfac35cb99d380eb966e102c1197139f7fa24' did not match any file(s) known to git.
Did you forget to 'git add'?
Unable to checkout '261dfac35cb99d380eb966e102c1197139f7fa24' in submodule path 'a'

It's not safe to run "git submodule update" if you've made changes within a submodule. They will be silently overwritten:

$ cat a.txt
module a
$ echo line added from private2 >> a.txt
$ git commit -a -m "line added inside private2"
$ cd ..
$ git submodule update
Submodule path 'a': checked out 'd266b9873ad50488163457f025db7cdd9683d88b'
$ cd a
$ cat a.txt
module a

The changes are still visible in the submodule's reflog:

$ git log -g --pretty=oneline
d266b9873ad50488163457f025db7cdd9683d88b HEAD@{0}: checkout: moving to d266b9873ad50488163457f025db7cdd9683d88b
4389b0d8e22e616c88a99ebd072cfebba40797ef HEAD@{1}: commit: line added inside private2
d266b9873ad50488163457f025db7cdd9683d88b HEAD@{2}: checkout: moving to d266b9873ad50488163457f025db7cdd9683d88b

Removal

To remove a submodule you need to:

  1. Delete the relevant line from the .gitmodules file.
  2. Delete the relevant section from .git/config.
  3. Run git rm --cached path_to_submodule (no trailing slash).
  4. Commit the superproject.
  5. Delete the now untracked submodule files.


Personal tools