GIT Bootcamp: Branching and Merging

Back to GIT! As a quick recap of what we saw in the first part of our GIT deep dive, I am going to create a brand new repo, add some files and commit everything:

$ mkdir myrepo2
$ cd myrepo2

$ git init
Initialized empty Git repository in /myrepo2/.git/

$ touch file1
$ touch license_agreement
$ touch installer.exe
$ touch hello.lib

$ git status
On branch master
Initial commit
Untracked files:
(use "git add <file>..." to include in what will be committed)

     file1
     hello.lib
     installer.exe
     license_agreement

nothing added to commit but untracked files present (use "git add" to track)

$ git add *

$ git status
On branch master
Initial commit
Changes to be committed:
(use "git rm --cached <file>..." to unstage)

     new file:   file1
     new file:   hello.lib
     new file:   installer.exe
     new file:   license_agreement

$ git commit -m "Creating my project"
[master (root-commit) ac129d8] Creating my project
Committer: Alexandra <alexandra@networkingdom.net>
Your name and email address were configured automatically based
on your username and hostname. Please check that they are accurate.
You can suppress this message by setting them explicitly. Run the
following command and follow the instructions in your editor to edit
your configuration file:
git config --global --edit
After doing this, you may fix the identity used for this commit with:
git commit --amend --reset-author
4 files changed, 0 insertions(+), 0 deletions(-)
create mode 100644 file1
create mode 100644 hello.lib
create mode 100644 installer.exe
create mode 100644 license_agreement

$ git status
On branch master
nothing to commit, working directory clean

$ git log
commit ac129d894e0215d5ced15dfdc3b64ed89a8ce2f6
Author: Alexandra <alexandra@networkingdom.net>
Date:   Fri Jan 15 16:08:09 2016 +0000

Creating my project

Before moving on, I want to get rid of the long and annoying output I get when I commit something, so I am just going to configure my GIT identity. Should’ve done this from the beginning, but it’s never too late.

The GIT identity lives in the GIT config file, so first let’s have a look at it:

$ git config --list
core.repositoryformatversion=0
core.filemode=true
core.bare=false
core.logallrefupdates=true
core.ignorecase=true
core.precomposeunicode=true

Totally uninteresting output for our truly basic needs, so we’re going to ignore it and just notice there is no identity defined at the moment. So, let’s define it:

$ git config --global user.name "Alexandra"
$ git config --global user.email alexandra@networkingdom.net
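
If you would rather not touch your global settings, the identity can also be set for a single repository. A minimal sketch, assuming a throwaway directory created with mktemp (the directory is only for illustration, it is not part of the walkthrough); leaving out --global writes the values to that repository’s own .git/config:

```shell
# Sketch: set the identity for ONE repository only and read it back.
# The temporary directory is an assumption made for this example.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Without --global the values are written to this repo's .git/config.
git config user.name "Alexandra"
git config user.email "alexandra@networkingdom.net"

# --local restricts the lookup to this repository's config file.
name=$(git config --local user.name)
email=$(git config --local user.email)
echo "$name <$email>"
```

A repository-level value wins over the global one, which is handy when one machine is used for both work and personal projects.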

Nice and smooth. Good, now let’s get to the fun part.

Branching

The first question is: why do I need branches? The answer is really simple: because you don’t want to mess up the mainline. We’ve seen that when a commit is executed, GIT actually stores a snapshot of the entire project, so if you modify something in a really wrong way directly on the mainline, you are most definitely not sleeping well tonight 🙂

To really understand the way Git does branching, we need to understand how Git stores its data. We’ve talked about hashes and blobs, but let’s get a bit deeper into this. We know that Git stores data as a series of snapshots, so when you make a commit, Git creates and stores an object called a “commit object”.

This object contains:

  • a pointer towards the snapshot you just took
  • author name and email
  • the commit message
  • pointer(s) to the parent commit(s): a commit object can have
    • 0 parents if it’s the initial commit
    • 1 parent if it’s a regular commit (the parent commit is the commit just before the current commit)
    • 2 or more parents if it’s a commit that results from merging 2 or more branches (we’ll get to this a bit later)
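
The fields listed above can be seen directly with git cat-file -p, which prints the raw content of an object. A small sketch, assuming a throwaway repo and a placeholder identity (both made up for this example):

```shell
# Sketch: print a commit object and count its parent pointers.
# Repo location, identity and file content are illustrative assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

touch file1
git add file1
git commit -q -m "Creating my project"

# Raw commit object: tree pointer, author, committer, then the message.
root_commit=$(git cat-file -p HEAD)
echo "$root_commit"

# The initial commit has no "parent" line at all.
root_parents=$(echo "$root_commit" | grep -c '^parent ' || true)

echo "change" > file1
git commit -q -a -m "Second commit"

# A regular commit carries exactly one "parent" line.
second_parents=$(git cat-file -p HEAD | grep -c '^parent ')
echo "root: $root_parents parent(s), second: $second_parents parent(s)"
```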

Now, let’s take our git repo containing the 4 files we added in our first commit. When we staged the files using git add *, GIT computed a checksum for each of them, stored that version of the file in the repo as a blob, and added the checksum to the staging area. When we committed the staged changes using git commit, Git checksummed the root project directory and stored it as a tree object in the Git repository (if we had had subdirectories, GIT would have checksummed each of them and stored a tree object for each). Git then created the commit object, which holds the metadata and a pointer to the root tree.

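In general, your Git repository now contains one blob for each distinct file content, one tree that lists the contents of the directory and specifies which file names are stored as which blobs, and one commit with the pointer to that root tree and all the commit metadata. In our case all four files are empty, so they share a single blob (identical content always gets the same checksum), and the repository holds just three objects: one blob, one tree and one commit.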

$ git log -p
commit ac129d894e0215d5ced15dfdc3b64ed89a8ce2f6
Author: Alexandra <alexandra@networkingdom.net>
Date:   Fri Jan 15 16:08:09 2016 +0000

Creating my project

diff --git a/file1 b/file1
new file mode 100644
index 0000000..e69de29

diff --git a/hello.lib b/hello.lib
new file mode 100644
index 0000000..e69de29

diff --git a/installer.exe b/installer.exe
new file mode 100644
index 0000000..e69de29

diff --git a/license_agreement b/license_agreement
new file mode 100644
index 0000000..e69de29
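
All four index lines above end in the same e69de29 because all four files are empty, and Git names a blob after its content. A quick sketch to check this, assuming a throwaway repo and that the sha1sum tool from coreutils is installed (an assumption about your system, not something Git needs):

```shell
# Sketch: empty content always hashes to the same blob ID.
# The temporary repo and the use of sha1sum are assumptions for this example.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q

# Hash empty input the way Git does, without storing anything.
empty_blob=$(git hash-object --stdin < /dev/null)
echo "$empty_blob"

# The same ID by hand: SHA-1 over the header "blob 0", a NUL byte, no content.
by_hand=$(printf 'blob 0\0' | sha1sum | cut -d ' ' -f 1)
echo "$by_hand"
```

The first seven characters of both values match the e69de29 shown by git log -p.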

For any subsequent commit, one more piece of information is stored: the pointer(s) to its parent commit(s).

What is a BRANCH?

A branch is simply a lightweight, movable pointer to one of these commits. The default GIT branch is called “master” and is the one you implicitly add commits to. As you add commits, the “master” branch moves forward automatically and always points to the last commit you made on it. The “master” branch is exactly like any other branch. The only reason nearly every repository has one is that the git init command creates it by default.

Creating a new branch

When you create a new branch, you actually create a new pointer. When you add commits to the new branch, the pointer will move automatically to the latest commit on that branch, without affecting the pointer on the master branch.

See it in action

Let’s look at the history of our project:

$ git log
commit ac129d894e0215d5ced15dfdc3b64ed89a8ce2f6
Author: Alexandra <alexandra@networkingdom.net>
Date:   Fri Jan 15 16:08:09 2016 +0000

We have the master branch, with only the initial commit. Let’s make another one, just to create some history.

$ vi license_agreement

$ git status
On branch master
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)

      modified:   license_agreement

no changes added to commit (use "git add" and/or "git commit -a")

$ git commit -a -m "License agreement update"
[master d53840d] License agreement update
1 file changed, 1 insertion(+)

The command git commit -a is simply a shortcut that stages every modified or deleted tracked file before committing, so you can skip git add for those changes (brand new files still need git add). We now have 2 commits on the master branch, and the pointer is now on the last commit. Now, if we create a branch:

$ git branch updates_01/17

Hint: you can name your branches whatever you like, but a descriptive name will save you a lot of guessing later 🙂

We created a branch, now what? First of all, creating a branch means we created a pointer. The updates_01/17 pointer and the master pointer currently both point at the same commit, which is the “License agreement update”. Let’s see the branches we now have (even though we created only one branch, we have 2 because master is a branch as well):

$ git branch
* master
  updates_01/17

The git branch command lists all your branches and marks the one you’re currently working on with an asterisk. GIT knows which one you’re on because it keeps a special pointer called HEAD, which points to your current branch (and, through it, to the last commit on that branch). A really useful command here is git log --decorate:

$ git log --decorate --oneline
d53840d (HEAD -> master, updates_01/17) License agreement…
ac129d8 Creating my project
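
The “HEAD -> master” in that output can also be checked from the command line: git symbolic-ref HEAD prints the branch HEAD is attached to, and git rev-parse resolves a name to a commit ID. A sketch in a throwaway repo (repo location and identity are placeholders for this example):

```shell
# Sketch: HEAD stores the NAME of the current branch, not a commit ID.
# Temporary repo and identity are assumptions for illustration.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
touch file1
git add file1
git commit -q -m "Creating my project"
git branch updates_01/17

head_before=$(git symbolic-ref HEAD)      # the default branch of this repo
git checkout -q updates_01/17
head_after=$(git symbolic-ref HEAD)
echo "$head_before -> $head_after"

# Resolving HEAD to a commit goes through the branch it is attached to.
head_commit=$(git rev-parse HEAD)
branch_commit=$(git rev-parse updates_01/17)
echo "$head_commit"
```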

We notice that just by creating a branch, you don’t automatically switch the HEAD to that branch. To do this you need to check out the branch you want to work on:

$ git checkout updates_01/17
Switched to branch ‘updates_01/17’

$ git branch
  master
* updates_01/17

Cool, now let’s get to the fun stuff. Consider modifying a file from the newly created branch:

$ vi file1

$ cat file1
This is the change I made during the 01/17 updates

$ git status
On branch updates_01/17
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)

      modified:   file1

no changes added to commit (use "git add" and/or "git commit -a")

$ git commit -a -m "Change 1 01/17 – added function to file1"
[updates_01/17 f40239f] Change 1 01/17 – added function to file1
1 file changed, 1 insertion(+)

$ git status
On branch updates_01/17
nothing to commit, working directory clean

The status of the pointers now is:

$ git log --decorate --oneline --graph --all
* f40239f (HEAD -> updates_01/17) Change 1 01/17 – added …
* d53840d (master) License agreement update
* ac129d8 Creating my project

So master keeps pointing to the last commit on the mainline, while HEAD points to the updates_01/17 branch and its last commit. If we now check out master again, we can see HEAD moving to the last commit on the master branch:

$ git checkout master
Switched to branch ‘master’

$ git log --decorate --oneline --graph --all
* f40239f (updates_01/17) Change 1 01/17 – added function to file1
* d53840d (HEAD -> master) License agreement update
* ac129d8 Creating my project

It’s really important to understand that when you are on a branch, the files in your working directory are in the form they were last committed on that branch. So now that we’re back on master, we won’t see the changes we made and committed to file1 on the updates_01/17 branch.

$ cat file1
<empty>

Let’s now modify the same file1 on the master branch, but with different content. This will lead to a conflict later on.

$ vi file1
$ cat file1
These are the changes I created on the master branch

$ git status
On branch master
Changes not staged for commit:
(use "git add <file>..." to update what will be committed)
(use "git checkout -- <file>..." to discard changes in working directory)

      modified:   file1

no changes added to commit (use "git add" and/or "git commit -a")

$ git commit -a -m "Feature update file1 – master"
[master ef62ffa] Feature update file1 – master
1 file changed, 1 insertion(+)

$ git log --decorate --oneline --graph --all
* ef62ffa (HEAD -> master) Feature update file1 – master
| * f40239f (updates_01/17) Change 1 01/17 – added function to f1
|/
* d53840d License agreement update
* ac129d8 Creating my project

We can now see the actual tree that’s being built from commits and branches. Because a branch in Git is in actuality just a tiny file containing the 40-character SHA-1 checksum of the commit it points to, branches are quite cheap to create and destroy.
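
You can look at that pointer yourself. A sketch, assuming a freshly created repo where the branch is still stored as a plain file under .git/refs/heads (Git may later pack refs into .git/packed-refs or use a different ref storage, so treat the path as an assumption, not a guarantee):

```shell
# Sketch: a new branch is a small file holding one commit ID.
# Temporary repo, identity and the loose-ref path are assumptions.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
touch file1
git add file1
git commit -q -m "Creating my project"

git branch updates_01/17

# The slash in the branch name simply becomes a subdirectory.
stored=$(cat .git/refs/heads/updates_01/17)
resolved=$(git rev-parse updates_01/17)
echo "${#stored} characters: $stored"
```

For anything beyond poking around, git rev-parse is the safer way to ask what a branch points to, since it works regardless of how the ref is stored.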

A shortcut to create and switch to a branch at the same time is:

$ git checkout -b test
Switched to a new branch ‘test’
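
Newer Git versions (2.23 and later) also ship git switch, which covers only the branch-switching part of git checkout; git switch -c does the same thing as git checkout -b. A sketch, assuming such a version is installed and using a throwaway repo with a placeholder identity:

```shell
# Sketch: create-and-switch with "git switch -c" (needs Git 2.23 or newer).
# Temporary repo and identity are assumptions for illustration.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
touch file1
git add file1
git commit -q -m "Creating my project"

start=$(git symbolic-ref --short HEAD)    # remember where we started

git switch -q -c test                     # same effect as: git checkout -b test
current=$(git symbolic-ref --short HEAD)
echo "now on: $current"

# A branch can only be deleted while you are on a different one.
git switch -q "$start"
git branch -q -d test
remaining=$(git branch --list test)
```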

So now that branching is clear enough, let’s talk about merging.

Merging

Well, you can all imagine what merging is about. Let’s imagine you created a branch, made whatever changes were needed to the files, and now you want to bring your changes into the mainline. To do that, you need to merge your branch into the master branch. This can be really easy, the “fast-forward” way, if there were no commits on the master branch since you created your branch, or it can result in conflicts if you happen to have modified the same files in the same places both on the mainline and on your branch. In our current state, a merge is clearly going to create a conflict, as we modified file1 on both branches, so we’re going to start with the undesirable case (simply because it’s the most common one 🙂) and then move on to the simpler ones.

Firstly, let’s remove the test branch. GIT refuses to delete the branch you are currently on, so we move back to master before deleting it:

$ git checkout master
Switched to branch ‘master’

$ git branch -d test

Worst case – We have conflicts

Now, still on the master branch, we try to merge the updates_01/17 branch into it:

$ git merge updates_01/17
Auto-merging file1
CONFLICT (content): Merge conflict in file1
Automatic merge failed; fix conflicts and then commit the result.

$ git status
On branch master
You have unmerged paths.
(fix conflicts and run "git commit")

Unmerged paths:
(use "git add <file>..." to mark resolution)

      both modified:   file1

no changes added to commit (use "git add" and/or "git commit -a")

So, GIT stops the merge simply because it can’t decide which change to keep. But you can decide, by editing the conflicting parts of the file and choosing what stays. Git lets you know what you messed up, and where, by marking it in the file:

$ cat file1
<<<<<<< HEAD
These are the changes I created on the master branch
=======
This is the change I made during the 01/17 updates
>>>>>>> updates_01/17

GIT’s problem is that the two branches have different text on the same line. Let’s say we want to keep both changes. They obviously cannot both sit on the same line, so we edit the file, remove GIT’s conflict markers (the <<<<<<<, ======= and >>>>>>> lines) and keep the changes in whichever order we want:

$ cat file1
These are the changes I created on the master branch
This is the change I made during the 01/17 updates

$ git status
On branch master
You have unmerged paths.
(fix conflicts and run "git commit")

Unmerged paths:
(use "git add <file>..." to mark resolution)

      both modified:   file1

no changes added to commit (use "git add" and/or "git commit -a")

$ git commit -a -m "solved conflicts"
[master 333d553] solved conflicts

$ git log --decorate --oneline --graph --all
* e157aae (HEAD -> master) resolved conflicts
|\
| * f40239f (updates_01/17) Change 1 01/17…
* | 868e14f Feature update file1 – master
|/
* d53840d License agreement update
* ac129d8 Creating my project
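
If a conflict shows up and you would rather back out than resolve it on the spot, git merge --abort puts the branch back the way it was before the merge started. A sketch that reproduces the same kind of conflict in a throwaway repo (repo location, identity and commit messages are made up for this example) and then abandons the merge:

```shell
# Sketch: provoke a conflict, list the conflicted files, then abort the merge.
# Temporary repo, identity and messages are assumptions for illustration.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
main=$(git symbolic-ref --short HEAD)

touch file1
git add file1
git commit -q -m "Creating my project"

git checkout -q -b updates_01/17
echo "This is the change I made during the 01/17 updates" > file1
git commit -q -a -m "Change on the branch"

git checkout -q "$main"
echo "These are the changes I created on the master branch" > file1
git commit -q -a -m "Change on the mainline"

# The merge stops with a conflict; "|| true" keeps set -e from ending the script.
git merge updates_01/17 > /dev/null 2>&1 || true

# List only the paths that are still unmerged.
conflicted=$(git diff --name-only --diff-filter=U)
echo "conflicted: $conflicted"

# Give up on the merge and restore the pre-merge state.
git merge --abort
after_abort=$(cat file1)
status_after=$(git status --porcelain)
```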

Once you have merged your changes into master, you can delete the branch:

$ git branch -d updates_01/17
Deleted branch updates_01/17 (was f40239f).

$ git log --decorate --oneline --graph --all
* e157aae (HEAD -> master) resolved conflicts
|\
| * f40239f Change 1 01/17 – added function to file1
* | 868e14f Feature update file1 – master
|/
* d53840d License agreement update
* ac129d8 Creating my project

No conflicts but commits

We mentioned that there could be commits on master after you’ve created the branch and started committing your work on it. So you either need to get the changes that were made on master in the meantime into your branch before merging it, or let git do all the work.

If we let GIT do the merge, it will simply do a 3-way merge between the last commit on master, the last commit on your branch and their common ancestor (the point at which your branch was created).
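
The common ancestor GIT picks for the 3-way merge can be asked for directly with git merge-base. A sketch with a throwaway repo (repo location, identity and file contents are placeholders; the branch name is borrowed from this section), which also checks that the resulting merge commit has two parents:

```shell
# Sketch: find the common ancestor, then let Git do the 3-way merge.
# Temporary repo, identity and file contents are assumptions for illustration.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
main=$(git symbolic-ref --short HEAD)

touch installer.exe license_agreement
git add installer.exe license_agreement
git commit -q -m "Creating my project"
fork_point=$(git rev-parse HEAD)          # the branch is created from here

git checkout -q -b git_does_the_job
echo "description" > installer.exe
git commit -q -a -m "modified installer.exe with description"

git checkout -q "$main"
echo "terms and conditions" > license_agreement
git commit -q -a -m "added terms and conditions statement"

# Third input of the 3-way merge: the best common ancestor of both tips.
base=$(git merge-base "$main" git_does_the_job)
echo "merge base: $base"

# Different files were touched, so the merge completes without conflicts.
git merge -q --no-edit git_does_the_job
parents=$(git cat-file -p HEAD | grep -c '^parent ')
echo "merge commit has $parents parents"
```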

So let’s create a new branch, git_does_the_job, with 2 commits, 819f315 and 1865088:

$ git log --decorate --oneline --graph --all
* 1865088 (HEAD -> git_does_the_job) modified hello.lib line1
* 819f315 modified installer.exe with description
| * 08ad9bf (master) added terms and conditions statement
|/
* e157aae resolved conflicts
|\
| * f40239f Change 1 01/17 – added function to file1
* | 868e14f Feature update file1 – master
|/
* d53840d License agreement update
* ac129d8 Creating my project

In the meantime, we added a commit on the mainline as well, 08ad9bf.
We’ve modified totally different files so that no conflicts would occur. Now let’s switch to master and attempt the merge:

$ git merge git_does_the_job
Merge made by the ‘recursive’ strategy.
hello.lib     | 1 +
installer.exe | 1 +
2 files changed, 2 insertions(+)

$ git log --decorate --oneline --graph --all
* f46510e (HEAD -> master) Merge branch ‘git_does_the_job’
|\
| * 1865088 (git_does_the_job) modified hello.lib line1
| * 819f315 modified installer.exe with description
* | 08ad9bf added terms and conditions statement
|/
*   e157aae resolved conflicts
|\
| * f40239f Change 1 01/17 – added function to file1
* | 868e14f Feature update file1 – master
|/
* d53840d License agreement update
* ac129d8 Creating my project

A new commit is created, f46510e, from the 3 inputs we already defined. Nice and smooth.
The other way to do this, rebasing, will be part of the next chapter 🙂

The easiest way out

This can be described as boring. You create your branch, there are no commits on master in the meantime, you merge your branch and everyone is happy. My tree may not look exactly like yours as I created some more conflicts in the meantime 🙂

$ git checkout -b easy
Switched to a new branch ‘easy’

$ vi hello.lib

$ git commit -a -m "lib entry 4"
[easy 7983c4d] lib entry 4
1 file changed, 1 insertion(+)

$ vi hello.lib

$ git commit -a -m "lib entry 5"
[easy eb29851] lib entry 5
1 file changed, 1 insertion(+)

$ git log --decorate --oneline --graph --all
* eb29851 (HEAD -> easy) lib entry 5
* 7983c4d lib entry 4
* ee13b13 (master) Merge branch ‘git_does_the_job’
|\
| * 127d3f6 one more commit on branch
| * 900ea12 one commit on branch
*   41e2c6f solved other conflicts
|\ \
| |/
| * a28c286 modified installer again
| * de42651 modified license agreement again
*   f46510e Merge branch ‘git_does_the_job’
|\ \
| |/
| * 1865088 modified hello.lib line1
| * 819f315 modified installer.exe with description
* | 08ad9bf added terms and conditions statement
|/
* e157aae resolved conflicts
|\
| * f40239f Change 1 01/17 – added function to file1
* | 868e14f Feature update file1 – master
|/
* d53840d License agreement update
* ac129d8 Creating my project

$ git checkout master
Switched to branch ‘master’

$ git merge easy
Updating ee13b13..eb29851
Fast-forward
hello.lib | 2 ++
1 file changed, 2 insertions(+)
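
A fast-forward is possible exactly when the branch you are on is an ancestor of the branch you are merging, and you can check that up front. A sketch in a throwaway repo (repo location and identity are placeholders; the branch and commit messages mirror the ones above), using git merge --ff-only so that GIT refuses to do anything other than a fast-forward:

```shell
# Sketch: predict a fast-forward, then insist on one with --ff-only.
# Temporary repo and identity are assumptions for illustration.
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"
main=$(git symbolic-ref --short HEAD)

touch hello.lib
git add hello.lib
git commit -q -m "Creating my project"

git checkout -q -b easy
echo "lib entry 4" >> hello.lib
git commit -q -a -m "lib entry 4"
echo "lib entry 5" >> hello.lib
git commit -q -a -m "lib entry 5"

git checkout -q "$main"

# No new commits on the mainline, so it is an ancestor of "easy".
if git merge-base --is-ancestor "$main" easy; then
    can_ff=yes
else
    can_ff=no
fi
echo "fast-forward possible: $can_ff"

# --ff-only never creates a merge commit: it fast-forwards or fails.
git merge -q --ff-only easy
main_tip=$(git rev-parse "$main")
easy_tip=$(git rev-parse easy)
```

After the fast-forward both branch names point to the very same commit, which is why no merge commit shows up in the log.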

Cool. Branching and merging are fun, right? 🙂

Friendly advice: please just take a few hours and play with it, try whatever comes to your mind and see what happens. Don’t wait and do it for the first time on a live project, if you care about your derrière not being kicked…

Cheers!
