GIT Bootcamp: Getting things started with GIT

What is GIT?

GIT is a distributed VCS (Version Control System). In a few words, this means that it is a system that allows you to keep track of changes made to files over time. Because it is distributed, each contributor has a full local copy of the repository, history included, and a shared server is typically used only to exchange changes. Most of the time it’s used in software development, because there is usually a team that works on the same set of files. If it weren’t for GIT (or a similar tool), people would probably overwrite each other’s changes to the code and madness would break loose. Still, even if you’re not a developer, you can use GIT to keep track of your own files and changes, and keep your head clean!

The first time I ran into GIT I had absolutely no idea what it was. For some time, until I had the time to dig into it, I kept a note with commands and a “what does it do” for each command. This is a “don’t do it like this” story. It’s most definitely a mistake to take this path, because you’ll get to the point where you’ll screw things up so badly that nothing will be able to save you! (P.S. Trust me, I know 🙂 ).

What is GitHub?

First of all, GitHub is not GIT. GIT is a system, whilst GitHub is a hub, as simple as that. There is great confusion when these two terms come up in the same sentence!

GitHub is “somewhere in the Internet” (a server farm actually), used for storing the files and file structures organized with the help of GIT. We’ll get back to this in no time, but first let’s see how GIT works.

Install GIT

I won’t go through the installation process, as it is neat and well described here, and you can download all you need from here.

Let’s start with the basics

We said that no memorization of commands will occur today, so if we want to pull that off, we need to start with the most basic thing about GIT: the way it stores files. A repository is the data structure that GIT uses to store files and their history. A really important thing to know is that you usually do not delete anything from the stored history, because each commit builds on the ones before it, and removing one in the middle can mess up everything that depends on it.

What makes GIT different from other VCS is the way GIT thinks about its data. CVS, Perforce, Subversion, Bazaar, etc., see the information they keep as a set of files and the changes made to each file over time. GIT thinks of its data more like a set of snapshots. For each commit (something like saving the entire folder), GIT “photographs” all your files at that moment and stores the reference used to access this photo. If a file hasn’t changed from a previous photo, GIT won’t store the file again, but just a link to the previous identical file it already has.

Source: https://git-scm.com
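To make the “link to the previous identical file” idea a bit more concrete, here is a minimal sketch. It assumes a throwaway repository created with mktemp, and the file names a.txt and b.txt are made up for the illustration. git hash-object prints the ID that GIT derives from a file’s contents; two files with identical contents get the same ID, so GIT only needs to store that content once.

```shell
# Throwaway repository, so nothing here touches real work.
repo=$(mktemp -d)
cd "$repo"
git init -q

# Two different files with identical contents (hypothetical names).
printf 'Hello World!\n' > a.txt
cp a.txt b.txt

# git hash-object prints the object ID GIT derives from the contents.
id_a=$(git hash-object a.txt)
id_b=$(git hash-object b.txt)
echo "a.txt -> $id_a"
echo "b.txt -> $id_b"

# Changing the contents changes the ID.
printf 'Hello World!\n\nThis is GIT\n' > b.txt
id_b_changed=$(git hash-object b.txt)
echo "b.txt (changed) -> $id_b_changed"
```

The first two IDs come out identical, the third one differs.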

Creating a repository

You can use GitHub for your repo, or you can create it locally. Generally, if you just want to track your own files, a local one is more than enough, with a backup saved somewhere safe. For the GitHub process, follow the steps here: https://help.github.com/articles/create-a-repo/

Now, to create a local repo:

  • Create a folder you want to act as the place where you store your files:

$ mkdir myrepo

  • Move to that folder

$ cd myrepo/

  • Initialize the folder as a GIT repo

$ git init
Initialized empty Git repository in /Users/alexandra/myrepo/.git/

First things first, let’s see the status of our repo:

$ git status
On branch master
Initial commit
nothing to commit (create/copy files and use "git add" to track)

Obviously there is nothing in there, so probably the first thing you’ll want to do is create a file (while still in the repo folder):

$ touch file1

Once you have created the file, GIT notices it and quietly waits for you to say whether it should start tracking changes to that file, that is, whether you want to add the file to the repo or not:

$ git status
On branch master
Initial commit
Untracked files:
    (use "git add <file>..." to include in what will be committed)
          file1
nothing added to commit but untracked files present (use "git add" to track)

Presumably we want GIT to start tracking the changes we do to this file, so we add it to the repo:

$ git add file1

If you find yourself lazy when having tons of files to add, you can simply run “git add *” (the shell expands * to every non-hidden file and folder in the current directory). Now that GIT is tracking the file, let’s see what happens if we start modifying it:

$ vi file1

Insert “Hello World!” and save the content (in Vi, press “i” to insert, press “Esc” once you’ve finished, and type “:wq”, write and quit, to save).

$ cat file1
Hello World!

$ git status
On branch master
Initial commit
Changes to be committed:
    (use "git rm --cached <file>..." to unstage)
          new file:   file1
Changes not staged for commit:
    (use "git add <file>..." to update what will be committed)
    (use "git checkout -- <file>..." to discard changes in working directory)
          modified:   file1

The nice thing about GIT is that if you read its output carefully, you might end up understanding something. So, let’s go through what we see: GIT started tracking file1, which is a new file added to the repo. After GIT started tracking the file, the file was modified again. You must now decide whether to keep those changes (stage them with git add, so that GIT takes them into the next commit) or to throw them away (git checkout). Let’s add the changes:

$ git add file1

$ git status
On branch master
Initial commit
Changes to be committed:
    (use "git rm --cached <file>..." to unstage)
          new file:   file1

Awesome! Let’s now make our first commit (Initial commit):

$ git commit -m "This is the initial commit"
[master (root-commit) 3b524c0] This is the initial commit
Committer: Alexandra <alexandra@networkingdom.net>
Your name and email address were configured automatically based
on your username and hostname. Please check that they are accurate.
You can suppress this message by setting them explicitly. Run the
following command and follow the instructions in your editor to edit
your configuration file:
    git config --global --edit
After doing this, you may fix the identity used for this commit with:
    git commit --amend --reset-author
1 file changed, 1 insertion(+)
create mode 100644 file1

Ignoring the first part of the message for now, the rest of the message says that we committed a new file (1 new file) that has 1 new line (1 insertion (+)). The mode 100644 states that this is a regular file (not a symlink for example) and that it hasn’t got execute permissions. If we check the status now, it feels right:

$ git status
On branch master
nothing to commit, working directory clean

Now, let’s see the GIT logs:

$ git log
commit 3b524c0a4db6061460eeebe1c06de8d74ec8a3b3
Author: Alexandra <alexandra@networkingdom.net>
Date:   Tue Jan 12 23:10:31 2016 +0000

              This is the initial commit

What we see after the word “commit” is the SHA-1 hash used to identify that commit. GIT hashes the contents of every tracked file with the SHA-1 algorithm as well, which gives a 40-character hexadecimal ID for the blob (Binary Large Object) in which those contents are stored. When a file’s hash differs from the one GIT last recorded, GIT reports the file as “modified” and the developer can proceed accordingly (for example, by committing the changes to master).
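If you want to look at a blob yourself, the following sketch may help. It assumes a throwaway repository and a placeholder identity (Example User); git ls-tree, git rev-parse and git cat-file are standard GIT commands, but the exact IDs you see depend on the contents you commit.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
# Identity set only for this throwaway repo (placeholder values).
git config user.name "Example User"
git config user.email "user@example.com"

printf 'Hello World!\n' > file1
git add file1
git commit -q -m "This is the initial commit"

# List what the last commit recorded: mode, object type, ID, file name.
git ls-tree HEAD

# Look up the ID of file1's contents as stored in that commit.
blob_id=$(git rev-parse HEAD:file1)

blob_type=$(git cat-file -t "$blob_id")   # object type
blob_text=$(git cat-file -p "$blob_id")   # object contents
echo "type: $blob_type"
echo "contents: $blob_text"
```

The ID stored in the commit is the same one git hash-object file1 reports for the unchanged file on disk, which is how GIT can tell “modified” from “not modified”.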

To help you visualize what we did:

Source: https://git-scm.com

Let’s change our file again to look like this:

$ cat file1
Hello World!

This is GIT

To make sure we did change the file in the way we wanted, we can check the diff:

$ git diff
diff --git a/file1 b/file1
index 980a0d5..0e0e321 100644
--- a/file1
+++ b/file1
@@ -1 +1,3 @@
 Hello World!
+
+This is GIT

The first line is the GIT command that shows the differences for the files tracked by GIT. The second one is the “git diff” header. The a/ and b/ filenames are the same unless a rename/copy is involved, and --git means that the diff is in the “git” diff format. The index line shows the shortened hashes of the pre-image (the version of the file before the given change) and the post-image (the version of the file after the change), followed by the file mode.

Next, there are some weird characters: they follow the format @@ from-file-range to-file-range @@, where from-file-range = -<start line>,<number of lines> and to-file-range = +<start line>,<number of lines>. Start line and number of lines refer to the position and length of the hunk in the pre-image and post-image, respectively. If the number of lines is not shown, it means that it is 1. In our case we have:

@@ -1 +1,3 @@ = @@ -1,1 +1,3 @@
+ marks a line that was added, - would mark a line that was removed.
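As a small illustration of how to read a hunk header, here is a sketch that spells out the line counts. The expand_range helper is something I made up for this example, it is not part of GIT; it only relies on the rule of the unified diff format that a missing line count means 1.

```shell
# expand_range (hypothetical helper): print "start,count" for a range
# such as "1" or "1,3". A range without a count covers exactly 1 line.
expand_range() {
  case "$1" in
    *,*) printf '%s\n' "$1" ;;
    *)   printf '%s,1\n' "$1" ;;
  esac
}

header='@@ -1 +1,3 @@'

# Split the header on whitespace: $1=@@  $2=-1  $3=+1,3  $4=@@
set -- $header

from=$(expand_range "${2#-}")   # strip the leading "-"
to=$(expand_range "${3#+}")     # strip the leading "+"

echo "from-file-range: start,count = $from"
echo "to-file-range:   start,count = $to"
```

Read this way, the hunk starts at line 1 and covers 1 line in the old file, and starts at line 1 and covers 3 lines in the new one.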

Let’s add the file to GIT:

$ git add file1

$ git status
On branch master
Changes to be committed:
    (use "git reset HEAD <file>..." to unstage)
          modified:   file1

And then suddenly feel the urge to modify it again:

$ cat file1
Hello World!

This is not GIT

This time we didn’t add or remove lines, we just modified existing info. Checking status (remember there was no commit between the two changes!):

$ git status
On branch master
Changes to be committed:
    (use "git reset HEAD <file>..." to unstage)
          modified:   file1
Changes not staged for commit:
    (use "git add <file>..." to update what will be committed)
    (use "git checkout -- <file>..." to discard changes in working directory)
          modified:   file1

Consequently, the diff will be:

$ git diff
diff --git a/file1 b/file1
index 0e0e321..4246eab 100644
--- a/file1
+++ b/file1
@@ -1,3 +1,3 @@
 Hello World!
 
-This is GIT
+This is not GIT

The plain git diff command shows the changes in the working directory that have not been staged yet, that is, the difference between the staged version and the file on disk. There is also the --cached (or --staged) option, which shows the differences between the last commit and the staged changes, without taking the unstaged changes into account:

$ git diff --cached
diff --git a/file1 b/file1
index 980a0d5..0e0e321 100644
--- a/file1
+++ b/file1
@@ -1 +1,3 @@
 Hello World!
+
+This is GIT
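If you’d like to replay the three states (last commit, staged, working directory) in one go, here is a sketch under a few assumptions: a throwaway repository, a placeholder identity, and printf standing in for the vi editing sessions. It also uses git diff HEAD, which compares the last commit with the file on disk and therefore covers staged and unstaged changes together.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# State 1: committed
printf 'Hello World!\n' > file1
git add file1
git commit -q -m "This is the initial commit"

# State 2: staged
printf 'Hello World!\n\nThis is GIT\n' > file1
git add file1

# State 3: only in the working directory
printf 'Hello World!\n\nThis is not GIT\n' > file1

unstaged=$(git diff)            # staged version  -> file on disk
staged=$(git diff --cached)     # last commit     -> staged version
both=$(git diff HEAD)           # last commit     -> file on disk

echo "--- git diff";          echo "$unstaged"
echo "--- git diff --cached"; echo "$staged"
echo "--- git diff HEAD";     echo "$both"
```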

Let’s stage the last change:

$ git add *
$ git status
On branch master
Changes to be committed:
    (use "git reset HEAD <file>..." to unstage)
          modified:   file1
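A word of caution about “git add *”: the * is expanded by the shell, not by GIT, and with the usual shell settings it does not match names that start with a dot. The sketch below assumes a throwaway repository and two made-up file names to show the difference from “git add .”, where GIT walks the current directory itself.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

touch visible .hidden     # hypothetical file names

# The shell expands * before GIT runs, skipping names that start with a dot.
git add *
after_star=$(git ls-files)
echo "staged after 'git add *':"
echo "$after_star"

# "." is the current directory; GIT walks it itself, dotfiles included.
git add .
after_dot=$(git ls-files)
echo "staged after 'git add .':"
echo "$after_dot"
```

In our one-file repo both forms stage the same thing, so this only matters once hidden files such as .gitignore show up.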

And commit the final state of the file:

$ git commit -m "Added content to the file1 file"
[master 0e02dca] Added content to the file1 file
Committer: Alexandra <alexandra@networkingdom.net>
Your name and email address were configured automatically based
on your username and hostname. Please check that they are accurate.
You can suppress this message by setting them explicitly. Run the
following command and follow the instructions in your editor to edit
your configuration file:
    git config --global --edit
After doing this, you may fix the identity used for this commit with:
    git commit --amend --reset-author
1 file changed, 2 insertions(+)

There are 2 insertions because the first change added 2 lines and the second change only modified one of those new lines. For GIT it doesn’t matter how many intermediate changes we made; the commit just takes the final staged snapshot.

$ git status
On branch master
nothing to commit, working directory clean

$ git log
commit 0e02dca2c20dd0d390e9c6fde8775169afc5bc46
Author: Alexandra <alexandra@networkingdom.net>
Date:   Tue Jan 12 23:50:11 2016 +0000
Added content to the file1 file

commit 3b524c0a4db6061460eeebe1c06de8d74ec8a3b3
Author: Alexandra <alexandra@networkingdom.net>
Date:   Tue Jan 12 23:10:31 2016 +0000
This is the initial commit

Notice that in the log the commits are ordered from the most recent down to the initial commit. Each has its own hash and message. One very useful option is -p, which shows the differences introduced by each commit (adding -<n>, for example -2, limits the output to the last n commits):

$ git log -p
commit 0e02dca2c20dd0d390e9c6fde8775169afc5bc46
Author: Alexandra <alexandra@networkingdom.net>
Date:   Tue Jan 12 23:50:11 2016 +0000
Added content to the file1 file
diff --git a/file1 b/file1
index 980a0d5..4246eab 100644
--- a/file1
+++ b/file1
@@ -1 +1,3 @@
 Hello World!
+
+This is not GIT

commit 3b524c0a4db6061460eeebe1c06de8d74ec8a3b3
Author: Alexandra <alexandra@networkingdom.net>
Date:   Tue Jan 12 23:10:31 2016 +0000
This is the initial commit
diff --git a/file1 b/file1
new file mode 100644
index 0000000..980a0d5
--- /dev/null
+++ b/file1
@@ -0,0 +1 @@
+Hello World!

Also, the --stat option is useful to see a brief summary of the changes:

$ git log --stat
commit 0e02dca2c20dd0d390e9c6fde8775169afc5bc46
Author: Alexandra <alexandra@networkingdom.net>
Date:   Tue Jan 12 23:50:11 2016 +0000
Added content to the file1 file
file1 | 2 ++
1 file changed, 2 insertions(+)

commit 3b524c0a4db6061460eeebe1c06de8d74ec8a3b3
Author: Alexandra <alexandra@networkingdom.net>
Date:   Tue Jan 12 23:10:31 2016 +0000
This is the initial commit
file1 | 1 +
1 file changed, 1 insertion(+)

More log options:

--oneline
--pretty=short
--pretty=full
--pretty=fuller

The last two also show the committer next to the author.

The author of a commit is the person who originally wrote the work, whereas the committer is the person who last applied it. So, if you send a patch to a project and one of the core members applies it, both of you get credit: you as the author, and the core member as the committer.
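To see the two roles side by side, here is a sketch that creates one commit with a different author and committer. The names and addresses are invented; the GIT_AUTHOR_* and GIT_COMMITTER_* environment variables are read by GIT when it creates a commit, and in git log formats %an stands for the author name and %cn for the committer name.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q

printf 'Hello World!\n' > file1
git add file1

# One commit, two identities (both made up for this example).
GIT_AUTHOR_NAME="Alice Author" GIT_AUTHOR_EMAIL="alice@example.com" \
GIT_COMMITTER_NAME="Carol Committer" GIT_COMMITTER_EMAIL="carol@example.com" \
git commit -q -m "Patch written by Alice, applied by Carol"

# %an = author name, %cn = committer name
credit=$(git log -1 --format='%an | %cn')
echo "$credit"

# The fuller format prints both identities together with their dates.
git log -1 --pretty=fuller
```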

Undoing what GIT knows

Amending a Commit

First of all, to save ourselves and GIT a ton of commit history, if we forgot to add something to a commit, we can amend that commit. For example, we committed but forgot to add a new file to the project:

So we create a file:

$ touch forgot_this_file
$ git status
On branch master
Untracked files:
    (use "git add <file>..." to include in what will be committed)
          forgot_this_file
nothing added to commit but untracked files present (use "git add" to track)

We add the file to GIT:

$ git add *

And amend the last commit:

$ git commit --amend -m "Added contents to the file1 and forgot_this_file file"
[master 8bc9afc] Added contents to the file1 and forgot_this_file file
Date: Tue Jan 12 23:50:11 2016 +0000
Committer: Alexandra <alexandra@networkingdom.net>
Your name and email address were configured automatically based
on your username and hostname. Please check that they are accurate.
You can suppress this message by setting them explicitly. Run the
following command and follow the instructions in your editor to edit
your configuration file:
    git config --global --edit
After doing this, you may fix the identity used for this commit with:
    git commit --amend --reset-author
2 files changed, 2 insertions(+)
create mode 100644 forgot_this_file
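What amending really does is replace the last commit with a new one. The sketch below, again in a throwaway repository with a placeholder identity, checks that the number of commits stays the same while the ID of the last commit changes.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

printf 'Hello World!\n' > file1
git add file1
git commit -q -m "This is the initial commit"

printf 'Hello World!\n\nThis is not GIT\n' > file1
git add file1
git commit -q -m "Added content to the file1 file"

count_before=$(git rev-list --count HEAD)
id_before=$(git rev-parse HEAD)

# The forgotten file goes into the same commit instead of a new one.
touch forgot_this_file
git add forgot_this_file
git commit -q --amend -m "Added contents to the file1 and forgot_this_file file"

count_after=$(git rev-list --count HEAD)
id_after=$(git rev-parse HEAD)
files_in_last_commit=$(git ls-tree --name-only HEAD)

echo "commits before/after: $count_before/$count_after"
echo "last commit before:   $id_before"
echo "last commit after:    $id_after"
echo "files in last commit:"
echo "$files_in_last_commit"
```

Since the commit ID changes, amending is best kept for commits you have not shared with anyone yet.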

Unstaging a change

Say you make a change and stage it (tell GIT to include that change in the next commit), but then you figure out it’s not what you meant and you want the change unstaged (left out of the next commit). Doing it is easy, but be careful with the options.

Let’s change the contents of the recently created file forgot_this_file:

$ vi forgot_this_file
$ cat forgot_this_file
Nice try!

$ git status
On branch master
Changes not staged for commit:
    (use "git add <file>..." to update what will be committed)
    (use "git checkout -- <file>..." to discard changes in working directory)
          modified:   forgot_this_file
no changes added to commit (use "git add" and/or "git commit -a")

We then add the file (stage it):

$ git add *
$ git status
On branch master
Changes to be committed:
    (use "git reset HEAD <file>..." to unstage)
          modified:   forgot_this_file

The nice thing is that GIT actually tells you what to do if you’ve changed your mind :):

$ git reset HEAD forgot_this_file
Unstaged changes after reset:
M   forgot_this_file

M stands for “modified”, and we have our changes unstaged:

$ git status
On branch master
Changes not staged for commit:
    (use "git add <file>..." to update what will be committed)
    (use "git checkout -- <file>..." to discard changes in working directory)
          modified:   forgot_this_file
no changes added to commit (use "git add" and/or "git commit -a")

Also, if you want to discard the changes made to the file, you can do this with the git checkout command:

$ git checkout -- forgot_this_file
$ git status
On branch master
nothing to commit, working directory clean
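The two steps do different things, and it’s easy to mix them up: unstaging keeps your edit on disk, discarding throws it away. Here is a sketch that checks both, assuming a throwaway repository and a placeholder identity. Newer GIT releases (2.23 and later) also offer git restore --staged <file> and git restore <file> for the same two jobs; the sketch sticks to the commands used in this post.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

touch forgot_this_file
git add forgot_this_file
git commit -q -m "Added forgot_this_file"

# Make a change and stage it.
printf 'Nice try!\n' > forgot_this_file
git add forgot_this_file

# Unstage: the staging area goes back to the last commit,
# but the file on disk keeps the edit.
git reset -q HEAD forgot_this_file
after_unstage=$(cat forgot_this_file)
echo "after unstaging, file contains: $after_unstage"

# Discard: the file on disk goes back to the committed version too.
git checkout -- forgot_this_file
after_discard=$(cat forgot_this_file)
left_over=$(git status --porcelain)
echo "after discarding, file contains: '$after_discard'"
echo "pending changes: '$left_over'"
```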

Delete GIT’s memory

It gets more complicated if you want to completely remove a file from the GIT repo and all its history. You have a few ways of doing it (I would recommend just being careful about what you commit, and you’ll be safer 🙂).

If no commit has happened after the commit in which you added the bad file, it’s easy to remove the file completely.

Remove the file:

$ git rm forgot_this_file
rm 'forgot_this_file'
$ git status
On branch master
Changes to be committed:
    (use "git reset HEAD <file>..." to unstage)
          deleted:    forgot_this_file

And amend the commit:

$ git commit --amend
[master 22e1cd4] Added content to the file1 file
Date: Tue Jan 12 23:50:11 2016 +0000
Committer: Alexandra <alexandra@networkingdom.net>
Your name and email address were configured automatically based
on your username and hostname. Please check that they are accurate.
You can suppress this message by setting them explicitly. Run the
following command and follow the instructions in your editor to edit
your configuration file:
    git config --global --edit
After doing this, you may fix the identity used for this commit with:
    git commit --amend --reset-author
1 file changed, 2 insertions(+)

The file is gone from the visible GIT history:

$ git log -p
commit 22e1cd4793ce56fbcda7cc58d791d5fcdc00722e
Author: Alexandra <alexandra@networkingdom.net>
Date:   Tue Jan 12 23:50:11 2016 +0000
Added content to the file1 file
diff --git a/file1 b/file1
index 980a0d5..4246eab 100644
--- a/file1
+++ b/file1
@@ -1 +1,3 @@
 Hello World!
+
+This is not GIT

commit 3b524c0a4db6061460eeebe1c06de8d74ec8a3b3
Author: Alexandra <alexandra@networkingdom.net>
Date:   Tue Jan 12 23:10:31 2016 +0000
This is the initial commit
diff --git a/file1 b/file1
new file mode 100644
index 0000000..980a0d5
--- /dev/null
+++ b/file1
@@ -0,0 +1 @@
+Hello World!
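One caveat worth knowing: the amended commit no longer mentions the file, but the commit it replaced is not wiped from disk right away. It stays in the repository (and can be found through git reflog) until GIT eventually cleans it up. A sketch that checks both halves of that statement, assuming a throwaway repository and a placeholder identity:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

printf 'Hello World!\n' > file1
git add file1
git commit -q -m "This is the initial commit"

printf 'Hello World!\n\nThis is not GIT\n' > file1
touch forgot_this_file
git add file1 forgot_this_file
git commit -q -m "Added content to the file1 file"
old_id=$(git rev-parse HEAD)      # the commit that still contains the file

# Remove the file and rewrite the last commit without it.
git rm -q forgot_this_file
git commit -q --amend --no-edit

# No commit in the current history touches the file any more...
mentions=$(git log --oneline -- forgot_this_file)
echo "commits mentioning the file: '$mentions'"

# ...but the replaced commit is still stored as an object.
old_type=$(git cat-file -t "$old_id")
echo "old commit $old_id is still a: $old_type"
```

So if the file held something sensitive, such as a password, treat it as leaked and change it anyway.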

If other commits have been made since, it takes more work; we’ll see how it’s done when we get to branches.

 

A few more things explained…

A git repository contains, among other things, the following:

  • A set of commit objects.
  • A set of references to commit objects, called heads.

The GIT repository is stored in the same directory as the project itself, in a subdirectory called .git. There is only one .git directory, in the root directory of the project.

A commit object contains three things:

  • A set of files, reflecting the state of a project at a given point in time.
  • References to parent commit objects.
  • An SHA1 name, a 40-character string that uniquely identifies the commit object. The name is composed of a hash of relevant aspects of the commit, so identical commits will always have the same name.

The parent commit objects are those commits that were edited to produce the subsequent state of the project. Generally, a commit object will have one parent commit, because one generally takes a project in a given state, makes a few changes, and saves the new state of the project. Merge commits are the exception: a commit created by a merge has two or more parents.

A project always has one commit object with no parents and this is the first commit made to the project repository.
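You can look inside a commit object with git cat-file -p. The sketch below assumes a throwaway repository with two commits and a placeholder identity; it prints the latest commit object and counts the parent lines of both commits.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

printf 'Hello World!\n' > file1
git add file1
git commit -q -m "This is the initial commit"

printf 'Hello World!\n\nThis is not GIT\n' > file1
git add file1
git commit -q -m "Added content to the file1 file"

# Raw commit object: tree (the set of files), parent, author,
# committer, then the commit message.
git cat-file -p HEAD

# Count the "parent" lines of the latest and of the first commit.
head_parents=$(git cat-file -p HEAD | grep -c '^parent ')
root_parents=$(git cat-file -p HEAD^ | grep -c '^parent ' || true)
echo "parents of the latest commit: $head_parents"
echo "parents of the first commit:  $root_parents"
```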

Heads

A head is simply a reference to a commit object, identified through a name. By default, there is a head in every repository called master. A repository can contain any number of heads.

At any given time, one head is selected as the “current head.” This head is aliased to HEAD, always in capitals.

There is a very big difference between a “head” (lowercase), which may refer to any one of the named heads in the repository, and “HEAD” (uppercase) which refers exclusively to the currently active one.
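The difference is easy to see on disk. In the sketch below (throwaway repository, placeholder identity, and a second head named experiment that I made up), HEAD is a small file that names the current head, while each head resolves to a commit.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

printf 'Hello World!\n' > file1
git add file1
git commit -q -m "This is the initial commit"

# HEAD is a reference to the current head, e.g. "ref: refs/heads/master".
cat .git/HEAD
current=$(git symbolic-ref --short HEAD)

# A second head (hypothetical name) pointing at the same commit.
# Creating it does not make it the current head.
git branch experiment

head_commit=$(git rev-parse HEAD)
current_commit=$(git rev-parse "refs/heads/$current")
other_commit=$(git rev-parse refs/heads/experiment)
still_current=$(git symbolic-ref --short HEAD)

echo "HEAD        -> $head_commit"
echo "$current    -> $current_commit"
echo "experiment  -> $other_commit"
echo "current head is still: $still_current"
```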

Accessing Commits

Now that you’ve created commits, how do you refer to a specific commit? GIT provides many ways to do so. Here are a few:

  • By its SHA1 name, which you can get from git log.
  • By the first few characters of its SHA1 name.
  • By a head. For example, HEAD refers to the commit object referenced by HEAD. You can also use the name, such as master.
  • Relative to a commit. Putting a caret (^) after a commit name retrieves the parent of that commit. For example, HEAD^ is the parent of the current head commit.
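The four ways of naming a commit listed above can be tried out with git rev-parse, which turns any of them into the full SHA1 name. A sketch, assuming a throwaway repository with two commits and a placeholder identity:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

printf 'Hello World!\n' > file1
git add file1
git commit -q -m "This is the initial commit"
first_id=$(git rev-parse HEAD)

printf 'Hello World!\n\nThis is not GIT\n' > file1
git add file1
git commit -q -m "Added content to the file1 file"

full=$(git rev-parse HEAD)                    # full SHA1 name
short=$(git rev-parse --short HEAD)           # first few characters
from_short=$(git rev-parse "$short")          # short name -> full name
branch=$(git symbolic-ref --short HEAD)       # name of the current head
from_branch=$(git rev-parse "$branch")        # head name -> full name
parent=$(git rev-parse 'HEAD^')               # parent of the current commit

echo "HEAD        = $full"
echo "$short     = $from_short"
echo "$branch      = $from_branch"
echo "HEAD^       = $parent"
```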

Let’s say we want to go back to a previous commit. Right now we only have 2 commits, the initial one (A) and a subsequent one (B), and we want everything loaded back as it was in (A):

$ git status
On branch master
nothing to commit, working directory clean

$ git log
commit 22e1cd4793ce56fbcda7cc58d791d5fcdc00722e
Author: Alexandra <alexandra@networkingdom.net>
Date:   Tue Jan 12 23:50:11 2016 +0000
Added content to the file1 file

commit 3b524c0a4db6061460eeebe1c06de8d74ec8a3b3
Author: Alexandra <alexandra@networkingdom.net>
Date:   Tue Jan 12 23:10:31 2016 +0000
This is the initial commit

$ git checkout 3b524c0a4db6061460eeebe1c06de8d74ec8a3b3
Note: checking out '3b524c0a4db6061460eeebe1c06de8d74ec8a3b3'.
You are in 'detached HEAD' state. You can look around, make experimental changes and commit them, and you can discard any commits you make in this state without impacting any branches by performing another checkout. If you want to create a new branch to retain commits you create, you may do so (now or later) by using -b with the checkout command again.

Example:
git checkout -b <new-branch-name>
HEAD is now at 3b524c0... This is the initial commit

$ git status
HEAD detached at 3b524c0
nothing to commit, working directory clean

$ git log
commit 3b524c0a4db6061460eeebe1c06de8d74ec8a3b3
Author: Alexandra <alexandra@networkingdom.net>
Date:   Tue Jan 12 23:10:31 2016 +0000
This is the initial commit
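Nothing is lost while HEAD is detached: the newer commit is still on the branch, and checking the branch out again brings you back. Here is a sketch of the round trip, assuming a throwaway repository with two commits and a placeholder identity.

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

printf 'Hello World!\n' > file1
git add file1
git commit -q -m "This is the initial commit"
first_id=$(git rev-parse HEAD)

printf 'Hello World!\n\nThis is not GIT\n' > file1
git add file1
git commit -q -m "Added content to the file1 file"
branch=$(git symbolic-ref --short HEAD)    # remember where we were

# Go back to the first commit: HEAD now points at a commit, not at a head.
git checkout -q "$first_id"
if git symbolic-ref -q HEAD >/dev/null; then
  state=attached
else
  state=detached
fi
content_detached=$(cat file1)
echo "HEAD is $state, file1 contains: $content_detached"

# Return to the branch: the later commit and its content are back.
git checkout -q "$branch"
back_on=$(git symbolic-ref --short HEAD)
content_back=$(cat file1)
echo "back on $back_on, file1 contains:"
echo "$content_back"
```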

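Another way of doing this is with git reset --hard, a command that’s totally not recommended unless you know what you’re doing, because it moves the current branch back to the given commit and throws away any uncommitted changes in the working directory and the staging area: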

Let’s prepare this:

$ touch A
$ git add *
$ git commit -m "added A"
$ touch B
$ git add *
$ git commit -m "added B"
$ touch C
$ git add *
$ git commit -m "added C"

We created 3 commits:

$ git log
commit 34b2540d3a64a29901f81ec44892fa62d04b0b72
Author: Alexandra <alexandra@networkingdom.net>
Date:   Wed Jan 13 10:23:05 2016 +0000
added C

commit 4a8e184e08b08de63cc0b857c53472ad7d459e9e
Author: Alexandra <alexandra@networkingdom.net>
Date:   Wed Jan 13 10:22:49 2016 +0000
added B

commit eeab9d5dac499eb5ecd55898da961ffce11852c9
Author: Alexandra <alexandra@networkingdom.net>
Date:   Wed Jan 13 10:21:48 2016 +0000
added A

commit 3b524c0a4db6061460eeebe1c06de8d74ec8a3b3
Author: Alexandra <alexandra@networkingdom.net>
Date:   Tue Jan 12 23:10:31 2016 +0000
This is the initial commit

Now we want to go back completely to the commit in which B was added:

$ git reset --hard 4a8e184e08b08de63cc0b857c53472ad7d459e9e

$ git log
commit 4a8e184e08b08de63cc0b857c53472ad7d459e9e
Author: Alexandra <alexandra@networkingdom.net>
Date:   Wed Jan 13 10:22:49 2016 +0000
added B

commit eeab9d5dac499eb5ecd55898da961ffce11852c9
Author: Alexandra <alexandra@networkingdom.net>
Date:   Wed Jan 13 10:21:48 2016 +0000
added A

commit 3b524c0a4db6061460eeebe1c06de8d74ec8a3b3
Author: Alexandra <alexandra@networkingdom.net>
Date:   Tue Jan 12 23:10:31 2016 +0000
This is the initial commit
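The commit that added C has disappeared from the log, and so has the file C itself. If you did this by accident, GIT leaves a small safety net: right after a reset, the special name ORIG_HEAD still points to where the branch was before. A sketch, assuming a throwaway repository and a placeholder identity:

```shell
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"        # placeholder identity
git config user.email "user@example.com"

# Same preparation as above: one commit per file.
for name in A B C; do
  touch "$name"
  git add "$name"
  git commit -q -m "added $name"
done

c_id=$(git rev-parse HEAD)        # commit that added C
b_id=$(git rev-parse 'HEAD^')     # commit that added B

# Go back to B: the branch moves and file C is removed from disk.
git reset -q --hard "$b_id"
if [ -e C ]; then c_after_reset=present; else c_after_reset=missing; fi
echo "after reset, file C is $c_after_reset"

# ORIG_HEAD remembers where the branch pointed before the reset.
orig_id=$(git rev-parse ORIG_HEAD)
echo "ORIG_HEAD -> $orig_id"

# Undo the reset by moving the branch forward to C again.
git reset -q --hard "$c_id"
head_after_undo=$(git rev-parse HEAD)
```

Uncommitted changes are a different story: those are gone for good after a hard reset, so check git status before running it.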

 

In the next post, we’ll discuss branching, forking, rebasing and all sorts of other fun stuff!

Cheers!
