Git Tutorial

Git is a version control system that is also used for collaborative software development.

Git was originally authored by Linus Torvalds, who also created the Linux kernel, which he began writing while studying at the University of Helsinki.

Terminology

A pointer points to some address in memory. A pointer is made up of:

  1. a name

  2. a location

  3. a type

A commit object contains a pointer to the snapshot of the content you staged. A commit also contains the

  1. author's name

  2. email address

  3. the commit message

  4. pointers to the commit or commits that directly came before this commit (its parent or parents)

Git Sections

Every time you commit, you save the state of your project, and Git stores a reference to that snapshot.

Every commit in Git has an associated checksum. Checksums are used to ensure the integrity of your snapshot after it has been committed to Git's internal filesystem.

The checksum is calculated with a hash function (SHA-1 by default), so its value depends on the contents of the files in your commit.

Checksums also reveal tampering: if a commit has been altered, it will produce a different checksum.

It's impossible to change the contents of any file or directory without Git knowing about it.

Example Hash
24b9da6552252987aa493b52f8696cd6d3b00373
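As a small illustration of how a checksum follows content, the sketch below asks Git to hash three inputs with git hash-object. The input strings are arbitrary examples, and the variable names are only for illustration.

```shell
# `git hash-object --stdin` prints the object ID Git would assign to the input.
# It works outside a repository and writes nothing to disk.
first=$(printf 'hello\n' | git hash-object --stdin)
again=$(printf 'hello\n' | git hash-object --stdin)
changed=$(printf 'hello!\n' | git hash-object --stdin)

echo "original: $first"
echo "repeated: $again"    # same content, same checksum
echo "modified: $changed"  # one extra character, different checksum
```

Identical content always produces the same ID, and any change to the content produces a different one.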

A branch in Git is simply a lightweight movable pointer to one of these commits as defined above.

Git keeps a special pointer called HEAD to indicate which branch you're currently on.

The default branch name in Git is master, unless you have configured a different one with the init.defaultBranch setting.

As you start making commits, you're given a master branch that points to the last commit you made.

Every time you commit, the master branch pointer moves forward automatically.
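The following sketch shows the branch pointer and HEAD in action. It assumes a throwaway repository created with mktemp; the file name, user name and email are placeholders, and the branch is explicitly named master so the result does not depend on local configuration.

```shell
# Throwaway repository; identity values are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "First commit"
first=$(git rev-parse master)             # commit that master points to now

echo "two" >> notes.txt
git add notes.txt
git commit -q -m "Second commit"
second=$(git rev-parse master)            # master has moved to the new commit

current_branch=$(git symbolic-ref --short HEAD)   # the branch HEAD refers to
parent=$(git rev-parse 'master^')                 # parent of the newest commit

echo "HEAD is on: $current_branch"
echo "master moved from $first to $second"
```

The parent of the second commit is the first commit, which is how Git links snapshots into a history.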

Git Initial Set-up

Once Git is installed, you should set up your identity, which is recorded in every commit you make:

$ git config --global user.name "Bryan Paget"
$ git config --global user.email bryanpaget@pm.me

If you care about your text editor, please set it now:

$ git config --global core.editor emacs

You can check your settings:

$ git config --list
$ git config user.name

If you need help:

$ git help <verb>
$ git <verb> --help
$ git <verb> -h  # quick reference, not the manpage
$ man git-<verb>

To initialize a new git repository:

$ git init

Initialized empty Git repository in /var/home/bryan/Projects/Tutorials/

The above command will create the .git directory, which stores the snapshot information along with the reference/pointer/branch information.

If git status is too vague and you want to know exactly what was changed, not just which files were changed, you can use the git diff command.

$ git status
$ git diff

The git diff command compares what is in your working directory with what is in your staging area. The result tells you the changes you've made that you haven't yet staged.

If you want to see what you've staged that will go into your next commit, you can use git diff --staged. This command compares your staged changes to your last commit:

$ git diff --staged

Staged

git add <file> is a multipurpose command. Use it to

  1. begin tracking new files,

  2. stage modified files, and

  3. do other things, like marking merge-conflicted files as resolved.

It may be helpful to think of it more as "add precisely this content to the next commit" rather than "add this file to the project".

$ git add README.md
$ git status
$ git add hello.txt

If you stage a file and then change it, you will need to stage it again. Otherwise the next commit will contain only the version you staged, not your later changes.

$ git status

There is also a short version of git status.

$ git status -s

New files that aren't tracked have a ?? next to them, new files that have been added to the staging area have an A, modified files have an M and so on.
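To make the status codes concrete, here is a minimal sketch that walks one file through the untracked, staged, and staged-then-modified states. It assumes a throwaway repository and reuses hello.txt as an example file name.

```shell
# Throwaway repository for the demonstration.
repo=$(mktemp -d)
cd "$repo"
git init -q

echo "draft" > hello.txt
untracked=$(git status -s)   # new file, not yet tracked

git add hello.txt
staged=$(git status -s)      # new file, added to the staging area

echo "more" >> hello.txt
restage=$(git status -s)     # staged, then modified again in the working directory

echo "$untracked"   # ?? hello.txt
echo "$staged"      # A  hello.txt
echo "$restage"     # AM hello.txt
```

The left column describes the staging area and the right column describes the working directory, so AM means "added to the index, then modified without being staged again".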

Committing

Now that your staging area is set up the way you want it, you can commit your changes.

$ git commit

The above command will launch your preferred text editor where you will be able to write a commit message for yourself and others who may be contributing to your project. The point is to remind yourself what you did, so your commit message should be meaningful.

$ git commit -m "I can't remember what I did."

When you pass the -m option to git commit, you bypass the text editor and the string following -m is taken as your commit message.

As the message indicates, it can be easy to forget what you did. Git has a useful feature for jogging your memory:

$ git commit -v

When you pass the -v option to git commit, Git puts the diff of your change in the editor so you can see exactly what changes you're committing.

NB: Remember that the commit records the snapshot you set up in your staging area. Anything you didn't stage is still sitting there modified; you can do another commit to add it to your history. Every time you perform a commit, you're recording a snapshot of your project that you can revert to or compare to later.
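A short sketch of that point: the file is staged, edited again, and then committed. It assumes a throwaway repository; the identity values and file contents are placeholders.

```shell
# Throwaway repository; identity values are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "first version" > hello.txt
git add hello.txt                      # stage the first version
echo "second version" > hello.txt      # edit again without re-staging
git commit -q -m "Add hello.txt"

committed=$(git show HEAD:hello.txt)   # content recorded in the commit
on_disk=$(cat hello.txt)               # content in the working directory
pending=$(git status -s)               # the later edit is still unstaged

echo "committed: $committed"
echo "on disk:   $on_disk"
echo "status:   $pending"
```

The commit holds "first version" while the working directory still holds "second version", which shows up as a modified, unstaged file.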

.gitignore

You can store glob patterns (shell-style wildcards, not regular expressions) in a file named .gitignore to match files you never want Git to track.

Each project can have its own .gitignore file placed in the project's root directory.

You can also create a global ignore file, for example .gitignore_global in the root of your home directory, and point Git at it with git config --global core.excludesFile ~/.gitignore_global. There are many templates on the web, including some very thorough examples for specific programming languages on github.com.

$ cat .gitignore
*.[oa]
*~

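The rules for the patterns in a .gitignore file are as follows:

  1. Blank lines and lines starting with # are ignored.

  2. Standard glob patterns work, and are applied recursively throughout the working tree.

  3. A pattern that starts with a forward slash (/) matches only relative to the directory containing the .gitignore file.

  4. A pattern that ends with a forward slash (/) matches only directories.

  5. A pattern that starts with an exclamation point (!) negates an earlier match.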

The following is an example .gitignore file:

# ignore all .a files
*.a
# but do track lib.a, even though you're ignoring .a files above
!lib.a
# only ignore the TODO file in the current directory, not subdir/TODO
/TODO
# ignore all files in any directory named build
build/
# ignore doc/notes.txt, but not doc/server/arch.txt
doc/*.txt
# ignore all .pdf files in the doc/ directory and any of its subdirectories
doc/**/*.pdf

NOTE: you can also place .gitignore files in subdirectories of a project and they will take effect on that directory and all of its subdirectories.
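One way to check how patterns behave is git check-ignore, which reports whether a path would be ignored. The sketch below assumes a throwaway repository with three of the patterns from the example above; is_ignored is a hypothetical helper defined only for this demonstration.

```shell
# Throwaway repository with a small .gitignore.
repo=$(mktemp -d)
cd "$repo"
git init -q
printf '%s\n' '*.a' '!lib.a' '/TODO' > .gitignore
mkdir subdir
touch foo.a lib.a TODO subdir/TODO

# Hypothetical helper: print "ignored" or "kept" for a single path.
# `git check-ignore -q` exits 0 when the path is ignored and 1 when it is not.
is_ignored() {
    if git check-ignore -q "$1"; then
        echo "ignored"
    else
        echo "kept"
    fi
}

echo "foo.a:       $(is_ignored foo.a)"        # matches *.a
echo "lib.a:       $(is_ignored lib.a)"        # re-included by !lib.a
echo "TODO:        $(is_ignored TODO)"         # matches /TODO
echo "subdir/TODO: $(is_ignored subdir/TODO)"  # /TODO is anchored to the root
```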

Removing files from Git

To remove a file from Git, you have to remove it from your tracked files (more accurately, remove it from your staging area) and then commit. The git rm command does that, and also removes the file from your working directory so you don't see it as an untracked file the next time around.

If you simply remove the file from your working directory, it shows up under the "Changes not staged for commit" (that is, unstaged) area of your git status output. Use git rm to remove the file and stage the removal in one step:

$ git rm README.md

But then you still have to commit the change to remove the file.
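The full sequence might look like the sketch below, which assumes a throwaway repository containing a single committed README.md; the identity values are placeholders.

```shell
# Throwaway repository; identity values are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "# Demo" > README.md
git add README.md
git commit -q -m "Add README.md"

git rm -q README.md                  # delete the file and stage the deletion
after_rm=$(git status -s)            # "D  README.md": deletion is staged
git commit -q -m "Remove README.md"  # record the removal in history
tracked=$(git ls-files)              # list of tracked files, now empty

echo "status after git rm: $after_rm"
```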

Cloning

You can clone your own or someone else's work from a remote server using git's clone command. When you clone a repository you obtain a full copy of the project and its history.

$ git clone <example>

Since we are interested in face detection, let's check out the faces branch and work on it.

$ git checkout <branch>

Branching

When you make a commit, Git stores a commit object that contains a pointer to the snapshot of the content you staged.

When you create a commit by running git commit, Git checksums each subdirectory and stores each one as a tree object in the Git repository. Git then creates a commit object that has the metadata and a pointer to the root project tree so it can re-create that snapshot when needed.
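You can inspect these objects yourself with git cat-file. The sketch below assumes a throwaway repository with one file at the top level and one in a subdirectory; the file names and identity values are placeholders.

```shell
# Throwaway repository; identity values are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

mkdir src
echo "print('hi')" > src/main.py
echo "# Demo" > README.md
git add .
git commit -q -m "Initial commit"

git cat-file -p HEAD            # commit object: tree, author, committer, message
git cat-file -p 'HEAD^{tree}'   # root tree: a blob for README.md and a tree for src

commit_type=$(git cat-file -t HEAD)           # object type of the commit
tree_type=$(git cat-file -t 'HEAD^{tree}')    # object type of the root tree
subdir_type=$(git cat-file -t 'HEAD:src')     # object type of the subdirectory
```

The commit points to one root tree, and each subdirectory is stored as its own tree object inside it.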

$ git branch testing
$ git checkout testing
$ echo "Hi" > README.md
$ git add README.md
$ git commit -m "Added README.md to testing branch."
$ git checkout master
$ echo "Hello" > README.md
$ git add README.md
$ git commit -m "Added README.md to master branch."

Git log

$ git log
$ git log --graph

One of the more helpful options is -p or --patch, which shows the difference (the patch output) introduced in each commit. You can also limit the number of log entries displayed, such as using -2 to show only the last two entries.

$ git log -p -2
$ git log --pretty=oneline
$ git log --pretty=format:"%h - %an, %ar : %s"

Since Date

$ git log --since=2.weeks

Pickaxe

Another really helpful filter is the -S option (colloquially referred to as Git's "pickaxe" option), which takes a string and shows only those commits that changed the number of occurrences of that string. For instance, if you wanted to find the last commit that added or removed a reference to a specific function, you could call:

$ git log -S function_name

The last really useful option to pass to git log as a filter is a path. If you specify a directory or file name, you can limit the log output to commits that introduced a change to those files. This is always the last option and is generally preceded by double dashes (--) to separate the paths from the options.
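A minimal sketch of the path filter and the pickaxe together, assuming a throwaway repository with three small commits. The file names, the detect_faces string, and the identity values are made up for the example.

```shell
# Throwaway repository; identity values are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "print('hello')" > app.py
git add app.py
git commit -q -m "Add app.py"

echo "notes" > README.md
git add README.md
git commit -q -m "Add README.md"

echo "detect_faces()" >> app.py
git add app.py
git commit -q -m "Call detect_faces"

# Path filter: only commits that changed app.py (note the -- before the path).
git log --oneline -- app.py
path_count=$(git rev-list --count HEAD -- app.py)

# Pickaxe: only commits that changed the number of occurrences of the string.
pickaxe=$(git log -S detect_faces --format=%s)

echo "commits touching app.py: $path_count"
echo "commit that introduced detect_faces: $pickaxe"
```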

Git Log Options

Option             Description
-<n>               Show only the last n commits.
--since, --after   Limit the commits to those made after the specified date.
--until, --before  Limit the commits to those made before the specified date.
--author           Only show commits in which the author entry matches the specified string.
--committer        Only show commits in which the committer entry matches the specified string.
--grep             Only show commits with a commit message containing the string.
-S                 Only show commits adding or removing code matching the string.

$ git log --grep faces

Additional Notes:

If you want to preview a merge, because you suspect something bad will happen, you can create a temporary branch based off the target branch and then merge the new work into that temporary branch:

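# Let the target branch be master

$ git checkout master
$ git checkout -b new-temporary-branch
$ git merge some-new-branch
# If the merge stopped on conflicts, run git merge --abort before switching back
# Git cannot delete the branch you are on, so return to master first
$ git checkout master
$ git branch -D new-temporary-branch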

From there you can see if there are any conflicts and if there are none you can merge into the target branch.
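The same preview can be scripted. The sketch below assumes a throwaway repository in which some-new-branch adds one file on top of master; the file names and identity values are placeholders, and the branch names follow the example above.

```shell
# Throwaway repository; identity values are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the branch is called master
git config user.name "Example User"
git config user.email "user@example.com"

echo "base" > file.txt
git add file.txt
git commit -q -m "Base"

git checkout -q -b some-new-branch
echo "feature" > feature.txt
git add feature.txt
git commit -q -m "Add feature"

git checkout -q master
before=$(git rev-parse master)

# Try the merge on a temporary branch so master is never touched.
git checkout -q -b new-temporary-branch
if git merge -q --no-edit some-new-branch; then
    result="clean"
else
    result="conflict"
    git merge --abort                     # leave the half-finished merge
fi

git checkout -q master
git branch -q -D new-temporary-branch
after=$(git rev-parse master)

echo "trial merge: $result"
```

Because the trial happens on new-temporary-branch, master points to the same commit before and after.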

When you need to start over

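The git reset --hard command discards all staged and unstaged changes to tracked files, returning them to the state of the last commit, so you can start again. Untracked files are left alone. Use it with care: the discarded changes are not kept in any commit.

$ git reset --hard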
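A small sketch of the effect, assuming a throwaway repository with one committed file, a mix of staged and unstaged edits, and one untracked file. All file names and identity values are placeholders.

```shell
# Throwaway repository; identity values are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "committed" > tracked.txt
git add tracked.txt
git commit -q -m "Add tracked.txt"

echo "staged edit" > tracked.txt
git add tracked.txt                   # a staged change
echo "unstaged edit" >> tracked.txt   # plus an unstaged change
echo "temp" > untracked.txt           # and a file Git does not track

git reset -q --hard                   # back to the last commit

restored=$(cat tracked.txt)           # content from the last commit
remaining=$(git status -s)            # only the untracked file is left

echo "tracked.txt now contains: $restored"
echo "status: $remaining"
```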

Git Patch Mode

With git add -p (patch mode) you can choose which changed parts (hunks) of a file to stage, instead of staging the whole file:

$ nvim main.py     # add two hunks, not contiguous
$ git add --patch  # git asks us which hunks to stage
$ git diff         # we can see what still remains unstaged

Rename Git Branches

You can rename a branch with:

$ git branch -m old-name new-name

Detached Head

When you check out a commit directly (by its hash, for example) rather than by branch name, Git reports that HEAD is in a detached state.

$ git checkout <some commit>
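The sketch below detaches HEAD and then attaches it again by creating a branch at the current commit. It assumes a throwaway repository with two commits; the branch name rescue, the file name, and the identity values are hypothetical.

```shell
# Throwaway repository; identity values are placeholders.
repo=$(mktemp -d)
cd "$repo"
git init -q
git config user.name "Example User"
git config user.email "user@example.com"

echo "one" > notes.txt
git add notes.txt
git commit -q -m "First commit"
echo "two" >> notes.txt
git add notes.txt
git commit -q -m "Second commit"

older=$(git rev-parse 'HEAD~1')
git checkout -q "$older"              # check out a commit by hash

# `git symbolic-ref -q HEAD` fails when HEAD does not refer to a branch.
if git symbolic-ref -q HEAD > /dev/null; then
    state="on a branch"
else
    state="detached"
fi

git checkout -q -b rescue             # hypothetical name: new branch at this commit
branch=$(git symbolic-ref --short HEAD)

echo "after checking out a hash: $state"
echo "after creating a branch:   $branch"
```

Creating a branch before committing in a detached state keeps the new commits reachable by name.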

Essential Reading

GitLab Basics

GitLab Flow

GitLab Merge Requests

GitLab Flow Video

More Advanced

Merging and Rebasing

README.md

Markdown Reference