A git tutorial (kinda) for absolute beginners
- What is Git
- Some important nomenclature
- How to get started
- Creating a repository
- Seeing what Git sees
- Adding files to the repository
- Committing the index to the repository
- Viewing the commit history
- Viewing changes in the workspace
- Adding changes to the repository
- Creating a new branch
- Switching branches
- Listing available branches
- Undoing commits
- Merging branches
- Dealing with merge conflicts
- Deleting branches
- Adding remote repositories
- Creating a bare remote repository
- Pushing your changes to the remote repository
- Updating remote repository information
- Cloning a remote repository
- Pulling remote branches
- Using GitHub or GitLab as a remote repository
- Some tips
From https://git-scm.com:
Git is a free and open source distributed version control system designed to handle everything from small to very large projects with speed and efficiency.
A version control system is a way of keeping track of the changes you do to your code (or other files). It allows you to easily undo changes, mark and compare different versions/snapshots of the code, and to collaborate with other people on the same code base. Changes to the code (and the history of all changes) are recorded in so called "code repositories". They can be imagined as very detailed log books. Every change that to the code that is committed to the repository is recorded in the log book.
Git is not GitHub. Git is not GitLab. GitHub and GitLab are websites that offer to host Git repositories as a service. Using them has certain advantages, but it is not at all required.
- Code:
- In this document "code" refers to all files you want to keep track of. This will mostly be text files (like actual source code), but can also include binary files (like images).
- Repository:
- The place where the history of code changes is stored. Can be imagined as a detailed logbook.
- Commit:
- A set of changes to the repository. Can be imagined as an entry in the logbook. Every commit (except for the very first) builds up on previous commits. It stores only the differences between the previous state/version of the code and the updated one. Since all past changes are kept track of, it is possible to recover the state of the code at any point in the recorded history, i.e. at any commit.
- Workspace:
- The folder(s) where your code lives "outside" the repository, i.e. the files you see and work on in your directory structure.
- Index:
- A temporary staging area where you mark changes in your workspace to be committed to the repository.
- Branch:
- A branch is a reference to a particular state of the code, i.e. a particular commit. A Repository can have many branches and you can switch between them to work on/test different versions of your code. Think of them as bookmarks in the logbook.
- HEAD:
- The HEAD is an alias for the branch you are currently working on (more or less). If you create a new commit, it will be based on the current HEAD.
Start a shell on your (Linux) machine and run git
:
usage: git [--version] [--help] [-C <path>] [-c <name>=<value>] [--exec-path[=<path>]] [--html-path] [--man-path] [--info-path] [-p | --paginate | -P | --no-pager] [--no-replace-objects] [--bare] [--git-dir=<path>] [--work-tree=<path>] [--namespace=<name>] <command> [<args>] These are common Git commands used in various situations: start a working area (see also: git help tutorial) clone Clone a repository into a new directory init Create an empty Git repository or reinitialize an existing one work on the current change (see also: git help everyday) add Add file contents to the index mv Move or rename a file, a directory, or a symlink reset Reset current HEAD to the specified state rm Remove files from the working tree and from the index examine the history and state (see also: git help revisions) bisect Use binary search to find the commit that introduced a bug grep Print lines matching a pattern log Show commit logs show Show various types of objects status Show the working tree status grow, mark and tweak your common history branch List, create, or delete branches checkout Switch branches or restore working tree files commit Record changes to the repository diff Show changes between commits, commit and working tree, etc merge Join two or more development histories together rebase Reapply commits on top of another base tip tag Create, list, delete or verify a tag object signed with GPG collaborate (see also: git help workflows) fetch Download objects and refs from another repository pull Fetch from and integrate with another repository or a local branch push Update remote refs along with associated objects 'git help -a' and 'git help -g' list available subcommands and some concept guides. See 'git help <command>' or 'git help <concept>' to read about a specific subcommand or concept.
The exact output of this (and any other git command) might be different from the examples in this document, depending on the exact version of Git you are using.
If you get a "command not found" error, you need to install Git first. It is part of the package manager of about every single Linux distribution.
Git is a very powerful tool, which unfortunately also means that it has many
intimidating looking options. The basic usage is fairly simple though, so don't
let it scare you. The git help
command, the documentation, and your favourite
search engine are your friends, if you should ever not know what to do. Even
"experts" regularly look up how to do certain things, so don't feel bad about
doing so yourself.
If you have never used Git on this machine before, you will have to tell it who you are. All commits also record the author of the changes. You can set this information by executing:
$ git config --global user.name "John Doe" $ git config --global user.email [email protected]
Just replace the name and address with something appropriate. What exactly you put here usually does not matter. The name and e-mail address are simply written into the commit without doing anything else with them. If you use a service like GitHub though, it might compare the e-mail address with its record of users, so it can link to the appropriate user for all commits. This is purely for convenience though, and everything usually also works authors that are completely unknown to GitHub.
If you have some code you want to start keeping track of, you need to first create a repository. Go to the base directory of the code and run:
$ git init Initialized empty Git repository in /home/koch/test/.git/
This will set up a repository in the hidden folder .git
in the same
directory. All commits and supplementary information will be stored in that
folder. If you lose it, you also lose the repository. To use Git as a way of
creating backups of your code, you will need a separate remote repository (more
on that later).
Internally Git is using relative paths to refer to files within the workspace. So from Git's point of view it is safe to move/rename the base directory of the code, should you ever need/want to.
To get a short summary of the current state of your working directory, you can
use the status
command:
$ git status On branch master No commits yet Untracked files: (use "git add <file>..." to include in what will be committed) some_file.txt nothing added to commit but untracked files present (use "git add" to track)
Among other things it will tell you what branch you are working on right now. In this case, that is the default branch "master".
On its own, Git will not magically start tracking the files in your workspace.
By default files in the working directory will be "untracked". You tell Git to
add them to the repository using the add
command. Afterwards you can check
whether it did what you expected with status
:
$ git add some_file.txt $ git status On branch master No commits yet Changes to be committed: (use "git rm --cached <file>..." to unstage) new file: some_file.txt
This did not actually add the file to the repository yet, but it added it to the index, i.e. the staging area. By separating the "adding" from the "committing" step, Git allows you to sequentially add multiple files and then commit them all in one single commit.
When you are happy with the changes you have added to the index, you can commit them
to repository using the commit
command:
$ git commit [master (root-commit) f45e476] Add some file. 1 file changed, 1 insertion(+) create mode 100644 some_file.txt
This will open your default text editor to write a commit message. A commit message should begin of a single line with a short description what the commit does. Conventionally this line should be written in the imperative case, e.g.
Add some file.
and not
Adds some file.
The short summary should be followed by a more detailed description of what changes happened in the commit. There should be a blank line separating the short summary from the rest, e.g.:
Add some file. This file is very important. It makes everything go. Without it, nothing works.
Make sure to keep the short summary as concise as possible. It will be shown in several places when looking at the commit history. Long lines can reduce the readability of such overviews. Be as verbose as you want in the following longer paragraphs.
If you have a simple commit that does not require a detailed explanation, you
can use the -m
option so specify a short commit message directly in the
command line:
$ git commit -m 'Add some file.'
You can change your default editor by setting the VISUAL
and EDITOR
environment variables (probably in your .bashrc
). If you want to change
only the editor git uses but leave the system default alone, you can configure
it like this:
git config --global core.editor "vim"
You can use the log
command to view the history of commits in your
repository:
$ git log commit f45e476d0f9d35b571d51ea455f030ac00ca252a (HEAD -> master) Author: Lukas Koch <[email protected]> Date: Mon Mar 2 16:10:28 2020 +0000 Add some file. A longer description goes here.
By default this will show you a list of commits with the full commit message as
well as additional information like the author and the time of each commit.
This is not ideal if one just wants a quick overview of what the commit history
looks like. For this you can modify the output format of the log
command
using a few of its many options:
$ git log --oneline --graph --date-order --decorate * f45e476 (HEAD -> master) Add some file.
Because it is a bit of a pain to type such a long command, it might be useful to define a bash alias for it:
$ alias gitl='git log --oneline --graph --date-order --decorate'
After running this (or adding it to your bashrc
) you will be able to use the
gitl
shortcut:
$ gitl * f45e476 (HEAD -> master) Add some file.
To show what changes a commit actually contains, use the show
command and
the commit's hash value:
$ git show f45e476
Another way of looking at the commit history is to use graphical interfaces
like gitk
. These have to be installed separately from the core Git program
though.
Once a file is tracked by the repository, the status
command will tell you
when it has been changed:
$ git status On branch master Changes not staged for commit: (use "git add <file>..." to update what will be committed) (use "git checkout -- <file>..." to discard changes in working directory) modified: some_file.txt no changes added to commit (use "git add" and/or "git commit -a")
You can use the diff
command to see the actual changes line-by-line:
$ git diff diff --git a/some_file.txt b/some_file.txt index 7b57bd2..7b7ddac 100644 --- a/some_file.txt +++ b/some_file.txt @@ -1 +1,3 @@ -some text +Some text. + +Some more text.
Lines beginning with a -
are present in the HEAD, but not in the workspace.
Lines beginning with a +
are present in the workspace, but not in the HEAD.
These changes are not automatically added to the repository. If you want to record some changes, you need to explicitly add them to the index and then commit the index to the repository:
$ git add some_file.txt $ git status On branch master Changes to be committed: (use "git reset HEAD <file>..." to unstage) modified: some_file.txt $ git commit -m 'Change some file.' [master df598a5] Change some file. 1 file changed, 3 insertions(+), 1 deletion(-)
Sometimes one does a couple of changes to a file, but then only want to commit
part of them. For example, if you find (and fix) a bug or typo while working on
a bigger change, you might want to commit just the bug fix and then keep
working on the bigger change before committing that one separately. This can be
achieved with the -p
option of add
:
$ git add -p diff --git a/some_file.txt b/some_file.txt index 7b7ddac..72c34d9 100644 --- a/some_file.txt +++ b/some_file.txt @@ -1,3 +1,5 @@ -Some text. +Some text! Some more text. + +Even more text. Stage this hunk [y,n,q,a,d,s,e,?]?
It will go through all changes it sees compared to the HEAD and ask you whether
you want to add it to the index or not. You also have the option to split the
shown changes (called "hunk") into even smaller pieces (s
), or to edit it
freely (e
). If you forget what all the different letters actually mean, you
can get an explanation by choosing ?
.
After you added a part the changes in a file to the index, it will appear twice
in the status
command, both as modified and ready to commit and as having
changes that are not staged yet:
$ git status On branch master Changes to be committed: (use "git reset HEAD <file>..." to unstage) modified: some_file.txt Changes not staged for commit: (use "git add <file>..." to update what will be committed) (use "git checkout -- <file>..." to discard changes in working directory) modified: some_file.txt
If you want to see the differences between the index and the HEAD before
committing, them you can use the --cached
option of the diff
command:
$ git diff --cached diff --git a/some_file.txt b/some_file.txt index 7b7ddac..3288880 100644 --- a/some_file.txt +++ b/some_file.txt @@ -1,3 +1,3 @@ -Some text. +Some text! Some more text.
You can view the changes that have not been added to the index yet with the
diff
command without additional options:
$ git diff diff --git a/some_file.txt b/some_file.txt index 3288880..72c34d9 100644 --- a/some_file.txt +++ b/some_file.txt @@ -1,3 +1,5 @@ Some text! Some more text. + +Even more text.
When you are happy with the changes added to the index, you commit them as
usual with commit
:
$ git commit -m 'Exclaim!' [master c2a70dd] Exclaim! 1 file changed, 1 insertion(+), 1 deletion(-)
In general you should strife to "commit early, commit often". Commits are cheap, so whenever you have a tiny change of code that you know you probably want to keep (for now), commit it. Even if it is experimental code! Accruing lots of changes in the workspace that you never commit makes it quite hard to find the changes that you do want to keep later on.
If you are working on a experimental feature and do not quite know whether you will want to keep the changes you are doing to the code, it is a good idea to create a separate branch for it. This will allow you to work on it and do lots of commits without having to worry about how to undo these changes later if it does not turn out as hoped.
You can create a new branch and immediately switch to it using the checkout
command:
$ git checkout -b 'experimental' Switched to a new branch 'experimental'
If you want to know which branch you are working on, use status
:
$ git status On branch experimental nothing to commit, working tree clean
You can now use all previous commands as before, but without affecting the state of the "master" branch.
You can switch between branches using the checkout
command:
$ git checkout master Switched to branch 'master'
If you try to checkout a different branch while you have uncommitted changes in one of your tracked files, Git fill refuse to do so, because it would mean losing those changes:
$ git checkout experimental error: Your local changes to the following files would be overwritten by checkout: some_file.txt Please commit your changes or stash them before you switch branches. Aborting
You can temporarily stash
these changes, which will create a temporary commit that stores
them safely while you work on the other branch:
$ git stash Saved working directory and index state WIP on master: c2a70dd Exclaim! $ git checkout experimental Switched to branch 'experimental'
When you are done, just switch back and reapply the stashed changes:
$ git checkout master Switched to branch 'master' $ git stash pop On branch master Changes not staged for commit: (use "git add <file>..." to update what will be committed) (use "git checkout -- <file>..." to discard changes in working directory) modified: some_file.txt no changes added to commit (use "git add" and/or "git commit -a") Dropped refs/stash@{0} (1aeefb23cc4ebe19ec7e3a6e1b8b40ccc47cf7d9)
You can list all branches in your repository with the branch
command:
$ git branch experimental * master
If you also want to list branches on remote repositories (see below), just add
the -r
or -a
option.
To get an overview of how the branches differ, i.e. what kind of commits are in
which branch, you can use the one-line log
command as defined above with the
--all
option:
$ gitl --all * d453b3e (experimental) Add something experimental. * 16c90da (HEAD -> master) Add even more text. * c2a70dd Exclaim! * 5c0364d Change some file. * f45e476 Add some file.
Here you can see that we are currently on the branch "master", (HEAD ->
master)
, and that the branch "experimental" is ahead of "master" by a commit
"Add something special".
If it turns out that a change you committed was not a good idea after all, you
can undo it with the revert
command together with a hash of the commit you
want to undo:
$ git revert c2a70dd [master 000add3] Revert "Exclaim!" 1 file changed, 1 insertion(+), 1 deletion(-)
This will create a new commit that undoes the changes of the specified one.
When you are done with the development on a branch and decide you want to add
these changes to the "master" branch, you can do so by merging them with
merge
:
$ git merge experimental Merge made by the 'recursive' strategy. experimental_textfile.txt | 1 + 1 file changed, 1 insertion(+) create mode 100644 experimental_textfile.txt
After this (and possibly dealing with merge conflicts), the changes introduced in the branch "experimental" will also be present in the current HEAD:
$ gitl * 60ff7f6 (HEAD -> master) Merge branch 'experimental' |\ * | 000add3 Revert "Exclaim!" | * d453b3e (experimental) Add something experimental. |/ * 16c90da Add even more text. * c2a70dd Exclaim! * 5c0364d Change some file. * f45e476 Add some file.
If a file has been changed in both the current branch and the branch that is going to be merged into it, this can lead to a merge conflict. Instead of guessing which version of the file to keep, Git will stop the merging process and ask you to deal with it:
$ git merge experimental Auto-merging some_file.txt CONFLICT (content): Merge conflict in some_file.txt Automatic merge failed; fix conflicts and then commit the result.
Git status
will also tell you which files cause problems:
$ git status On branch master Your branch is ahead of 'my_remote/master' by 1 commit. (use "git push" to publish your local commits) You have unmerged paths. (fix conflicts and run "git commit") (use "git merge --abort" to abort the merge) Unmerged paths: (use "git add <file>..." to mark resolution) both modified: some_file.txt
Inside the file, the conflicting sections are marked like this:
<<<<<<< HEAD Some text. Some more text, right? Even more text. ======= test >>>>>>> experimental
You have to edit the files to be as you would like it to look after the merge
and then add
and commit
them.
Since the changes in "experimental" have been merged into the "master" branch, the branch "experimental" can be safely deleted:
$ git branch --delete experimental Deleted branch experimental (was d453b3e).
This does not delete the commits that were part of "experimental", as those are also a part of "master" now. This only deletes the "bookmark" labeled "experimental":
$ gitl * 60ff7f6 (HEAD -> master) Merge branch 'experimental' |\ * | 000add3 Revert "Exclaim!" | * d453b3e Add something experimental. |/ * 16c90da Add even more text. * c2a70dd Exclaim! * 5c0364d Change some file. * f45e476 Add some file.
The local repository helps you to organise your code development, but it does not protect you against data loss because of disk failures or similar. For this you need to duplicate your repository somewhere else. Ideally on a different disk, in a different building, on a different continent.
There are multiple ways to do this, but they all involve adding a remote repository. This just tells git that the remote repository exists and enables you to refer to it by name in the future:
$ git remote add my_remote /path/to/remote/repository
You can list the remote repositories currently known to git like his:
$ git remote -v my_remote ../test_remote/ (fetch) my_remote ../test_remote/ (push)
Git also supports multiple protocols to access remote repositories -- well -- remotely. For example, if you have a repository on a remote machine with ssh access you can add it like this:
$ git remote add my_remote username@hostname:path/on/host
To add a remote repository, that repository must exist in the first place. To
avoid certain possible conflicts, it is a good idea (but not necessary) to use
"bare" repositories for remote backup. A "bare" repository does not have a
workspace. It consists only of the "logbook" that is usually hidden in the
.git
folder:
$ git init --bare Initialized empty Git repository in /home/koch/test_remote/ $ ls branches config description HEAD hooks info objects refs
So far you have only told git that the remote repository exists, but not what
to do with it. To tell git that it should use the remote repository to store
copies of all your local branches, use the push
command with the
--set-upstream
option:
$ git push my_remote --set-upstream --all Enumerating objects: 20, done. Counting objects: 100% (20/20), done. Delta compression using up to 8 threads Compressing objects: 100% (9/9), done. Writing objects: 100% (20/20), 1.70 KiB | 436.00 KiB/s, done. Total 20 (delta 2), reused 0 (delta 0) To ../test_remote/ * [new branch] master -> master Branch 'master' set up to track remote branch 'master' from 'my_remote'.
This will create a branch on the remote repository for every branch in you
local repository, and then push (i.e. copy or upload) your local branches to
them. The --set-upstream
option also configures your local branches to
remember what their remote counterparts are. So from this point onwards you
can simply run:
$ git push Everything up-to-date
to upload the current branch to its remote counterpart. Since we just did that, in this case it tells us that there is nothing to do.
The locally known state of the remote repository branches can also be viewed
with the log
command:
$ gitl --all * 60ff7f6 (HEAD -> master, my_remote/master) Merge branch 'experimental' |\ * | 000add3 Revert "Exclaim!" | * d453b3e Add something experimental. |/ * 16c90da Add even more text. * c2a70dd Exclaim! * 5c0364d Change some file. * f45e476 Add some file.
Git does not access the remote repository when showing you this information. It is simply what your local git repository last learned about the remote repository, e.g. when it last pushed a branch to it.
To update what the local repository knows about the remote repository, you can use
the fetch
command:
$ git fetch my_remote
To update all remotes at once, you can use the remote update
command:
$ git remote update --prune Fetching origin
When you specify the --prune
option it will also delete references to
remote branches that no longer exist.
When you want to get a local copy of a remote repository that already exists,
the easiest way to set it up is to clone
the remote repository:
$ git clone /path/to/remote/ test2 Cloning into 'test2'... done.
This will create a local repository and automatically set up a remote repository "origin" which points to the repository that was just cloned:
$ cd test2/ $ ls experimental_textfile.txt some_file.txt $ gitl * 60ff7f6 (HEAD -> master, origin/master, origin/HEAD) Merge branch 'experimental' |\ * | 000add3 Revert "Exclaim!" | * d453b3e Add something experimental. |/ * 16c90da Add even more text. * c2a70dd Exclaim! * 5c0364d Change some file. * f45e476 Add some file. $ git remote -v origin /home/koch/test_remote/ (fetch) origin /home/koch/test_remote/ (push)
If you do not specify a target path for the repository, it will create a new folder named after the repository you are trying to clone.
Sometimes you know that you want to merge whatever changes have been made on
the remote repository into your local branch (e.g. if you just want to update
your local code with the newest changes someone else has made). In this case
you can use the pull
command:
$ git pull Already up to date.
It is basically just a shortcut to do a fetch
and then merge
.
Repositories hosted on GitHub or GitLab work just like any other remote repository. If you just want to clone/fetch/pull from someone else's repository on there, you do not even need an account on these sites. All repositories have a https URL, which just so happens to be one of the remote protocols that git understands. You can find these URLs by clicking on the "clone" button on the repositories main page. Then you can clone it like usual:
$ git clone https://github.com/ast0815/git-tutorial.git
You will not be able to push any of your own changes to these repositories though. To be able to push your own changes to a repository on these websites, you need an account and create your own repositories there.
Commit early, commit often
Push regularly
Use aliases to save some time typing commands:
alias gita='git add' alias gitc='git commit' alias gitd='git diff' alias gitk='gitk --date-order' alias gitl='git log --oneline --graph --date-order --decorate' alias gitp='git push' alias gitr='git remote update -p' alias gits='git status'
Never give up, never surrender