The philosophy behind git is independent distributed development of a joint project. It is well suited for small distributed projects with only a few files such as latex source. This tutorial describes the essential steps in contributing to such a project. Beyond this, the web offers many git tutorials that cover almost every conceivable case. A recommended resource is http://git-scm.com.
The local repository is stored in the subdirectory .git of the top working directory.
The general strategy to introduce a change to the project is to fetch new commits from the remote repository to the local repository (fetch) and then apply the newly fetched commits to the working directory (merge or rebase). Now the work can be done in the working directory. The work is then committed to the local repository (commit) and pushed to the remote repository (push).
When several collaborators work on a common state, the history will inevitably diverge creating a branched tree of commits. Git can handle such branched trees in the repositories. However, when pushing the local to the remote repository, their structures must be compatible: a branch can only be extended when the head of the remote branch lies in the history of the head of the local branch. Otherwise a push will fail calling for updates on the local repository: branches can be combined (merge), reorganized (rebase), or removed. Such an update shall make the local structure compatible to the remote and enable the push.
The storage of updates and in particular the automatic and manual conflict resolution is based on diff. Diff is a line-based tool, and therefore the structure of changes is most apparent if
This is the most natural outcome for a classical text editor without automatic line breaking. A latex editor / environment should be configured not to restructure text blocks to fit a given width. Ideally, the lines are broken manually at a full stop, punctuation (equality, relation) or somewhere in the middle of long sentences (equations).
Single commits should be atomic rather than collective:
It makes sense to commit frequently (after each atomic step) to the local repository. After some piece of work is completed, the commits can be pushed collectively to the remote repository.
Compared to simple file-synchronization tools and services (e.g. dropbox), git has multiple advantages. The most important ones are:
git is a set of command-line tools. There are many useful GUI extensions to simplify the workflow. On Debian-based systems several of them can be installed by
apt-get install git git-gui meld gitk gitg giggle
and the extension rabbitvcs for the gnome file manager by
apt-get install rabbitvcs-nautilus rabbitvcs-gedit
You should set up your contact details to identify yourself as the author of your commits:
git config [--global] user.name "First Last"
git config [--global] user.email "em@i.l"
You can do this globally or on a project basis.
We start with a simple example demonstrating a typical workflow step by step. This example can be performed at the command line in a Linux environment. Most of the below operations are also available in the graphical user interfaces.
Say you want to work on a project that already has a remote repository user@server.domain:repository. First, create a local copy of the repository by
git clone user@server.domain:repository
This will create a subdirectory named repository in the current working directory that contains all the files belonging to the project.
When you want to start working on your files, first do a
git pull
which will download all new commits from the remote to your local repository, and update your local files (evidently, this operation is unnecessary right after the initial download, but usually it is the first step when you start working). Say the project contains a latex file paper.tex that you want to edit. Enter the directory and do your editing. Now a
git status
will show some information about the repository, in particular it will say
Changes not staged for commit:
  ...
        modified:   paper.tex
This means that the working copy of the file has changed, but the changes have not yet been saved to the repository. After you finish your editing, do
git add paper.tex
The command marks the file paper.tex as being ready for submission to your
local repository. Another git status
now shows
Changes to be committed:
  ...
        modified:   paper.tex
You can finally save your changes with
git commit -m "Added proof of theorem X"
including a useful summary of what you did. The current status of your
project has now been saved in a new commit.
You could also skip the explicit git add step by using
git commit -a -m "..."
which will automatically stage and commit all modified files that git already tracks. Newly created files still need an explicit git add.
You can see the recent
history, including your new commit, with
git log --oneline --graph -5
This shows the five most recent commits.
After you are done editing and have saved your work in (possibly multiple) commits, you can make your changes available to your collaborators by
git push
This will upload your new commits to the remote repository, and will update the remote files accordingly.
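The walkthrough above can be tried without a server. The following is a minimal sketch, assuming a throwaway bare repository on the local disk stands in for user@server.domain:repository. The sandbox paths, the file contents and the first commit message are made up for illustration, and the identity is set through environment variables only so that the script does not depend on any existing configuration.

```shell
#!/bin/sh
set -e
# Sandbox identity (assumption: avoids relying on or changing git config).
export GIT_AUTHOR_NAME="First Last" GIT_AUTHOR_EMAIL="em@i.l"
export GIT_COMMITTER_NAME="First Last" GIT_COMMITTER_EMAIL="em@i.l"

sandbox=$(mktemp -d)

# A bare repository on disk plays the role of the remote repository.
git init -q --bare "$sandbox/remote.git"
git clone -q "$sandbox/remote.git" "$sandbox/repository"
cd "$sandbox/repository"

# First edit: create the file, stage it, commit it.
echo 'Theorem X holds.' > paper.tex
git add paper.tex
git commit -q -m "Added statement of theorem X"

# Second edit: watch the file move from "modified" to "staged".
echo 'Proof of theorem X.' >> paper.tex
git status --short      # " M paper.tex": modified, not staged
git add paper.tex
git status --short      # "M  paper.tex": staged for the next commit
git commit -q -m "Added proof of theorem X"

git log --oneline --graph -5

# Publish both commits; HEAD avoids hard-coding the branch name.
git push -q -u origin HEAD
```

The warning about cloning an empty repository is expected here, because the sandbox remote starts without any commits.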
When you want to continue working on your files later on, again do a
git pull
to update your local files. They will now include
all changes that have been made by your collaborators in the meantime.
Repeat the edit-add-commit cycle, and push when you are done.
You can also use git pull
as often as you like to inspect the changes
of the others even if you are not planning to edit the project yourself.
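If someone else uploads changes to the remote repository while you edit, your git push will fail with an error. Just do a git pull to automatically combine all changes made by others with all changes made by you. Recent git versions may first ask you to choose how to combine diverged histories, for example with git pull --no-rebase (merge) or git pull --rebase. A subsequent git push should now succeed.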
If an automatic merge is not possible
because two authors have simultaneously changed the
same part of the same file,
git will run into a conflict. Don't panic.
Conflicts are usually easy to resolve, see
resolving conflicts
below.
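The rejected push and its repair can be reproduced in a sandbox. This is a sketch under stated assumptions: two local clones named alice and bob (hypothetical names) share a bare repository on disk, the two authors touch different files so that the automatic merge succeeds, and the merge is requested explicitly with --no-rebase because recent git versions refuse to guess.

```shell
#!/bin/sh
set -e
# Sandbox identity (assumption: avoids relying on or changing git config).
export GIT_AUTHOR_NAME="First Last" GIT_AUTHOR_EMAIL="em@i.l"
export GIT_COMMITTER_NAME="First Last" GIT_COMMITTER_EMAIL="em@i.l"

sandbox=$(mktemp -d)
git init -q --bare "$sandbox/remote.git"

# Alice creates the project and publishes the first commit.
git clone -q "$sandbox/remote.git" "$sandbox/alice"
cd "$sandbox/alice"
echo 'Theorem X holds.' > paper.tex
git add paper.tex
git commit -q -m "Initial version"
git push -q -u origin HEAD

# Bob clones, adds a file and pushes first.
git clone -q "$sandbox/remote.git" "$sandbox/bob"
cd "$sandbox/bob"
echo 'Remarks on theorem X.' > notes.tex
git add notes.tex
git commit -q -m "Added notes"
git push -q

# Alice has committed in the meantime; her push is rejected.
cd "$sandbox/alice"
echo 'Proof of theorem X.' >> paper.tex
git commit -q -a -m "Added proof of theorem X"
if git push -q 2>/dev/null; then
    echo "unexpected: push succeeded"; exit 1
else
    echo "push rejected, pulling first"
fi

# Combine both histories with a merge commit, then push again.
git pull -q --no-rebase --no-edit
git push -q
git log --oneline --graph
```

The final log shows a branched history that is joined again by the merge commit created during the pull.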
In case you forget to do git pull
before you edit, don't worry. Unless two authors have
simultaneously changed the same part of the same file,
git pull
will do the job of merging the changes. You can run git pull
whenever your local repository is in a clean state, i.e. when there
are no changes that have not been committed.
There are many ways to fill the repository with concurrent updates. The amount of work and the final result is more or less the same for all methods. However, the way the tree of commits builds up depends on the order and type of actions taken. Depending on the situation, different types of trees will simplify the collaboration and the resolution of issues. Several models are discussed in the following. One should agree on one mode and stick to it as far as possible. It is not necessary to understand all the other modes.
This model can be viewed as the default mode of operation.
Here the tree is branched and merged with only a single remote branch (commonly: master).
Everyone applies their new commits onto their local branch head and uses merge if necessary to unite with the remote branch head.
     E--F (master)          merge       E--F--G (master)       push        E--F--G (master,
    /                       ===>       /     /                 ===>       /     /   origin/master)
A--B--C--D (origin/master)         A--B--C--D (origin/master)         A--B--C--D
Normally, one would pull to the latest update and then inspect it in the working directory.
If the working directory contains uncommitted changes (not in clean state), pull may refuse to run when the update touches the modified files.
Then you may use git stash save (before pull) and git stash pop (after pull) to carry the changes to the working directory across the update. Newer git versions prefer git stash push over the deprecated git stash save.
Stash save and pop is similar to a merge of the working directory with the new branch head.
This model is similar to the above one, but it produces a different history of commits which may improve the tracking of changes: Here there is only a single remote branch which has a completely linear history. Everyone applies one's new commits onto the local branch head and uses rebase if necessary to place them on top of the remote branch head.
     E--F (master)          rebase            E'-F' (master)   push              E'-F' (master,
    /                       ===>             /                 ===>             /       origin/master)
A--B--C--D (origin/master)         A--B--C--D (origin/master)         A--B--C--D
To make git pull rebase instead of merge on the master branch, set
git config branch.master.rebase true
Normally, one would fetch and rebase to the latest update and then inspect it in the working directory.
If the working directory contains uncommitted changes (not in clean state), rebase will not work.
Then you may use git stash save (before rebase) and git stash pop (after rebase) to carry the changes to the working directory across the update. Newer git versions prefer git stash push over the deprecated git stash save.
Stash save and pop is similar to a rebase of the working directory onto the new branch head.
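The fetch-and-rebase cycle can be sketched in the same kind of sandbox. The clone names alice and bob and all file contents are hypothetical. The upstream shorthand @{u} is used instead of a branch name so that the script works whether the default branch is called master or something else.

```shell
#!/bin/sh
set -e
# Sandbox identity (assumption: avoids relying on or changing git config).
export GIT_AUTHOR_NAME="First Last" GIT_AUTHOR_EMAIL="em@i.l"
export GIT_COMMITTER_NAME="First Last" GIT_COMMITTER_EMAIL="em@i.l"

sandbox=$(mktemp -d)
git init -q --bare "$sandbox/remote.git"

# Alice publishes the first commit.
git clone -q "$sandbox/remote.git" "$sandbox/alice"
cd "$sandbox/alice"
echo 'Theorem X holds.' > paper.tex
git add paper.tex
git commit -q -m "Initial version"
git push -q -u origin HEAD

# Bob pushes a commit of his own.
git clone -q "$sandbox/remote.git" "$sandbox/bob"
cd "$sandbox/bob"
echo 'Remarks on theorem X.' > notes.tex
git add notes.tex
git commit -q -m "Added notes"
git push -q

# Alice commits locally, then replays her commit on top of Bob's.
cd "$sandbox/alice"
echo 'Proof of theorem X.' >> paper.tex
git commit -q -a -m "Added proof of theorem X"
git fetch -q
git rebase -q '@{u}'        # @{u} = the upstream of the current branch
git push -q

# The history is a straight line without merge commits.
git log --oneline --graph
```

Compared with the merge model, the same three pieces of work end up in a linear history, and the rebased commit has a new identifier.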
Here the tree is branched and merged with a single branch head for each contributor. Everyone applies one's new commits onto one's own branch head. Merge is used to import the changes of the other developers.
There are two useful methods:
In both cases git fetch --all
will show updates to the various remote branches.
Here there is a central main branch of development (master
),
plus (temporary) side branches for every development. One
creates a temporary branch to work on. This branch can also be
pushed without creating conflicts. At a reasonable stage, the
temporary branch can be merged back into the master branch.
     E--F--G            M--N--O--P--Q--R--S--T (master, origin/master)
    /       \          /  /       \
A--B--C---D--H--I--J--K--L         U--V--W--X (devel)
There are two useful methods:
To get started, go to the parent directory of the intended working directory, then:
git clone user@server.domain:repository [-b branch] [target-directory]
where:
ssh user@server.domain lists available repositories.
-b branch selects the branch to check out instead of the default (commonly master); only needed if one wants to work on a particular branch.
Note: different formats for specifying the remote repository are in use depending on the type of remote service; here we mainly refer to the system gitolite which can be set up easily for non-commercial purposes.
To save the changes in the working directory to the local repository:
git commit [-a] [-m "message"] [file(s)] [--amend]
Stage the files beforehand with git add (see below), specify particular file(s) or use -a for all modified files.
Give the commit message with -m or use the editor which is spawned when the message is omitted.
Use --amend to add the changes to the previous commit.
The remote server will not be contacted during the above. These changes cannot be seen by others, yet. Therefore, they could still be edited or undone (with some effort, but without penalty).
The files of the working directory need to be staged before committing:
git add file(s)
Alternatively all modified (and previously staged) files can be
committed using the -a
option for commit.
Similarly, you should let git know about files you want to delete, rename or move in order for it to track them properly.
The syntax of the commands is similar to the linux counterparts with prepended git.
To remove, rename or move files from the working directory and from the subsequent commit, use, respectively
git rm file(s)
git mv file newname
git mv file(s) newdirectory
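A short sketch of the three operations in a scratch repository follows. All file and directory names (draft.tex, old.tex, Fig1.pdf, figures) are hypothetical, and the identity is set through environment variables only to keep the script independent of any existing configuration.

```shell
#!/bin/sh
set -e
# Sandbox identity (assumption: avoids relying on or changing git config).
export GIT_AUTHOR_NAME="First Last" GIT_AUTHOR_EMAIL="em@i.l"
export GIT_COMMITTER_NAME="First Last" GIT_COMMITTER_EMAIL="em@i.l"

demo=$(mktemp -d)
cd "$demo"
git init -q

echo 'draft' > draft.tex
echo 'obsolete' > old.tex
echo 'figure' > Fig1.pdf
git add .
git commit -q -m "Initial files"

mkdir figures
git mv draft.tex paper.tex    # rename a file
git mv Fig1.pdf figures/      # move a file into a directory
git rm -q old.tex             # delete a file from disk and from the index

git status --short            # renames and the deletion are already staged
git commit -q -m "Reorganized files"
git ls-files                  # files tracked after the commit
```

Because git mv and git rm update the index directly, no separate git add is needed before the commit.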
To export the changes in the local repository to the remote repository.
git push [origin branch]
This operation will fail if the remote state is not in the history of local state (rather: branches). In that case, one has to resolve all issues on the local repository by pulling the remote changes into the local repository. As soon as the repositories are in compatible states the push operation will succeed.
The changes will be seen by others and therefore should not be undone anymore.
The optional destination parameter origin branch overrides the standard remote target branch.
Normally, git considers all contents of the working directory as part of the repository. However some files are automatically generated, binary, big, backup and/or log files which are not meant to be tracked in the repository. git will notice extra files in the working directory which have not been incorporated into the repository, and warn about their presence.
One can mark certain files or classes of files to be ignored by git.
A list of these files is stored in the file .gitignore (typically in the main directory), which itself is a file of the repository and must be added explicitly.
In latex projects one might exclude files such as *.pdf, *.log, and certain subdirectories.
At the same time one would like to include figure files such as Fig*.pdf.
A sample .gitignore file might look as follows:
#exclude generated files
*.pdf
*.aux
*.log
#include figure files
!Fig*.pdf
#some excluded directories
/extra
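The effect of these rules can be checked in a scratch repository. The sketch below writes the sample rules to .gitignore, creates a few hypothetical files (paper.tex, paper.pdf, paper.aux, paper.log, Fig1.pdf, extra/scratch.txt), stages everything with git add and lists what git actually picked up.

```shell
#!/bin/sh
set -e
demo=$(mktemp -d)
cd "$demo"
git init -q

# The sample rules from above.
cat > .gitignore <<'EOF'
#exclude generated files
*.pdf
*.aux
*.log
#include figure files
!Fig*.pdf
#some excluded directories
/extra
EOF

# Hypothetical project files: one source, three generated, one figure.
mkdir extra
touch paper.tex paper.pdf paper.aux paper.log Fig1.pdf extra/scratch.txt

# "git add ." silently skips everything matched by an ignore rule.
git add .
git ls-files
```

Only .gitignore, Fig1.pdf and paper.tex are staged. The negated pattern !Fig*.pdf wins over *.pdf because it comes later in the file.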
For small projects with a handful of files, it might be more suitable to exclude all files by default, and include individual files explicitly, as in the following example:
#exclude all files
*
#include ignore file
!/.gitignore
#include source file(s)
#add any relevant file that is not generated
!/paper.tex
#include figure files
!/Fig*.pdf
Alternatively, exclude/include rules can be specified in the file
.git/info/exclude
, which has the same format as the
.gitignore
file. The difference is that
.git/info/exclude
is not shared with other repositories,
it remains local. Hence it is useful for excluding files that only
exist in your personal working directory (for example files that you
use to generate figures). Note that rules in .gitignore
take
precedence over rules in .git/info/exclude
.
The pull operation is equivalent to fetch and merge or fetch and rebase.
git pull [--rebase] [origin branch]
After the initial fetch, pull works in the local repository only. Changes introduced by merge or rebase must (eventually) be pushed to the remote repository.
Without parameters, the standard remote branch is pulled (typically master
).
For a project with many branches (see above)
one will typically pull a specific branch to merge their states.
To download updates from the remote repository to the local repository.
Invoked implicitly by git pull
or explicitly by:
git fetch
By default, all remote branches are fetched.
Merge is the default behavior for pull. After an explicit
git fetch
, merge can be invoked as
git merge otherbranch
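which combines all changes introduced by otherbranch since the common ancestor with those of the current branch, creates a new commit that reflects the combined changes, and updates the working directory accordingly. The new commit will have both previous branch heads as parents. The head of otherbranch remains unchanged, while the head of the current branch gets updated to the new commit. If the current branch has no commits of its own since the common ancestor, git by default simply fast-forwards it to the head of otherbranch without creating a new commit.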
Rebase is similar to merge, but it changes the history of the current branch
such that it will be based on the head of another branch.
The current branch is effectively detached from
the common ancestor of both branches, and gets appended to the head of the other branch.
All the changes within the current branch are rewritten
such that they will refer to the head of the other branch.
This creates a linear history
(with only single parent commits) which may be easier to trace back.
After an explicit git fetch
, it can be called as
git rebase otherbranch [mybranch]
This command rebases mybranch on top of otherbranch. When mybranch is omitted, it rebases the current branch on top of otherbranch.
If one prefers rebase over merge, one can make rebase the default behaviour when pulling a given branch (written branch below) by
git config branch.branch.rebase true
Note that you should never rebase a commit that has already been pushed to a remote repository. Such a commit may already be in use by other contributors. By modifying it, you are bound to create a mess. Hence a rebase should only ever be applied locally.
A pull operation will have one of the following results which leave the local repository and working directory in some state:
When a pull operation has detected changes that require manual resolution, the working directory will be in a conflict state. The conflicts should be resolved and afterwards the changes must be committed (merge) or the rebase continued (rebase).
The conflicting files will contain sections marked by:
<<<<<<< first version tag
text in first version
||||||| common ancestor version tag
text in common ancestor version
=======
text in second version
>>>>>>> second version tag
These sections should be edited to represent the desired final state. The markup must be removed by hand.
The common ancestor may be useful in resolving the conflict. It is not displayed by default; it must be enabled by
git config [--global] merge.conflictstyle diff3
Git can also invoke a graphical diff tool to resolve the conflict more intuitively:
git mergetool [--tool=meld]
Once all conflicts have been resolved, the operation must be completed.
For a conflict during merge, the working directory reflects the intended merge commit. Hence, one should commit the working directory
git add file(s)
git commit ...
or, instead of adding individual files, one may use
git commit -a
For a conflict during rebase, the working directory reflects the commit to be modified to fit the new history. The rebase operation should be continued with
git add file(s)
git rebase --continue
Further commits to be rebased may follow.
Note that all changes due to a merge or rebase are local and have to be pushed (eventually).
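A merge conflict and its manual resolution can be rehearsed locally. In this sketch, a second local branch named coauthor (a hypothetical name) plays the role of the remote changes, and both sides rewrite the same line of paper.tex. The resolution simply overwrites the file with the desired final text, which also removes the conflict markers.

```shell
#!/bin/sh
set -e
# Sandbox identity (assumption: avoids relying on or changing git config).
export GIT_AUTHOR_NAME="First Last" GIT_AUTHOR_EMAIL="em@i.l"
export GIT_COMMITTER_NAME="First Last" GIT_COMMITTER_EMAIL="em@i.l"

demo=$(mktemp -d)
cd "$demo"
git init -q
echo 'Theorem X holds.' > paper.tex
git add paper.tex
git commit -q -m "Initial version"
main=$(git symbolic-ref --short HEAD)   # name of the default branch

# Two incompatible edits of the same line on two branches.
git checkout -q -b coauthor
echo 'Theorem X holds for n > 2.' > paper.tex
git commit -q -a -m "Coauthor restricts theorem X"
git checkout -q "$main"
echo 'Theorem X holds for all n.' > paper.tex
git commit -q -a -m "Generalized theorem X"

# The merge stops and leaves conflict markers in paper.tex.
if git merge coauthor >/dev/null 2>&1; then
    echo "unexpected: merge succeeded"; exit 1
fi
git status --short      # "UU paper.tex": both sides modified the file
cat paper.tex

# Resolve by writing the intended text, then complete the merge.
echo 'Theorem X holds for all n > 2.' > paper.tex
git add paper.tex
git commit -q -m "Merged coauthor's version of theorem X"
git log --oneline --graph
```

For a rebase the last step would be git rebase --continue instead of git commit.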
If unclear about the present status of the working directory:
git status
This lists the conflicting files:
git ls-files -u
There is also the option to give up on the merge or rebase operation in progress and revert to the state before the pull. For a merge operation use
git merge --abort
alternatively
git reset --hard
For a rebase operation use
git rebase --abort
For example, this is a useful option if some half-finished manual resolution cannot be undone otherwise. However, all uncommitted changes, including the partial resolution, will be lost.
If you find no way to resolve a conflict, you can upload your changes
to the remote repository in a new branch, so that someone else can
take care of integrating your changes. For this purpose, first reset your
repository to a sane state with git merge --abort
or
git rebase --abort
. Then create a new branch head
named helpme (or anything else sensible) with
git branch helpme
Now push your new branch to the remote repository with
git push origin helpme:helpme
This creates a copy of your newly created branch on the remote repository.
Once someone else has merged your changes into the master branch,
simply download the update via git pull
, and delete your
temporary branch with
git branch -d helpme
Note that git branch
creates a new branch label, but does
not switch to it. It is not necessary to switch to the new branch for
uploading it to the remote repository.
To figure out what a command does and which options it takes:
man git command
or, depending on your installation,
man git-command
will show an extensive description of the git command
command. In addition, there is the book at
http://git-scm.com, as well
as answers to almost every possible question at
stackoverflow.
A very helpful option is to add color to all git output with
git config --global color.ui auto
To create a new branch:
git branch branchname [startpoint]
creates a new branch named branchname at commit/branch startpoint. If startpoint is omitted, the new branch will be created at the current branch head.
To switch to the new branch, do
git checkout branchname
To create and switch to the new branch in one go:
git checkout -b branchname [startpoint]
Branches can be configured to track other branches. The branch that is tracked is called the tracking branch (git's own documentation calls it the upstream branch). It is the branch that gets merged or updated when git pull or git push is called without arguments. When you clone a remote repository, your master branch by default tracks the remote master branch (which is labeled by origin/master in your local repository).
To set origin/remotebranch as the tracking branch for your
existing branch localbranch, do
git branch --set-upstream-to=origin/remotebranch localbranch
You can check your branch configuration, including the tracked branches, with
git branch -vv
Tags are a way to mark a certain status (milestone) of your work with a special label. For example, when you upload a paper to the arXiv, you might want to tag the corresponding commit with arxiv-v1 by
git tag arxiv-v1 [commit|branch]
If specified, commit or branch get tagged. By default, the current head gets tagged.
You can show the history of your repository with
git log
The output of this command can be configured in many ways. For example
git log --graph --all --decorate --pretty=format:'%C(auto)%h %d %s %C(blue bold)<%an> %C(cyan)(%cd)' --date=relative -20
shows the latest 20 commits, nicely formatted.
To avoid typing long commands over and over again, git lets you specify aliases. For example, to specify an alias sl (for (s)hort(l)og) for the above log command, do
git config --global alias.sl "log --graph --all --decorate --pretty=format:'%C(auto)%h %d %s %C(blue bold)<%an> %C(cyan)(%cd)' --date=relative -20"
Now a simple git sl
shows you the nicely formatted recent
history. Git aliases allow you to execute arbitrary commands, so you
can get very creative. For example, after a
git config --global alias.al '!git config -l | grep alias | cut -c 7- | sed "s/=/\t/"'
you can see all your defined aliases with a simple git al
.
Aliases are stored in git's global config file, which can be edited with
git config --global -e
Exercise: Create an alias ec for this command.
Sometimes you want to check out what someone else has just done while you are in the middle of editing yourself. Instead of saving your half-way edit into a dedicated commit, you can save it into the stash with
git stash
This saves your changes and resets your working directory to the previous commit. You can now pull/merge/rebase, or do further editing. Later on, you can reapply the changes that you had stashed away with
git stash pop
In case applying the stash fails due to conflicts, you need to resolve the conflicts. Afterwards you can delete the stash with
git stash drop
If the stash applies cleanly (no conflicts), git stash pop
implicitly does git stash drop
.
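The stash round trip can be sketched in a scratch repository as follows. The file contents are made up, and the identity is set through environment variables only so that the script does not depend on any existing configuration.

```shell
#!/bin/sh
set -e
# Sandbox identity (assumption: avoids relying on or changing git config).
export GIT_AUTHOR_NAME="First Last" GIT_AUTHOR_EMAIL="em@i.l"
export GIT_COMMITTER_NAME="First Last" GIT_COMMITTER_EMAIL="em@i.l"

demo=$(mktemp -d)
cd "$demo"
git init -q
echo 'Section 1' > paper.tex
git add paper.tex
git commit -q -m "Initial version"

# A half-finished edit that should not become a commit yet.
echo 'half-finished sentence' >> paper.tex

git stash -q            # put the edit aside
git status --short      # prints nothing: the working directory is clean

# ... at this point one could pull, merge or rebase ...

git stash pop -q        # bring the edit back
git status --short      # " M paper.tex": the edit is back, unstaged
git stash list          # prints nothing: pop has dropped the stash
```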
You can automate steps in your workflow with so-called hooks.
Hooks are arbitrary executable scripts located in .git/hooks/ that get executed every time a specific git command is called. Examples that include usage instructions should already be located in your .git/hooks/ directory.
For example, you could install scripts .git/hooks/post-commit and .git/hooks/post-merge that automatically compile your latex document every time the repository gets updated by a commit or by a pull/merge; a rebase triggers the post-rewrite hook instead. Note that the post-update hook, despite its name, runs in the remote repository after a push has been received.
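Installing a hook amounts to placing an executable script with the right name into .git/hooks/. The sketch below assumes the post-commit hook, which git runs after every successful commit. To stay runnable without a latex installation, the hypothetical hook only appends a line to a log file (.git/hook.log, a made-up name), and the place where a build command would go is marked by a comment.

```shell
#!/bin/sh
set -e
# Sandbox identity (assumption: avoids relying on or changing git config).
export GIT_AUTHOR_NAME="First Last" GIT_AUTHOR_EMAIL="em@i.l"
export GIT_COMMITTER_NAME="First Last" GIT_COMMITTER_EMAIL="em@i.l"

demo=$(mktemp -d)
cd "$demo"
git init -q
mkdir -p .git/hooks

# Install the hook: the file name selects the event, the body is free.
cat > .git/hooks/post-commit <<'EOF'
#!/bin/sh
# Runs in the top working directory after each successful commit.
# A latex build command for paper.tex could be placed here.
echo "post-commit hook ran" >> .git/hook.log
EOF
chmod +x .git/hooks/post-commit

# Any commit now triggers the hook.
echo 'Section 1' > paper.tex
git add paper.tex
git commit -q -m "Initial version"
cat .git/hook.log
```

Hooks live inside .git and are therefore not shared through push or pull; every collaborator installs their own.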