Getting git

I highly recommend getting a new git installation. Git itself is pretty stable (that is, you probably will not run into bugs with whatever version you have installed), but there are many usability related improvements. Specifically, I am using 1.7.x and it is likely that some things in this document are specific to that version.
You can download a recent version, available in binary form for several popular platforms. In addition to these, you can get a build for
  • Ubuntu:
    sudo add-apt-repository ppa:git-core/ppa
    sudo apt-get update
    sudo apt-get install git-core
  • OSX using macports:
    sudo port selfupdate
    sudo port install git-core +svn
(For OSX, you can also get GitX — it's a good gui front-end for git, similar to gitk and git gui.)
You can also build git from source is — here are the steps that I'm using to install a new version:
cd /tmp; curl $BASE/git-$GVER.tar.gz | gunzip | tar xf -; cd git-$GVER
make prefix=$TARGET all && sudo make prefix=$TARGET install
If you do this and you want the man pages too, then getting the pre-built man pages is the easiest route (building them requires some “exotic” tools):
cd $TARGET/share/man
curl $BASE/git-manpages-$GVER.tar.gz | gunzip | sudo tar xf -

General git setup

Commits to a git repository are done locally, so git needs to know who you are. (Unlike subversion, where you need to identify yourself to be able to talk to the server, so the commit object is created there based on who you authenticated as.) To get git to know you, run the following two commands:
git config --global "My Name"
git config --global ""
This sets your default name and email for all repositories — it stores this information in ~/.gitconfig which is the global configuration file, used for such defaults. You can edit this file directly too — it is in a fairly obvious textual format. There is a lot that can be configured, see below for some of these (and see the git-config man page for many more details).
In addition to this file, each repository has its own configuration file (located at .git/config). Whenever git needs to check some option, it will use both the repository-specific config file (if you're in a repository) and the global one. The --global flag above tells git to set the option in the global file. Note that a configuration file cannot be part of the repository itself — so when you get a repository, you still need to do any local configuration you want. (This is intentional, since the configuration file can specify various commands to run, so it avoids a major security hazard.)
Important: this sets your default identity name and email for all repositories. This may be a problem if you want to commit to different git repositories under different identities. See the section on customizing git below for more details on this.

SSH setup

Since git is a distributed system, you can do everything locally on your own repository, but obviously, the goal is to communicate with other people so you'll need to push these changes somewhere else. The most popular way to communicate with remote repositories — including repositories on the PLT server, is via ssh. (Access is controlled via a tool called “gitolite” — more on this below.) The username and hostname of the server is — and you should be able to connect to this account using the ssh identity key that corresponds to the public key that you use with the git server. To try it, run
and the server (gitolite, actually) should reply with information about your current permissions. The exact details of this is not important for now, just the fact that you were able to connect and get some reply.
Using an ssh configuration file (usually ~/.ssh/config), you can set up a short name for the server. For example, you can have this:
Host pltgit
  User git
and now you can simply use ssh pltgit info instead of the last example: ssh will know that pltgit is actually defined as
This is the preferred way to set things up: besides being more convenient in that you need to type less — it is also a useful extra level of indirection, so if the server settings ever change (for example, we might switch to a non-standard port number), you can simply edit your ssh config file, and continue working as usual. In addition, such a configuration is needed if you created a specific ssh identity file to be used with git — specifying an alternative identity file on the ssh command line is possible (an -i flag, in the case of openssh), but remember that most of your interactions with the remote server are done implicitly through git. (It is possible to configure how git invokes ssh, but it is much easier to just configure ssh). In this case, you will have:
Host pltgit
  User git
  IdentityFile ~/.ssh/my-plt-git-identity-file
In addition to an ssh configuration file, git also has a way to create prefix shorthands. For example, if you use this configuration:
git config --global foo:
then whenever git expects a repository URL, it will replace foo: with, for example:
git clone foo:bar
While it is possible to use this instead of an ssh config file to access the plt repository, the former is preferable. The reason for that is that you will also interact with the server directly via ssh commands (described in the following section). Keeping the alias in your ssh configuration means that you will use the same alias for both git commands and other ssh-based commands. You may still want to use it for other servers, specifically, here is a popular setup for github (this is configuration text that you can paste into your global .gitconfig file):
[url "git://"]
  insteadOf = github:
[url ""]
  pushInsteadOf = github:
  pushInsteadOf = git://
It translates github: to a github read-only git:// URL, and it translates pushes to the same prefix to use github's ssh URLs. Note that it also translates the read-only git:// url to an ssh url for pushing. (The same setup can be used for, to deal with github gists via git.)

Gitolite: the server's gateway

All access to the PLT server is done via ssh, and this is where gitolite comes in as the “who can do what” manager. What actually happens on the server is that no matter what command you're trying to run (as you usually would, for example: ssh somewhere ls), the server has settings that make it always run its own command — and that is a gitolite script. The script knows the command that you were actually trying to run, and it will reply based on that. In the above ssh example, you're not specifying any command (so if it wasn't for the pre-set gitolite script, you'd be asking for a remote shell to start), and gitolite responds by telling you about your permissions.
This is actually the info command, so you get the same reply with ssh pltgit info. Again, this connects to ssh and tries to run info; gitolite sees that you're trying to run info, and instead of running it, it responds with that information. There are a few additional commands that you can use this way — these are all “meta commands” in the sense that you're not interacting with a git process on the other end, but rather get gitolite to perform various tasks on your behalf. You can run the help command (ssh pltgit help) to see a list of available commands. They are mostly useful in dealing with your private repositories on the server, which will be discussed further below.

A (very) quick introduction to git

This is a quick description; see the last section for more resources (specifically, Git for Computer Scientists covers these basics well). Understanding how git models and stores data will make it significantly easier to work with it.
A git repository is actually a database of a few kinds of objects, which form a DAG. There are only a few of these kinds of objects, and they are all addressed by the SHA1 checksum of their contents. You will generally see a lot of these SHA1 strings (40 hexadecimal characters), since they form a kind of a universal address for such objects. (For convenience, any unique prefix of a SHA1 can be used with git commands when you need to refer to it.) Whenever the following descriptions mention a pointer — this is actually such a SHA1 hash.
  • A blob object is a generic container for any information, which (usually) represents a file. This object has no pointers to any other objects. It does not have anything except for the actual contents: no name, permission bits, etc.
  • A tree object represents a directory hierarchy: it contains a list of names, and for each name a pointer to the object that is its contents. Some of these will point at blobs (when the tree contains a file), and some of these will point at other trees (when it contains a sub-tree). (These objects are similar to directories in a file system in that they contain all “meta” information on files: their names and permission bits are kept here.)
  • A commit object represents a versioned snapshot of a tree, forming a line of work. It has the following bits of information:
    • tree: a pointer to the tree object that was committed
    • parent: a pointer to the previous commit, which this one revised
    • author: the identity of the commit author (name, email, date)
    • committer: the identity of the committer
    • the text of the commit message (which can be arbitrarily long)
    The parent field is actually any number of parents: there will be no parents if this is the first commit in the line of work, or more than one parent if this is a “merge” commit that merges two lines of work. Furthermore, there is nothing that prevents a git repository from having completely separate lines of work — in fact, you can have several independent projects contained in your repository.
    (Note that git distinguishes the author of a commit from the person who actually performed the commit, for example — a patch could be created by X, and sent to Y to be committed.)
  • Finally, there is a tag object, which is very roughly a pointer to another object (almost always a commit), and is not important for now.
The fact that all of these objects are addressed by the SHA1 hash of their contents has some immediate important implications.
  • Since SHA1 are cryptographic checksums, they can be considered unique for all practical purposes.
  • The git repository is inherently hash-consed: you can never have “two identical files” in git — because a file is stored at its SHA1 hash, two identical files will always be stored once. (Note that the name of a file is stored in the tree that contains it, so the SHA1 of the contents does not depend on it.) The same holds for two trees: if you have two identical directories (same contents of files, same names, etc), then there will actually be only one tree stored in the repository.
  • Furthermore, these addresses are actually global: any two repositories that hold a file with the same contents will have it at the exact same SHA1 hash. (For example, if I have a repository that contains several projects, and each project contains several copies of the same LGPL text, then I'll have only a single blob object with that contents.) This is not only making the store efficient, it also makes it possible to refer to an object by its hash — for example, you can refer to the SHA1 of a specific file at a specific version in an email, and this will have the exact same meaning for anyone that reads the file (eg, anyone can run git show SHA1 to see that file). (This does require that the readers have the actual object in their repository, of course — but no mistakes can happen, statistically speaking.)
  • This holds for commits too: since a commit has the SHA1 of the tree it points to, then the commit SHA1 depends on the tree it points to. More importantly, since a commit object has the SHA1 of its parent(s), then the commit depends on them. This means that “replaying” a number of commits on a different parent commit (eg, when doing a “rebase”) will always result in a separate line of commit objects. These SHA1s are also global, meaning that talking about a specific revision by its SHA1 will always refer to it unambiguously (as long as others have that object in their repositories).
  • By itself, this kind of storage cannot have any reference cycle. (At least there is no practical way to get one.) The storage is therefore inherently a DAG. In addition to this object store, git does have a number of external references (eg, a branch is actually a pointer to a SHA1) — and those could be arbitrary, but the object storage itself cannot have cycles.
  • The fact that a commit has a pointer to a tree is what makes git keep revisions of the whole tree — a commit cannot mark a change to a subtree. (At least not with the usual higher-level commands that git implements.)
On top of this object store, there is a layer of meta-information about it. The most important component here are branches (and tags). A branch is basically a file that has the SHA1 of a specific commit (for example, your master branch is a SHA1 that is stored in .git/refs/heads/master). This is what makes branch creation extremely cheap: all you need to do is create a new file with the SHA1.
In addition, the HEAD (where your working directory is currently), will usually have a symbolic reference rather than a SHA1 (you can see this symbolic reference in the .git/HEAD file, which should usually look like ref: refs/heads/branch-name). When you commit a new version, a new commit object is created, and the branch that the HEAD points to is updated. It is also possible to checkout a specific SHA1 of a commit directly — the result of this is called “detached HEAD”, since the HEAD is not a symbolic reference. The possible danger in doing this is that git commit will create new commits that are derived from the one you're on, but no branch is updated; if you later checkout something else, no reference is left to your new commit which means that it could be lost now. For this reason, if you checkout a SHA1 directly, git will spit out a detailed warning, including instructions on how you could name your current position (create a branch that points there).
Tags come in two flavors: lightweight tags are SHA1 pointers like branches. The problem with this is that such a tag could easily move to a different commit, which is considered bad practice. For this reason, there are also “annotated tags”, which are tag objects that are created in the object store. These tags contain information that is similar to a commit (there's the tagger's identity, the commit that it points to, and a log message) — and they are reliable since you can refer to their SHA1. In this case, the symbolic reference for such a tag (its name) will point to the tag object in the store (it is also possible to move it, but that would also be bad practice). Furthermore, tags (of both kinds) can point to any object in the store — they can point to a tree or even to a specific blob. This is sometimes used to store meta-information (eg, web pages) inside the repository. (The repository for git itself has a tag that points to a blob holding the maintainer's GPG key.)
Note that all of this is under a more high level of managing information between branches and repositories, with push/pull being the main operations at that level. A high-level overview (more below):
  • a branch is a line of development, represented as a pointer to the commit at its tip;
  • branches can be organized into hierarchies using / as a separator;
  • some branches are local, and some are remote — remote ones are named remotes/origin/branch;
  • local branches are represented as files in .git/refs/heads/branch and remote ones are in .git/refs/remotes/origin/branch;
  • origin is just the conventional name for the original repository you cloned — later on you can add new remote repositories so you can push and pull to/from them conveniently;
  • some local branches are set to track remote ones, usually (but not necessarily) the two will have the same name;
  • you can also have local branches to track other local branches (with pushing and pulling happening inside your repository);
  • git fetch is used to update your remote branches — ie, connect to the remote repository, get new commits (and the required parents and trees), and update your remote branch with the new tips;
  • git merge and git rebase are used to update one branch with commits on another;
  • git pull is, roughly speaking, a convenient way to do a fetch followed by a merge (or a rebase, when used with --rebase).
There are several git tools that are relevant here. These are not commands that you need to know for everyday use — so you can ignore this part. It's only relevant if you want to see more of the low level structure (or maybe if you want to write code that interfaces with a repository at this level).
git show SHA1
Show the object, in some appropriate way based on the type of the object. (For blobs it shows the contents, for trees you get a listing of its contents, and for commits it shows the log and the patch.)
git cat-file {-t | -s | type | -p} SHA1
A more low-level command that tells you the type/size of an object (-t and -s), or shows the contents of an object as-is when given a type. -p will “pretty-print” the object, eg, showing the contents of a tree object instead of dumping its binary encoding.
git gc
Starts from a rootset holding all known references (branches, tags, etc), and collects dangling objects. Such objects are generated due to various reasons — for example, rebasing means that new commits are generated, and the old ones are kept around. Actually, this will not remove recently referenced objects — there is a protection mechanism that keeps them around for a while, so if you somehow mess things up there is still a way to recover.
git fsck
Does a “file system check” on the repository.
git rev-parse symbolic-name
Prints out the full SHA1 of a symbolic name (eg, a branch name or a tag name). Will also print out the SHA1 given a possibly short prefix of one. (Actually, this command can also show other information about a repository, which makes it an important entry point for programs that deal with a repository.)

Clone the PLT repository

As you probably know by now, in git you don't checkout a repository — you clone it, getting a copy of the complete repository you cloned. This includes the object store and the various references (branches and tags). There are several ways to get the PLT repository, but the one that is relevant to work on it is to do so through ssh — since this allows pushing changes back to the server. (It is also possible to clone from one place and push to another, but if you start with cloning through ssh your clone will be already set up to push changes back.) The information that gitolite gives you (with ssh pltgit info, assuming the above ssh setup) includes two repositories that you have write access to: plt is the main repository, and play is setup similarly (intended to try things out, see the “Fooling around” section below). To get the main repository, run
git clone pltgit:plt
which will create a plt directory with your new clone. You can now start working in this directory.
The repository is also available from other sources, some can be used for read-only cloning:
  • git clone git://
    cloning the repository using git's own network protocol
  • git clone
    clone the repository over http
  • git clone
    this uses the repository mirror on github, which is automatically kept in sync (you can also use https://...)
and some present a web interface for additional information:

Start working: git commits vs subversion commits

As seen in the previous section, you start with
git clone pltgit:plt; cd plt
And now you get to actually do some work.
For the normal cycle of operations, working with git is not all that different from working with subversion — you would change some files, and then:
git commit some/paths
git commit some/paths -m "add some feature" -m "requires another"
only now the commit lives only in your clone only, not in the server (which is why committing is blindingly fast, not requiring a network connection). To push your commits to the server, run git push, and to pull updates from the server run git pull. This is obviously very much oversimplifying the process: mainly neglecting to talk about updates on the server when you already have local changes. (See below for a more detailed explanation.) Note that in these examples I'm explicitly specifying the paths to commit, either the files that you want to commit or a directory where you want to commit all changes. See the section below on the “staging area” for more details.
One major difference to keep in mind is that git commits are not like subversion commits. (This is confusing since many places that discuss the difference between the two and/or try to teach git to subversion users almost always work under the assumption that commits in the two systems are the same.) The thing is that git commits are done at a finer level than subversion commits — since a commit is done locally and not on the server. To really imitate how subversion works, you would push all commits right after you create them — essentially equating commits with pushes, which is how you work with a subversion repository. But by just not doing this, you will immediately get some of the benefits that git gives you. So a better way to think about it is: in git you commit at points that make sense for the respective changes, usually at a finer level than subversion commits. Then, you push back a bunch of commits to the server — whether one or a hundred. The point where you push your changes to the server is effectively the point where you decide that you're in a good enough state to make your work public.
Incidentally, following this intuition, drdr is running a build for every push to the server — not for every commit. When you push to the server, it will tell you which push number this is — these numbers are going to be used by drdr, and they (very!) roughly correspond to subversion commits. (Currently, every push gets a number, but in the future this might be used only for pushes to the master branch.) There's no plan at the moment to use these numbers for anything else.

Fooling around with git

Experimenting with git is easy to do, and the server is set up to make it even easier. You can use one of the following ways to experiment safely with the main repository:
  • There is a play repository on the server. This repository is very similar to the plt repository, and it is set up in the same way that plt is. Feel free to destroy it in any way you want, even if it becomes unusable, it's easy to just recreate it.
  • You can create (and later delete) your own repositories — including making your own copy of the main repository, an operation that is known as “fork”. Your fork will be created efficiently (ie, creating a fork of the plt repository is cheap), but any changes made to it will not affect the main repository. A fork is created with a gitolite command, and once it's there you can clone it and eventually delete it. Here are the relevant commands — use your actual username in place of $user (or have $user set to your username):
    1. ssh pltgit fork plt $user/myplt
    2. git clone pltgit:$user/myplt
    3. with this clone, push, pull, etc...
    4. ssh pltgit delete $user/myplt
    More on user repositories below.

The staging area

Something that tends to confuse people is git's “staging area”. This is a concept that is unique to git — roughly speaking, you can have three versions of a tree:
  • the files that you actually see (and edit) — the working directory,
  • another is the staging area which you add stuff to from the working tree using git add,
  • and then there is the tree that is in the HEAD with all prior versions.
The thing that can confuse here is that when you git add some/file for a file that you edited (or created) and then edit it further, then the version will get committed by a plain git commit will be the one that was added. Note that git status will tell you which modifications are in the staging area waiting to be committed, and which modifications are in your working directory — in the given example, it will tell you that some/file is in both.
The staging area can be useful at times, but most likely at the beginning stages you will want to just avoid it. The good news is that it is easy to do so.
Avoid the -a flag.
Before we see how to ignore it, note that there are many web pages that will tell you to use -a with the commit command. This will make git commit all changes to tracked files — including tracked files that are outside of your current directory, and this can make you commit changes that you didn't intend to commit.
Always specify a path to git commit.
The easiest way to avoid the staging area is to specify the path(s) to what you want to commit, possibly . for all changes in the current directory (and below). Specifying a path this way will make git commit behave very similarly to subversion: tracked files that were modified will get committed, and added files (with git add) that are listed in the paths-to-be-committed are also committed. Tracked and added files that are not listed (and not in a specified subdirectory) are left as-is. So, if you had a habit of doing this with subversion (svn commit .), then git will essentially do the same. You will still need to use git add for newly created files, but this is essentially the same as with subversion.
It is also possible to “make up new git commands” for yourself. See the following section on the subject: it adds a new git ci command that passes . to git commit, similarly to what svn ci does by default.
It is a good idea to avoid using the -m flag, until you're more comfortable with git.
Let git pop up an editor to write the commit log: the file that you will edit will list the changes that you are about to commit as well as changes that you are not going to commit. Glancing through it, you will see changes that you missed, furthermore, the paths are relative which makes it easy to quickly distinguish paths in the current directory and outside of it (the latter will begin with ../). If you see any problems, just make sure that you have no commit message in the editor and when you quit it git will abort the commit (same as subversion).
Don't push out all commits to the server immediately.
Even if you did commit something by mistake, it is possible to undo the commit — run git reset HEAD^, which will undo the last commit (it moves the branch to the parent commit), and the changes that are no longer committed will be left in your working directory, so you get to try the commit again. Note that this is possible only if you didn't push out the commit that you're undoing — if you did, then the server will later not allow you to push changes that are not strict extensions of what it has (since this is likely to confuse other people who already got your commit).
So in general, remember that you can commit often, and commit when it makes sense to do so, and push commits out only when you're done with whatever you were working on. Consider your local git history as something that you have full control over: you can undo commits and redo them (in fact, git commit --amend does just that: undo the last commit, and combine it with new changes — it's a solution for “oops commits”), you can rebase them, and you can just throw away everything and start from scratch. But when you do push your history out, the party is over, and any mistakes will need to be rectified in further commits (eg, you can no longer use that --amend flag, you have to do an “oops commit”).
(BTW, strictly speaking, it is only the policy on our server that forbids such rewritten history — since this is likely to be a mistake for now, and if it happens most people will be confused about what needs to be done.)
Also, as said above, pushing all commits immediately means that you're essentially restricting yourself to the same mode of operation as subversion. Same mode, but more complicated — and you won't enjoy any of the benefits, which will guarantee that you will suffer.

Configuring and extending git

As mentioned above, git uses several configuration files that customize various aspects. The two important ones are your global file (~/.gitconfig) and a per-repository file in .git/config at the repository root — whenever a value is needed, git will first consult the repository configuration, and if the option is not set there it will try your global version. (It will also look at a system-wide configuration, but this is irrelevant here.) Configuration options names are separated by a ., and configuration files have a simple syntax, with option listed as a bar = some value line in a [foo] section. Note that you can set any configuration you want to, no restrictions. This can be useful for customizing various extensions, including scripts that you may want to write (for example, the git server has a script that checks the hooks.counter option to know if it should keep track of pushes). This is facilitated by several options to git config which makes it easy to query configurations from scripts etc.
To edit your configuration options you can either use the git config command, or you can edit the file directly. When git config changes the file it rewrites only part of the file and leaves the rest untouched, which conveniently leaves your own format and any comments you might want to include. To set a value through the command and then get it:
git config some-value
git config
and you can add a --global flag to either form to use only your global configuration file. There are many other options — for dealing with keys that must be booleans or integers, for keys with multiple values, etc. The git-config man page will tell you much more on this.
The man page also lists the configuration options that customize various functionalities. Here are some important ones that you should consider setting (each listed as a command that sets it globally):
  • git config --global "My Name"
    git config --global ""
    As said in the beginning of this text, you will likely want to set a default username and email for yourself. But note that if you do set this globally, it will be your default identity for all repositories. This makes sense only if you commit to PLT-related repositories, but it can be confusing if you're also committing to some other non-PLT-related repositories and want to commit under a different email (or name) — for example, you may want to commit to a public project with a gmail address, and to a departmental repository with your email.
    You could set the identity locally in your PLT clone or you could set your other identity in the other repository, but in any case you should be aware of this and avoid letting git guess your name and email. (Some confusion is likely to happen anyway, and git has a way to “map” some name/email to another when mistakes happen.)
  • git config --global push.default simple
    By default, when you run git push, git will push all branches that correspond to branches in the remote repository. This can be surprising if you're working on several branches since it will push them all out. Setting this option to upstream will make git push the current branch to the branch it is tracking, and on newer git versions setting it to simple is similar except that it will refuse to create a remote branch. (And on really old systems, you should use a value of tracking, which is the old and now-deprecated synonym for upstream.)
    Another option for this is current, which makes git push always push the current branch to the remote it was cloned from. This is convenient in that you never need to set up how local branches track remote ones — it's as if all local branches always track all remote branches under the same name. For example, after you clone an empty repository (see the user repositories section below), a git push will push a master branch remotely — whereas with tracking you need to have the first push explicitly specify the branch to push, usually git push origin master (this sets things up so later you can just run git push). However, using current you can no longer push from one local branch to another local branch it is set to track.
    So a possible conclusion here is that you should use tracking, unless you plan on branches to always track remote branches by the same name. tracking is often preferred over current.
  • git config --global core.excludesfile "~/.gitignore"
    You'll probably want to always ignore a number of common patterns, like backup files or OSX .DS_Store files. To do this, you first set a default file as shown here (note that ~ is quoted, and git will expand it to whatever your home directory is). If you have this setting, you can then create a file at this path with patterns for files that you always want to ignore. This file has shell-patterns (and possibly #-comments) — for example:
    # backups
    # autosaves (note the #-quoting)
    # OSX junk
    (See the gitignore man page for a few more details.)
    In addition to this, git repositories can have their own .gitignore files (unlike .git/config files), which are combined hierarchically together with this global option. In fact, you don't really need to set the above ignores for the PLT repository since they're already included in its toplevel .gitignore file — but doing so is still a good idea since you're likely to work on other repositories too.
  • git config --global core.editor emacs
    git config --global core.pager less
    These two settings are used to tell git which command to use for editing log messages, and which command is used to paginate output. (The former might already be set in your environment as the value of the EDITOR variable.) If you set the latter to cat, git will just spill all output directly. In addition to these, you can also control which individual commands use a pager, for example, to disable the pager for git log, you can do this:
    git config --global pager.log false
  • git config --global color.ui auto
    git config --global color.branch.current yellow red bold
    git config --global color.branch.local   yellow
    git config --global color.branch.remote  green
    These settings control how git uses colors: whether it shows them, and which colors it will use for various outputs. There are many of these settings, which you can find in the git-config man page.
  • remote.origin.*, branch.master.*
    Git keeps track of what the origin repository and how branches track other branches in the configuration file too. (You will have such entries for all known remote repositories and branches.) Usually you set these values (often implicitly) via various git commands — but you might want to look in your configuration file if you want to tweak things yourself. Note that for configuration names with more than two parts, the section name will something like [remote "origin"].
  • sendemail.identity, sendemail.from, sendemail.bcc, sendemail.suppresscc, ...
    These settings configure git send-email, which is used to send patches from your repository elsewhere. You will probably want to customize them if/when you get to use this facility often. (See below.)
In addition to these settings, you can extend git with your own aliases and commands. Aliases are stored in your git configuration — so you can use git config to create an alias, for example,
git config --global
alias.up "pull --stat --all"
creates a global git up command which is actually a short alias for running git pull --stat --all.
Notes about aliases:
  • To edit aliases, it is more convenient to edit your configuration file directly.
  • Since aliases are stored in git configuration files, they can be made local to each repository.
  • When command-line arguments are given to the alias, they will be appended to the alias text.
  • Aliases cannot override git commands; this is intentional, to avoid scripts breaking due to modified commands.
  • An alias that starts with a ! character will be run as a shell command. For example, you can use
    k = "!gitk -d"
    to make git k run the gitk program with the -d flag.
  • Some aliases that I find useful are:
    # satisfy the "up instinct"
    up = pull --ff-only --stat --all
    # quick status, similar to what subversion shows
    st = status -s
    # we will be dealing more with branches
    br = branch
In addition to aliases, you can create new git commands using a script that is called git-something somewhere in your path. (Note that these cannot override known git commands either.) Such commands will be available as git something. One use for this is using our facility for managing file properties — the collects/meta/props program. To do this put this in a file called git-prop somewhere in your PATH, for example, ~/bin/git-prop:
top="$(git rev-parse --show-toplevel)" || exit 1
exec "$top/collects/meta/props" "$@"
then run chmod +x ~/bin/git-prop, and you can now use it as a git command (try git prop -h). Note the use of the rev-parse command: it will display the repository root, which means that you will get to run the props script of the repository you're currently using. (There are many git commands that are useful for such scripts.)
Another useful script is git-ci which mimics the behavior of svn commit (avoiding confusion with the staging area). As said above, a good way to achieve this is to specify the current directory (.) if you don't specify any other path. If you save this as git-ci, you will get a git ci that does just that:
add_dot=yes; for p; do if [ -e "$p" ]; then add_dot=no; fi; done
if [ -e "$(git rev-parse --git-dir)/MERGE_HEAD" ]; then add_dot=no; fi
if [ -d "$(git rev-parse --git-dir)/rebase-apply" ]; then add_dot=no; fi
if [ $add_dot = yes ]; then git commit . "$@"; else git commit "$@"; fi
This small script will basically check all arguments and see if one is an existing path (or if you're resolving a merge). If none are, it adds . as a first argument (this avoids confusing it as a value for some flag). Note that this is not completely foolproof: for example, if you'll use the -m . hack, it will assume that you did specify a path. (But you should really avoid such log messages.)

User repositories

As mentioned above, the PLT server allows you to create your own repositories. Repositories on the server can be organized in a nested directory structure, and you “own” all repositories that are in a directory with your username. The gitolite info command that was mentioned above shows you this with a C CREATER/.* line: this means that you can create any repository if it is in a subdirectory with your username. (In this discussion, “the server” is actually the gitolite script that runs on the server.)
Any git operation that you do on a repository that you own which does not exist will make the server create it for you — for example, if you clone such a repository. To run these examples use your git username instead of $user, or simply set the user variable in your shell (as this example shows) — this is only to make copy-pasting easy, of course.
user=eli # your own username here
git clone pltgit:$user/foo
(You are encouraged to run these commands — at the end of this section you'll see how you can clean things up.)
What will happen now is (a) git will initialize a foo repository for you, (b) it will connect to the server to clone its contents, (c) the server will notice that it doesn't exist so it will create it, (d) your git process will clone the empty result. Because the remote repository is empty, git will complain that “You appear to have cloned an empty repository” — this is expected, so you shouldn't worry about it. Once you have your (empty) clone, you can populate it as usual, then push the new content back to the server's copy:
cd foo
...create some files...
git add .
git commit -m "initial content"
git push origin master
Note that the last command explicitly names the branch to push over — once this push is done, git will remember this relation and further pushes can be done with just git push. If you happen to forget this and use git push, then git will not push anything, and it will tell you about it and suggest specifying a branch. On the other hand, if you set the push.default configuration option to current (as described in the customization section above), then even in this first push you can just run git push since git assumes that you always want local branches to correspond to remote branches by the same name.
Instead of cloning a repository to create a new one, you could also start with an existing repository and simply push it to a yet-to-exist repository on the server. Again, the server will see that it doesn't exist and will create it for you (provided that it is in your directory). To continue the above example, I could now create a new repository from foo:
# (still in the foo directory)
git push pltgit:$user/bar
There is, however, an issue of efficiency here: with this last command I just created a second copy of it all. This could be problematic if you have a large repository (eg, a copy of the plt repository). (Note that with subversion this is the only way to do things, but there you would create copies inside the tree, which subversion optimizes.) One nice feature of git is that creating a clone of a repository on the same filesystem will use hard-links for the clone, which makes the clone use very little additional space. But the problem is that you have no access to the PLT server. The solution here is in the form of a gitolite fork command (this is actually our own extension) — this command will create a clone on the server, starting from a specified repository. I could therefore create my bar repository as a copy of foo with the following:
ssh pltgit fork $user/foo $user/bar
(Note that if you follow these examples and you already have bar, the server will tell you about it.) The result is a $user/bar repository that was cloned from $user/foo, and the two share their store using hard links. If the two repositories are updated with identical content, the new content will not be shared, but for a large repository like the plt repository you still get the benefit of having the bulk of the data shared (the complete store, at the time of forking). To get a feeling for how fast this is, you can now clone the plt repository to your own private copy:
ssh pltgit fork plt $user/myplt
This would seem suspiciously fast for such a large repository — but this repository has most of the data packed (objects in the store are put in large “pack files”), so there are not too many files, and the server-side cloning basically created hard links to these files. The result is fast, efficient (even in speed: when you interact with your clone, files are likely to be paged in memory), and cheap.
As we've seen above, the gitolite info command lists the permissions that you have, but it doesn't show you the actual repositories. For this, there is an expand command. (Yes, this is not a great name; it's related to how gitolite was intended to be used. Remember that there is also a help command that describes the available gitolite commands.) When you run the expand command — ssh pltgit expand — you get a listing of all of the repositories that you can access, each with an indication of read permissions (R) or write permission (W). A @ indicator means that you have the respective permissions because it is allowed for all users. Each repository is also listed with its owner, or <gitolite> in case it is a globally configured repository.
Some of the gitolite commands are used to configure your repositories — you can only use these with repositories that you own.
  • getperms and setperms — these are used to get or set permissions for your repositories. The first will print the current permissions (which will be initially empty), and the second will read the permissions on its input and set them. The format of the permissions is simple: each line begins with an R or RW, and then the usernames that this permission applies to. You can use the magic username @all to grant access to everyone in the system. For example, to grant read permissions to everyone, and write permissions to user1, create a file with:
    R @all
    RW user1 user2
    and then run the setperms commands with this as its input:
    ssh pltgit setperms $user/foo < the-file
    You can also just run the setperms command and type in the permissions directly. Note that these permissions are not cumulative: every use of setperms specifies all permissions. (We might have a more convenient interface for all of this in the future.)
  • config — this command can be used to set known configuration options in your repositories. It works with sub-verbs:
    ssh pltgit config list
    Displays the known configuration options
    ssh pltgit config get repo config
    Displays the configuration value of a specific repository
    ssh pltgit config set repo config value
    Sets the configuration value of a specific repository
    These configuration options can customize aspects of the scripts that run after every push — currently, you can use this to set an email address to send notification emails to. Other configurations may be added in the future. (Note that this does not let you set any configuration, since some of these can execute arbitrary commands.)
  • delete — finally, this command can be used to delete repositories. For example, to clean up the above, you can now run:
    ssh pltgit delete $user/foo
    ssh pltgit delete $user/bar
    The repositories are moved to a temporary holding directory, and will eventually be removed. The bottom line here is that if you lost anything by mistake recently, chances are there's a backup of your repository.

Working with git

Working with git: Basics

The above description is much simplified in that it doesn't deal with development that happens outside of your own work — and such development obviously changes the story. Overall, this is not too different from working with subversion: if there were any changes on the server you need to update your working copy first, and this implies dealing with conflicts if there are any. But the way to deal with such things in git is significantly different than dealing with them in subversion, and this difference is at the technical level (different commands) and at the workflow level (you will likely branch much more, and you're likely to push less frequently than you commit with subversion).
To start working, you first need to get a repository clone. Usually you will clone the PLT repository (or a private copy that you do your work in), but remember that to experiment with git you have the play repository or you could make a fork of the PLT repository to play with and remove it when you're done. Either way, be sure to try these things out — it will make your life much easier in the future.
In the following examples I will use an empty repository to demonstrate things and I will list the exact commands that I'm using (this means that I will use unix commands to create and edit files, and use -m when committing). Lines that I enter are displayed with a $ prompt, most output lines are omitted, comments start with #, and $user is set to my username. Note that if you try this yourself, the SHA1s of commits will be different (the reason for that is that a commit object includes the author name and email and the date). Note also that in some places I will “jump to an earlier continuation”: start from an earlier state and do something different — so if you want to try these things out it will be convenient to put the commands in a shell script so you can re-run it to get to the earlier state.
First, I create a private empty repository, populate it, and update the remote repository:
$ user=eli                    # your own username here
$ mkdir /tmp/sandbox; cd /tmp/sandbox
$ ssh pltgit delete $user/foo # delete previous repository, if any
$ git clone pltgit:$user/foo
$ cd foo
$ echo "foo" > foo; echo "bar" > bar
$ git add .
$ git st                      # uses the `st' alias as shown above
A  bar
A  foo
$ git commit -m "initial content" .
[master (root-commit) 87f1f02] initial content
# git tells us the branch we committed to, the new commit SHA1 and
# that this is the first commit, and the log message; we can verify
# this now with `git log'
$ git log
commit 87f1f02c23b32e7f9b...  # this is the commit object I created
$ git push                    # (or `git push origin master' if needed)
To pltgit:eli/foo             # where we pushed to, and the branch
 * [new branch]      master -> master
[A quick note on commit messages: several git command consider the first paragraph of your commit message as a short description for it. This is all of the text up to the first blank line if you write a commit message in an editor, or the first -m message if you use it with git commit (it accepts multiple -m arguments, for multiple paragraphs). Keep this in mind when composing such messages.]
To see what happens when multiple people commit to the repository, we create a second clone of our repository now in a foo2 directory:
$ cd ..
$ git clone pltgit:$user/foo foo2
$ cd foo                      # go back to foo now
Lets make two new commits now:
$ echo "more foo" >> foo; echo "more bar" >> bar
$ git ci -m "more stuff"      # uses the `git-ci' script from above
[master b7d3c41] more stuff
$ echo "even more foo" >> foo
$ git ci -m "even more stuff"
[master 18bc0e6] even more stuff
At this point, instead of blindly pushing these commits, lets look around first. One useful tool for inspecting the history is gitk — if you run it now, you will see the simple 3-commit graph, and two of them are marked as branches — clearly showing that your local master branch is 2 commits ahead of the remote one. This could be different at this point: someone else might have pushed more commits to the remote — remember that your remote master branch (remotes/origin/master) is not really what's on the remote, but rather what you know about it last time you pulled from it.
Another useful command to examine the history is git log, which can show commit history in many ways. As things stand in the current repository, if you just run git log, you will see a listing of the same three commits that gitk shows. To get a more condensed format with one-line-per-commit, use --oneline. Another thing that you can do is show a specific range of commits — you can do this by specifying two revisions separated with .., where the revisions can be written explicitly using the (short prefix) SHA1 form, or more conveniently using a symbolic name (eg, branch, tag, HEAD):
$ git log origin/master..master
Since we are currently on the master branch, we could use HEAD for the second one (origin/master..HEAD), but this is also the default, so an even shorter form is origin/master... In addition to git log, you can also use git diff in a similar way, but instead of a commit listing, you get the diff between the two specified points, so
$ git diff origin/master..
will show you the changes that you did not yet push. Note that there are a number of places where git will guess the full name of branches, for example, origin/master is actually a short name for remotes/origin/master. In a similar way, just origin will make git guess that you're talking about origin/master.
In these cases, the revision specification for log and diff are the same, but this is a little misleading: git diff usually works by comparing two specific end points in your history, but git log actually works on a set of commits rather than on a range. The R1..R2 notation is actually shorthand for ^R1 R2 — specifying a commit means “the commit and all of its parents”, and a ^ prefix negates a set, so ^R1 R2 means “include the set of commits leading to R2 (inclusive), but exclude the ones leading to R1 (inclusive)”.
In addition to this range/set specification, there is a lot to specifying a revision set. As mentioned, you can use a SHA1 (or a shorter unique prefix), or you can use a symbolic name. You can also use R^ for the parent of R (the first parent in the case of merge commits, which have more than one parent), R^^ would be the grand parent, R~3 is the 3rd-generation parent commit. There is also R@{yesterday}, R@{1 month 2 weeks go} etc for a symbolic name R — which refers to the branch/HEAD at that point in time (this refers to your own version of it at that time; there are --since and --until flags to filter commits by the time they were made at). You can also use a -N flag (where N is an integer) to show only N commits. Finally, you can use a branch name R with R@{upstream} (short: R@{u}) to refer to the “upstream” version of the branch — the branch that R is set to follow. This is particularly convenient for things like
$ git log --oneline @{u}..
which will always show the commits that you have over the branch your current one follows. (For example, you could set up an alias to use this.)
We now continue by pushing our two commits to the remote server. Since we already did a push, a plain git push works fine.
$ git push                    # no need to specify a branch now
To pltgit:eli/foo
   87f1f02..18bc0e6  master -> master
As you can see in the last line, git tells us that we pushed from our local master branch to the remote master branch, and that this made it advance from the first commit we pushed (87f1f02) to the last one we created now (18bc0e6).
Since we've made some progress in one place, we can go to the foo2 clone to see what happens when we update a repository that did not have these changes:
$ cd ../foo2
$ git pull
From pltgit:eli/foo
   87f1f02..18bc0e6  master     -> origin/master
Updating 87f1f02..18bc0e6
 bar |    1 +
 foo |    2 ++
 2 files changed, 3 insertions(+), 0 deletions(-)
This looks expected — git shows the new commits that we received, and they're the same as what we pushed earlier. Using git pull is actually doing two things: it's running git fetch first to update your remote branch from the server, and then it uses git merge to merge it into your master branch. The point where get merge starts is the “Updating” line — and there's an important thing to note here: the next line says “Fast-forward”, which is a special kind of a merge. When you merge some branch into your branch, and this branch is a proper superset of your branch (it has commits that your branch doesn't, and all commits in your branch are included in it), git will simply “move your branch forward” to the other: it will update your branch to the tip of the merged one, and then your working directory will be update accordingly.
It is often better to do the fetch first, so you can see the changes that happened remotely before you merge them. To do this we're going to use git fetch, avoiding the merge step that git pull does. In fact, since creating a merge commit is something that you might want to always do, git pull --ff-only will only do the merge if it will be a fast-forward merge.
Assuming we start again from the foo2 repository before the above pull, we get the same output up to the point where the merge started:
$ git fetch
From pltgit:eli/foo
   87f1f02..18bc0e6  master     -> origin/master
$ git log --oneline @{u}..
# nothing
$ git log --oneline ..@{u}
# the same two commits
The first log doesn't show anything, since we have no commits over the ones in the remote (the “upstream” of our current branch). To see this, consider that after expanding empty names to HEAD, and the @{u} to the remote branch name we get remotes/origin/master..master, and this is short for ^remotes/origin/master master — the set of commits made of our master branch and all parents, minus the set of commits from the remote branch and up — since it's ahead of that, we get an empty set. The second log command reverses the two, giving us the set of commits that the remote has and the local branch doesn't. In addition to git log, you can use gitk to inspect the repository: use the --all flag to make it show all branches. Either way you'll be able to see that a fast-forward merge is possible.

Working with git: Concurrent development

Again, we'll assume starting with the foo2 repository before the pull. We will now create a new commit before we get changes. This makes it similar to commits pushed to the server while you do your work — so let's see how this common story goes and try to push this change:
$ echo "blah blah" > blah
$ git add blah
$ git ci -m "blah"
$ git push
To pltgit:eli/foo
 ! [rejected]        master -> master (non-fast-forward)
error: failed to push some refs to 'pltgit:eli/foo'
To prevent you from losing history, non-fast-forward updates were rejected
Merge the remote changes before pushing again.  See the 'Note about
fast-forwards' section of 'git push --help' for details.
As expected, git refuses to push our change. The terminology in the error message is a little confusing — what it basically says is that the commit(s) that we are trying to push are not an extension of the tip of the master branch on the server. A “non-fast-forward update” in this case would mean that we'd set the master branch on the remote to be the same as our branch — but this means that whatever commit that were pushed to the server (the two commits we pushed out from the foo clone in this case) will be lost.
To merge in the remote changes, we need to pull them in. We'll now look at three different ways to do this.

1. Playing it safe

First, as we've seen above, doing a separate git fetch step would allow you to see where things stand before you do anything. Alternatively, we can use the --ff-only variant of git pull, which will do a merge only if it's a fast-forward one, covering the trivial cases. If a fast-forward merge cannot be done, it will tell you about it and then stop:
$ git pull --ff-only
From pltgit:eli/foo
   87f1f02..18bc0e6  master     -> origin/master
fatal: Not possible to fast-forward, aborting.
We can now use the usual tools to see where things stand. The following are all useful here:
  • gitk --all
    This visualizes the commit graph. If you do this, you will see the four commits that we have so far: the initial commit at the root, the commit that we did in this clone, and the two commit that we retrieved and are waiting on remotes/origin/master.
  • (a) git log @{u}..
    (b) git log ..@{u}
    (c) git diff @{u}..
    (c) git diff ..@{u}
    Inspect the commit that the local branch has over the remote (a), and the two that the remote has over the local one (b); look at the difference between the local branch and the remote either way (c).
  • (a) git log --left-right --oneline @{u}...
    (b) git log --left-right --oneline ...@{u}
    (c) git log --graph --oneline --all
    An alternative notation for specifying commit sets for git log is R1...R2 (with three dots) — this stands for all commits from both R1 and R2 and their parents, but excluding commits from their “merge base” — the parent commit that both descend from. As demonstrated in (a), this is especially useful with the --left-right flag: you'll see the commits that are new on the remote branch and the ones that are new on the local one, with < or > indicating which side each commit is coming from. Yet another way that git log can be used is (c) with a --graph flag, which makes it render the commit graph in ASCII-art.
  • git show-branch -a
    This is another potentially useful commands that shows how commits are distributed over branches in your repository. In this case you will see a matrix with the four commits in separate rows, and each will have a + or * indicating whether it is included in a branch.
  • (a) git diff @{u}...
    (b) git diff ...@{u}
    The three-dots notation is also used by git diff, with a slightly different semantics than in git log (remember that git log talks about commit sets, and git diff compares two specific points). In the diff case, these compare a specific branch tip to the merge base of this branch and another, which means that you see a diff with the work done on one branch that is not included in the other (this is unlike the two-dot syntax where you get the diff between the two branch tips). In the first (a) example you will see all changes that you did locally, and in (b) the changes that were done remotely.

2. Merging (not the fast-forward variant)

Now that we did a git fetch or a git pull --ff-only to update the remote branch, we can proceed with merging it into our branch — which we can do in one of two ways:
$ git merge origin            # merge origin/master into master
Merge made by recursive.
# -or-
$ git pull                    # will do a merge as usual
Using merge is better for the usual reason: git pull can bring in more updates that were pushed by others since you fetched, and include them in the merge. Note that if instead of running a separate fetch or pulling with --ff-only you were using git pull, then you'd essentially get to the same point we are now at.
Either way, the merge tells us “Merge made by recursive” — which is important here: if we see “Fast-forward” it means that no new commits were made, but “Merge made ...” means that a merge commit was created (“by recursive” refers to the merge strategy, git has several of them). If we look at the commit graph now with gitk or with git log, we'll see the new commit that was created:
$ git log --oneline --graph --date-order
*   12bf7ee Merge remote branch 'origin'
* | a40a45f blah
| * 18bc0e6 even more stuff
| * b7d3c41 more stuff
* 87f1f02 initial content
The merge was successful without manual intervention (you didn't need to resolve any conflicts), so it proceeded immediately to create the commit that connects the two lines of development, and used some standard template for the commit log. This is the safe thing to do as the default for git, but in this case it makes the history complicated with no good reason — it just happened that while we were working on blah someone else pushed an unrelated change. (It could be related: perhaps one of the remote commits was referring to the file that I've added, but especially in the PLT case this would be very rare.) If you use gitk to look at the recent history of the repository, you'll see many such commits, and they can make it harder to figure out how the development went on.

3. Rebasing

To get a simple/readable history, the goal is to have a more linear history: just have the two remote commits and move our commit to follow them (which would be the history that you get with subversion under similar conditions). But our commit object already points to its parent, so it cannot move: in the above graph, a40a45f is a hash that was computed based on 87f1f02 being its parent, and changing a parent means getting a new hash and therefore a new commit object.
This is where git rebase gets into the picture. Assuming that we didn't merge as described above, we would just use rebase instead:
$ git log --graph --all --oneline
* a40a45f blah                # \
| * 18bc0e6 even more stuff   #  \
| * b7d3c41 more stuff        #   > same tree as before the merge
|/                            #  /
* 87f1f02 initial content     # /
$ git rebase origin
First, rewinding head to replay your work on top of it...
Applying: blah                # git tells us that it re-applies this
$ git log --graph --all --oneline
* 6ebd1fb blah                # \
* 18bc0e6 even more stuff     #  \ the resulting history
* b7d3c41 more stuff          #  / is linear
* 87f1f02 initial content     # /
Here's what git did in this rebase: it (1) moved the HEAD to the merge base between your local branch and the remote one — the 87f1f02 commit; (2) did a fast-forward merge, which just moves your local branch to the tip of the remote one; (3) it now replays the same changes that you had in your commits (only a40a45f in this case) on top of the new tip, leading to new commit objects (6ebd1fb in here). So we end up with a fresh commit object, and the old one (a40a45f) is gone. (It's not really gone — it's kept in your repository store for a while, to protect you from losing work.) If you look at the complete details of the new commit using gitk or git log, you will see that this commit has different dates for the author and the committer date:
$ git log --format=fuller -1 6ebd1fb # all details, show only one commit
AuthorDate: 2010-05-02 10:26:00 -0500
CommitDate: 2010-05-02 10:30:00 -0500
This is because rebasing just created a new commit — but the author time is still considered the same. In any case, you now have linear history that is a proper descendant of the remote branch, and you can now push your changes out.
Practically every place where you read about rebasing in git will warn you about not doing it for public history. The problem is that if someone had a copy of your previous branch (a40a45f), then next time they will update, if their copy of the branch is updated, then things will change in a nasty way: even more so if they've committed more changes on that branch. (This “someone” can also be yourself, of course.) This also explains why git pull does not rebase by default.
As far as the PLT repository goes, server will never allow pushes that are not strict extensions of what's on it (in other words, it only allows fast-forward pushes) — so you won't be able to mess things up for others. But as long as your change is something that you work on privately, there is absolutely no problem in doing this. Note that this is the same thing as with re-doing commits because of mistakes: as long as a commit did not go out, you can fix it in multiple ways; when it does get pushed to the server, the only practical way to fix it is by pushing another commit. (In some extreme cases we may do such a thing: for example, if you commit a passwords file, then there is no other way to remove it completely from the repository — but these are very rare, and such fixes affect everyone.)
It is therefore best to get one of two habits when you do a git pull: either use --ff-only or --rebase. The latter is a little more convenient but you might not feel comfortable about doing a rebase automatically — it might just be that someone worked on the same set of files, and you really prefer a plain merge. For this, you might prefer using --ff-only which will automatically work in the trivial cases, and otherwise leave you in a state where you can look at things and decide how to proceed yourself.

Working with git: Additional forms of history tweaking

As described in the previous section, rebasing is not some kind of a magical operation: it is really just an expected by-product of the way git works — of the fact that commits can be created as descendants of any commit (not just tip commits). You could perform a rebase manually by starting from some commit, then inspect each of the changesets that the rebased history contains, and play them back on the new commit. (Lumping this tediousness lead to a script, which lead to the rebase command.) This means that you don't really have to limit yourself to replaying these commits exactly as they were — for example, you could write new commit log messages, combine two commits into one, drop some commits, or reorder their order.
The rebase command has a flag that makes doing all of these things easy: --interactive. Continuing the above example, we now have four commits in our history — and say that we want to tweak the last two. If we now run
$ git rebase HEAD~2
we ask git to rebase our current head off of its grandparent commit (remember that HEAD^ is the parent, HEAD^^ is the grandparent, and HEAD~2 is an alternative syntax for HEAD^^). Since it is already based there, git rebase does nothing, and tells us that the branch is up to date. But if we add --interactive, we get something different: git pops up an editor with this text:
pick 18bc0e6 even more stuff
pick 6ebd1fb blah
# Rebase b7d3c41..6ebd1fb onto b7d3c41
# ...
This is a listing of the last two commits with their one-line log messages. As the text that is below these lines say, you can replace the pick before a commit with a different command: you can use reword to get to write a different log message (you will get another editor window to do the editing), squash to combine a commit with the previous one (it will let you edit the log message for the combination), fixup which does the same but discards the log message, and finally edit will make the rebasing process stop at the relevant commit and let you tweak it before it continues. In addition, removing a commit line means that the commit will be skipped, and reordering lines will replay the commits in a different order. If any of these lead to conflicts, the rebasing will stop for a manual resolution, and you'll need to git rebase --continue when resolved, or git rebase --abort to get back to the original state.
Note that since our foo2 clone tracks a public foo repository, this particular rebase is bad: we intend to edit the last two commits, but only one is local to foo2 — the other is a commit that we got from foo, and changing it means that we will get a rebased commit based on its parent, and the server will forbid pushing it later. If you see that you went too far when you see the rebase editor, all you need to do is keep those lines untouched: in the trivial case of leaving an initial set of commits in unmodified, they will be “rebased” by leaving them in as is. (If you inspect the history later, you will see that they have the same SHA1s.)
A useful case of using squash (or fixup) with interactive rebasing is doing checkpoint commits frequently, and eventually combining them to a single commit. This demonstrates one a popular git principle: keep commits as logical units that correspond with the changes done, since there is no central server that dictates a public-commit-or-nothing. Doing these will make it easier in the future to deal with the history: inspect the changeset as a whole, undo it, and it also works well with finding bugs in the history — you can have checkpoints for intermediate states of the code even if it doesn't work, since this state will eventually be hidden.
A particularly common case for editing history is “oops commits”: you just made some change, committed it, and then realized that something is wrong — you forgot to change some related reference, to remove some debugging printout, or to describe some new aspect of the commit. You could use git rebase HEAD^ in these cases to rebase just the last commit while editing it, but there is a much more convenient way to do this: git commit --amend. Usually, git commit creates a new commit based on the current branch tip and a given commit log message, but with --amend it does something different: it takes a snapshot of the tree as usual, but it makes the commit be a descendant of the tip's parent commit. For example, assuming we didn't really change anything with the rebase above, our history and recent change is now:
$ git log --oneline
6ebd1fb blah
18bc0e6 even more stuff
b7d3c41 more stuff
87f1f02 initial content
$ git log --oneline -p -1     # -p => show patch, -1 => only one
6ebd1fb blah
diff --git a/blah b/blah
--- /dev/null
+++ b/blah
@@ -0,0 +1 @@
+blah blah                    # this is the recent change
This is obviously wrong — we need to have three “blah”s there. With subversion we would now need to perform an “oops, forgot a blah” commit, and in fact, we would need to do the same with git if we push these change out now. But as long as we didn't, we can fix it without an additional commit using --amend:
$ echo "blah blah blah" > blah
$ git add blah
$ git ci --amend -m "blah^3"
[master 5cf863d] blah^3       # the re-made commit
$ git log --oneline
5cf863d blah^3
18bc0e6 even more stuff
b7d3c41 more stuff
87f1f02 initial content
As you can see, the last commit is gone (remember that it is still backed up, in case of problems), and there is a completely new commit instead. Usually, --amend is used without -m — the log message editor will be initiated with the previous log message, so you can edit it instead of rewriting it from scratch. If there were no modifications to be committed, a git commit --amend is a convenient way to edit the last commit message only. If you leave the message untouched, a new commit will still be made — one with a new commit time; and if you delete the text completely, the re-commit will be aborted, and you will be left with the old one intact.

Working with git: Resetting the tree

Both the commit --amend feature and rebasing build on the ability to “move” the current branch tip to some earlier commit in its history. To do this directly, git provides a git reset command, which can move the current branch tip to a specified commit, and adjust the working directory and/or the staging area accordingly. For example, for the --amend functionality, you will use HEAD^ to move the branch tip to its parent commit. You can of course specify any other commit to move to, and since git branches are effectively short bookmarks, you can create branches to be able to move to them later on (or as targets for rebasing, merging, etc). In addition, you can use HEAD (or just omit the target, since HEAD is the default) to only change the working directory (or staging area).
The reset command has three major modes for its work, specified with a flag. (See the git-reset man page for a more thorough explanation, with lots of usage examples, some have evolved into their own functionality — like the --amend feature.) Using HEAD^ as the target commit, here are some summaries of how it can be used:
  • git reset --hard HEAD^
    This will move the branch tip to the previous commit, and will change the working tree and the staging area to match. Translation: completely forget the last commit and any work in the working directory.
  • git reset --soft HEAD^
    Moves the branch tip, but does not change the working directory or the staging area. Translation: undo the last commit, and leave your working directory in a state where git commit will get the same change in. (Note: not the git ci script — this will add changes in the working directory, if any.)
  • git reset --mixed HEAD^
    Moves the branch tip and the staging area to the parent commit, and any modifications done by the commit that are going to be lost are put in your working directory. Translation: similar to the --soft version, except that the staging area is cleared, so to recommit the changes you will need to add files again (or use git ci as usual).
    (Note: This is the default mode.)
When using HEAD (which is the also default when nothing is mentioned), the branch tip is not moved, and we get:
  • git reset --hard
    Get rid of all changes to the working directory and the staging area. Translation: lose all work that was not committed, getting back to the content on the branch (a convenient way to do something similar to an svn revert -R . in the root of a subversion working directory).
  • git reset --mixed
    Get rid of all changes to the staging area, leaving your working directory intact. Translation: lose everything that was added to the staging area. If you're avoiding it (for example, if you only use the git ci script), then this would be a no-op other than new files that were git added.
  • (git reset --soft is a no-op.)
When git reset changes the HEAD, it creates another toplevel reference name called ORIG_HEAD that points to the previous commit that HEAD pointed at, so if you happen to git reset --hard HEAD^ by mistake, you can immediately get back to it with git reset --hard ORIG_HEAD (but changes in the working tree would still be lost). (git merge is another command that changes the HEAD, and it also saves the previous value in ORIG_HEAD.) Finally, note that git reset can be restricted to make it work only on a specific set of paths, not on the whole repository.

Working with git: Other forms of reverting

While we're on the topic of reverting files, there are three more things worth mentioning:
  • git checkout -- path ...
    When git checkout is given some paths, it will only check out the relevant files from their state in the staging area. This is a more popular way to revert changes to a specific file. If you avoid using the staging area, then this is roughly the same as using reset with the --hard flag on the paths, since your staging area will usually be the same as your HEAD. (Note that the -- is optional, and needed only when a path name can be confused with a branch name.)
    As with reset, you can also specify a branch to check the path(s) from — which is useful to try some files from a different branch selectively. However, note that unlike subversion, git does not remember the association of the branch and the paths that were checked out of it (the branch is not “sticky”) — the files will simply be considered as modified (and they will not be updated when the branch is, unless you do the same checkout).
  • git show HEAD:path
    This shows the file as it exists in the HEAD, making it useful to inspect the file before you made some additional modifications (similar to svn cat path). You can also omit the HEAD — using :path will show the file in the staging area, which will usually be the same as the HEAD. One caveat to note here is that the path should be the full path relative to the repository root. (Note: I have a wrapper git cat script that emulates svn cat, I'll add it if anyone wants.)
  • git revert commit
    The git revert command is used to revert the changes introduced by the given commit. It will basically apply the change in that commit in reverse, then ask you for a log message for a new commit where the message is initially populated with text indicating the commit that was applied in reverse.
    Note that this is very different from svn revert — it is more like
    svn merge -c-123; svn commit "Revert revision 123"
    Since this is a frequent source of confusion, the git-revert man page mentions it at the top, and it refers readers to git reset and git checkout as the way to do the equivalent of svn revert (which are described above.)

Working with git: Dealing with conflicts

We'll now see how to deal with merge conflicts. First, we'll set up the repository for a conflict. Continuing with the foo2 clone, we'll first create a file (which I'll do here using shell commands, to make it easy to play with), commit, and push the new history (which includes the blah work) back to the server. Note the use of git branch -v which shows the local master branch and the fact that there's two commits that we haven't pushed out yet.
$ echo "#lang racket"    > foo
$ echo "(define (foo x)" >> foo
$ echo "  (* x x))"      >> foo
$ git ci -m "turn foo into a library"
[master fd856ef] turn foo into a library
$ git branch -v
* master fd856ef [ahead 2] turn foo into a library
$ git push
To pltgit:eli/foo
   18bc0e6..fd856ef  master -> master
Now hop over to the foo clone, get the changes (the relevant bits of the output are shown), edit the file (using sed, to make it a command line), inspect the change, commit it, and push.
$ cd ../foo
$ git pull
From pltgit:eli/foo
   18bc0e6..fd856ef  master     -> origin/master
Updating 18bc0e6..fd856ef
 blah |    1 +
 foo  |    6 +++---
 2 files changed, 4 insertions(+), 3 deletions(-)
 create mode 100644 blah
$ sed -i '2s/x/[x 0]/' foo
$ git diff
diff --git a/foo b/foo
index 78d9889..b81de80 100644
--- a/foo
+++ b/foo
@@ -1,3 +1,3 @@
 #lang racket
-(define (foo x)
+(define (foo [x 0])
   (* x x))
$ git ci -m 'add a default value'
[master 5035c9a] add a default value
$ git push
To pltgit:eli/foo
   fd856ef..5035c9a  master -> master
And now get back to foo2, and before we pull, modify the same line by adding a comment and commit, then do a --ff-only pull and watch it refuse to merge as expected, then look at the history so far.
$ cd ../foo2
$ sed -i '2s/$/ ; int->int/' foo
$ git ci -m 'document the type of foo'
[master 21a78df] document the type of foo
$ git pull --ff-only
From pltgit:eli/foo
   fd856ef..5035c9a  master     -> origin/master
fatal: Not possible to fast-forward, aborting.
$ git log --graph --all --oneline -4
* 21a78df document the type of foo # ← our change
| * 5035c9a add a default value    # ← the conflicting change we pulled
* fd856ef turn foo into a library
* 5cf863d blah^3
Rebasing is the common thing to do, but let's see what happens with a plain merge first:
$ git merge origin
Auto-merging foo
CONFLICT (content): Merge conflict in foo
Automatic merge failed; fix conflicts and then commit the result.
We now have a conflict that needs to be resolved before we can finish the merge. Using git st (the alias listed above for the svn-like status that git status -s produces) shows a new UU status for foo — this indicates an “unmerged” (conflicted) file. To investigate further, we use a plain git status, which tells us that our history diverged from the remote (we already know that since pull --ff-only failed) and count the diverging commits, and it also tells us that foo is unmerged and hints at using git add to resolve it:
$ git st
UU foo
$ git status
# On branch master
# Your branch and 'origin/master' have diverged,
# and have 1 and 1 different commit(s) each, respectively.
# Unmerged paths:
#   (use "git add/rm <file>..." as appropriate to mark resolution)
#       both modified:      foo
You can also see that git knows about the conflict and refuses to do a commit:
$ git commit
fatal: 'commit' is not possible because you have unmerged files.
Please, fix them up in the work tree, and then use 'git add/rm <file>' as
appropriate to mark resolution and make a commit, or use 'git commit -a'.
In most cases the way to continue is simple: open the conflicted file in your editor, look for the conflict markers and fix the code. Then, as suggested above, use git add file which tells git that the file is resolved, and finally use git commit to commit the result. (Note that using git commit file will not work, which is why the git-ci script avoids adding a . if the tree requires resolving a merge.) I'll simulate the editing part with echos, and then mark it resolved:
$ echo "#lang racket"                   > foo
$ echo "(define (foo [x 0]) ; int->int" >> foo
$ echo "  (* x x))"                     >> foo
$ git add foo                 # ← tell git that it's resolved
And now the last step is to run git commit, which will start your editor to edit the log message — it will be populated by text that indicates the merge and the file that had conflicts, which you can commit as is, or add some text regarding the way it was resolved.
At this point (or before we started working on resolving the conflict), we can get back to the original state using reset:
$ git reset --hard
HEAD is now at 21a78df document the type of foo
This kind of reset is generally useful if you had some problematic conflict to resolve and you want to back up completely and re-try. But now that we've at the start, we will see what happens when we try to rebase with the conflict instead:
$ git rebase origin
First, rewinding head to replay your work on top of it...
Applying: document the type of foo
CONFLICT (content): Merge conflict in foo
Failed to merge in the changes.
When you have resolved this problem run "git rebase --continue".
If you would prefer to skip this patch, instead run "git rebase --skip".
To restore the original branch and stop rebasing run "git rebase --abort".
Obviously, we get a different message (note that git status will now tell you that you're not currently on any branch — a result of being in the middle of a rebase). The process that follows is very similar to the merge case: edit the conflict away, then git add the file. There are two differences: (1) after you git add the resolved files, you should use git rebase --continue instead of committing[*]; (2) if you want to abort the merge, use git rebase --abort instead of using reset. ([*] If you did commit, then it means that you wrote a new log message for the replayed commit, and you can just as well use the --skip flag so rebasing continues with the rest, or you can use reset to undo your commit and let rebase do it for you.)
When you're in a conflicted state, there are a few git tools that help you in the resolution work. The first useful utility is git diff: when there's a conflict, all files that were automatically merged are already going to be in your staging area, and parts of conflicted files that could be merged merged will be there too. This leaves only the conflict regions in your working directory, which means that git diff will show you only the conflicts (since by default it shows differences between the working directory and the staging area). Also, the diff output itself is not a standard one. At the current point of conflict during the rebase that we started, this is what we'll see:
$ git diff
diff --cc foo
index b81de80,86a4c54..0000000
--- a/foo
+++ b/foo
@@@ -1,3 -1,3 +1,7 @@@
  #lang racket
++<<<<<<< HEAD
 +(define (foo [x 0])
+ (define (foo x) ; int->int
++>>>>>>> document the type of foo
    (* x x))
The diff header uses --cc which indicates git's “combined diff format”, used to represent merge commits (any commit with more than one parent). The next line has the two SHA1s of the two files that are merged. The diff itself starts with three @s, and instead of a single indicator character (+, -, or ), there are two — indicating a three-way diff between the two versions and their common ancestor version. In the above you can see that the line with the optional argument is coming from HEAD, and the type-annotated one is coming from its commit. You might notice that this look backwards, since we're in the repository where we committed the type annotation to the HEAD — but we're now rebasing, which means that we start from the remote branch and merge our local changes into it, essentially making the rebase perform merges in the other way than plain merges. The conflict markers themselves are marked as new in both versions, and the labels that follow them depend on available information (in a merge, we would see HEAD and origin).
During a conflict resolution, the staging area actually holds three versions of each file: the common ancestor, our version, and the merged version. These things are called “file stages”, and they can be accessed using a special syntax:
$ git show :1:foo     # the common ancestor of both versions
$ git show :2:foo     # our version (optional argument)
$ git show :3:foo     # merged version (type-annotated)
(Again, remember that this is a rebase, so the last two are swapped.) You can also checkout one of these versions using git checkout foo, giving it an --ours or --theirs flag to specify which version you want to use; and you can use git diff to compare against them. For example, we resolve the file (as above) and then try the different diffs (before we mark it as resolved) — these examples only show the changed lines from each of the diffs:
$ echo "#lang racket"                   > foo
$ echo "(define (foo [x 0]) ; int->int" >> foo
$ echo "  (* x x))"                     >> foo
$ git diff -1 foo               # can also use --base
-(define (foo x)                # original version
+(define (foo [x 0]) ; int->int # new version
$ git diff -2 foo               # can also use --ours
-(define (foo [x 0])
+(define (foo [x 0]) ; int->int
$ git diff -3 foo               # can also use --theirs
-(define (foo x) ; int->int
+(define (foo [x 0]) ; int->int
Finally, git log and gitk accept a --merge flag which shows commits relevant to a merge. With git log the --left-right flag is useful here, since you'll see which side the relevant commits are on. (But this works only in git merge, not in rebasing.)
Again, when you're happy with the resolution, you git add the file, and because we're doing a rebase rather than a merge, use use it to continue:
$ git add foo
$ git rebase --continue
Applying: document the type of foo
$ git log --graph --all --oneline -4
...linear history...
Note that git rebase --continue did the commit of the resolved content for you, and it used the previous commit message you've written. This is a good rule-of-thumb for deciding whether you should rebase or merge: if the commit message are still fine as a description of the modifications, then a rebase is fine; otherwise you might want to merge instead.

Working with git: Copying/renaming files

Git is, by design, tracking snapshots of the complete repository tree. Specifically, it does not keep explicit track of file/directory copies and renames. Instead, it provides ways to infer such changes in the repository based on the content. As a result of this, there are almost no git commands that deal with file movements:
  • There is no git copy command: you just copy the file and add the new one as usual.
  • There is a git rm command, but its purpose is mostly to remove a file from the staging area. You could also just remove the file outside of git, and then use either git commit removed-file or git commit containing-directory to remove it (or using the above script — git ci in the same directory). git rm will delete the file from the staging area so you can do a plain git commit without naming any paths.
  • For the same reason, there is a git mv command — it uses git rm as above to update the staging area, and if you're fine with ignoring it, then you can just rename the file outside of git, and git add the new version — but as we will soon see, it's really best to use git mv to avoid the possible confusion if you want the file's history to be visible.
To try things out, let's properly name the foo library:
$ mv foo foo.rkt
$ git st
 D foo
?? foo.rkt
As you can see, we forgot to git add the new file, so if we commit now we'll only be committing the deletion. An important thing to note here is that when git infers file copying and renaming, it does so only when the operations appear in a single commit. So if we commit this change and later commit a new version with the new file will make it lose connection to its history. As long as you didn't push the new commits out, you can still fix it: simply use git rebase --interactive, and squash the file addition together with the deletion. But let's start over and do the rename the easy way:
$ rm foo.rkt
$ git reset --hard
$ git mv foo foo.rkt
$ git ci -m "properly name the foo library"
to see this commit, we can use git show (which can show arbitrary objects, but with no arguments it shows the HEAD). git diff can also be used to show only the diff part — using the HEAD^! syntax that roughly means the range from the previous HEAD to the current one:
$ git show
...log message...
$ git diff HEAD^!
$ git diff --stat HEAD^!      # shows an overview of the changes
 foo     |    3 ---
 foo.rkt |    3 +++
 2 files changed, 3 insertions(+), 3 deletions(-)
$ git log --oneline foo.rkt
599b3b6 properly name the foo library
All of these show the two operations as disconnected, and the log doesn't show any of the prior history. The thing is that you need to ask git to look for file operations, and the -M and -C flags do that. In addition, git log needs a --follow flag to make it follow history beyond renames (but note that it can do that only when given a single file path). For example:
$ git diff -M --stat HEAD^!
 foo => foo.rkt |    0
 1 files changed, 0 insertions(+), 0 deletions(-)
$ git log --oneline --follow foo.rkt
599b3b6 properly name the foo library
0fb8291 document the type of foo
5035c9a add a default value
In this case the rename was a trivial one as were no other changes. This makes it especially easy to find renames since the SHA1 of the file would be the same. But git considers such operations as renames as long as they're “similar enough” — for example, if you just rename some files and change some requires as a result, it will be detected as renames. (The usual claim is that when the content is not similar enough, you can just as well claim that the file is new.) If you think that you might be doing too many changes to some files, and you want to preserve the connection, you can do only the rename in one commit, and then the modifications in the next.
An added benefit of this mode of work is that git blame can find lines in files that were copied from other files, and deal naturally with a file that is split into two files etc. Like log and diff, it needs some flags to do the extra work (see -M and -C).

Working with git: Managing branches

As seen in various places above, a branch in git is basically just a SHA1 pointer to a commit (and therefore to the whole line of commits in its development line), with a naming hierarchy that follows some conventions (/-separated, master as the main one, remotes prefix for remote branches, origin for the default remote server name, etc). You can see all of this in the toplevel .git meta directory — there is a HEAD file which represents the head, its content will be a line that looks like ref: refs/heads/master, and there will be a refs/heads/master file with a content that is the actual SHA1. There are, of course, various other bits of meta-data, so it's not a good idea to change such files directly (for example, when there are many names git will create a “packed” reference file with many references for efficiency) — but overall this is the basic idea.
Branches come in two main kinds: local branches and remote ones, with remote branches having a name that begins with remotes/origin/. (Later we'll see how to add new remote repositories — remote branches from these will have names that start with remotes/remote-name/ instead.) The difference between the two is that a remote branch is a way to mirror a branch on a remote repository — it is not intended for local work. For example, if you try to check out a remote branch, git will check out a “detached head” (details on this below). If you do that, you'll see that the HEAD file will have an explicit SHA1 rather than the usual ref: branch-name.
The git branch command is the main way to manage branches. With no flags, it will just print out the list of local branches, marking the current branch with a *. You can add flags to show remote branches instead (-r), both kinds (-a), and also to list more information on the branches (-v):
$ git branch
* master
$ git branch -r
  origin/HEAD -> origin/master # (this one is symbolic too)
$ git branch -av
* master                599b3b6 [ahead 2] properly name the foo library
  remotes/origin/HEAD   -> origin/master
  remotes/origin/master 5035c9a add a default value
When given a single name argument, a branch by that name will be created, and it will point to where the HEAD currently points to; a second argument can be a name of an existing branch (or any commit) that the new branch will start at. In addition to creating branches starting from the current head, this can be useful in creating branches that start from elsewhere, even from a “detached head”. For example, say that in our current repository we want to try out some work based on the state of things before the last commit (which renamed the foo file). We can check out HEAD^ (which will lead to a detached HEAD), and then create a branch for it:
$ git checkout HEAD^
You are in 'detached HEAD' state.
HEAD is now at 0fb8291... document the type of foo
$ cat .git/HEAD
0fb8291...                     # doesn't point to a branch
$ git branch
* (no branch)                  # you can see it here too
$ git status
# Not currently on any branch. # and here
$ git branch pre-rename        # create a branch here
$ git branch
* (no branch)                  # we're still detached
$ git checkout pre-rename
Switched to branch 'pre-rename'
As you can see, creating a branch doesn't check it out — even when the new branch is exactly where we already are. The difference is related to the nature of HEAD: it is usually an indirect reference to a branch name, and when a commit is made, the branch that HEAD points to is updated. But when we are using a detached HEAD, it points directly at a SHA1 — committing in this state will work, and the HEAD will point at the newly made commit — but there will be no branch that will be updated, so if you checkout a different branch (or a different commit) now, the commits you made are “lost”.
The main reason that such commits will be lost is that git branches don't live inside the repository store — and dealing with branches is not something that gets recorded as part of the history. To make things safer, git maintains something that is known as the “reflog”, which keeps track of where your branches have been — those are kept for a while (usually around a month), which means that you can easily go back to a previous commit if it seems that you lost one (eg, as a result of committing on a detached HEAD). (You can see these files in the .git/logs directory.)
Since creating a new branch and checking it out is a common combination, the checkout command can create a branch before checking it out. Use the -b flag for this:
$ git checkout -b also-pre-rename
Switched to a new branch 'also-pre-rename'
$ git checkout -b post-rename master
Switched to a new branch 'post-rename'
$ ls
bar  blah  foo.rkt
$ echo "one more line" >> foo.rkt
$ git ci -m "one more line"
Finally, you use the -d flag to delete branches.
$ git branch -d post-rename   # won't allow it
error: Cannot delete the branch 'post-rename' which you are currently on.
$ git checkout master
Switched to branch 'master'
Your branch is ahead of 'origin/master' by 2 commits.
$ git branch -d post-rename
error: The branch 'post-rename' is not fully merged.
If you are sure you want to delete it, run 'git branch -D post-rename'.
As you can see, git refuses to delete a branch that has unmerged work, since this can lead to losing that unmerged work — so you need to use -D for that. In addition, you usually don't delete remote branches, when you do, you need to use the -r flag too.

Working with git: Using branches

Since git branches are so light weight, they fit any kind of parallel work you need to do on several different topics. A result of that is that it is possible to start a new branches for any work you'd want to do — and this is common enough that there's a name for such branches, they're called “topic branches”. Such branches are created from the master branch (usually) and worked on in parallel. At any point where you want to work on something new, you would create a new branch for it and switch to it (committing any work you might have on your current branch before you do so):
$ git checkout -b improve-bar master # switch to a fresh topic branch
Switched to a new branch 'improve-bar'
$ echo "even more bar" >> bar        # work there
$ git ci -m "improved bar"           # save that work
$ git checkout post-rename           # go back to where we were
If you need to commit changes before you create the new branch, you shouldn't have any problems doing so — because you can change where a branch points to, you can just commit whatever you have and then get back to it:
$ echo "another line" >> foo.rkt
# at this point you remember that you need to do something else in the
# `improve-bar' line of work.
$ git ci -m "checkpoint"
$ git checkout improve-bar
$ git checkout post-rename
$ git log --oneline -2
e9a4fcd checkpoint            # this is our temporary checkpoint commit
d92fb0a one more line
$ git reset HEAD^             # undo it
Unstaged changes after reset:
M       foo.rkt               # git tells us that this is now uncommitted
$ git st
 M foo.rkt                    # ... as does `status'
$ git log --oneline -2
d92fb0a one more line         # the temporary commit is gone
599b3b6 properly name the foo library
You can even decide on some convention to use in some cases, then create new git commands as scripts that will do the work for you. In this case, you could write a command that will do a “checkpoint” commit if needed, switch to another branch, and if the first commit there has only checkpoint as its log message, undo it as above. There are several git convenience commands that started out this way — in this case, checkout the git stash command which allows you to save the current work by pushing it on a “work in progress” stack, and later pop it back out (possibly on a different branch).
Earlier we've seen how to merge or rebase your master branch from the remote master branch, but the full story is that you can merge and rebase any two branches. This makes branches very flexible: you can create a branch A from an existing branch B, eventually merging/rebasing it back into A, or directly into master and dump A. At any point you can run gitk --all to see where things stand — in our current repository, this shows us that there are redundant pre-rename and also-pre-rename branches, that out master branch is two commits ahead of the remote one, and that we have improve-bar and post-rename branches with 1 and 2 commits over our master branch. If we're done with these two branches, we can now merge/rebase them to our master, or merge/rebase one to the other and the result to master, and then push everything out.
To make working with branches even easier, git has a notion of an “upstream branch” — this is a per-branch setting that tells git which branch the current one is based on. By default, any branch that is created with a remote branch as its starting point will have that remote branch set as its upstream. We've seen how git treats that information in various places so far: git status and git branch -v both use it, and using a second -v with the latter shows also the upstream branch:
$ git reset --hard            # dump the above uncommitted change
$ git checkout master
Switched to branch 'master'
Your branch is ahead of 'origin/master' by 2 commits.
$ git status
# On branch master
# Your branch is ahead of 'origin/master' by 2 commits.
$ git branch -v
* master    599b3b6 [ahead 2] properly name the foo library
$ git branch -vv
* master    599b3b6 [origin/master: ahead 2] properly name the foo library
In addition to that, we've seen the @{upstream} and @{u} notation that refers to the upstream branch, making it convenient to further examine pending changes that weren't incorporated upstream:
$ git log --oneline @{upstream}..
599b3b6 properly name the foo library
0fb8291 document the type of foo
And finally, git pull and git push know where to pull from and push to based on this setting. Overall, this is a very useful feature to have when you have many branches, therefore it is possible to use it between local branches too. There are two ways to do this: when a branch is created with either git branch B or git checkout -b B, you can use the --track flag to set up tracking to the initial branch it's based on.
$ git branch -t b1 master
Branch b1 set up to track local branch master.
$ git checkout -tb b2 master
Branch b2 set up to track local branch master.
Switched to a new branch 'b2'
(Note: if you're using checkout, then the --track flag should precede the -b flag, as done above.) If a branch already exists, you can use git branch --set-upstream to set the upstream information.
$ git branch --set-upstream post-rename
Branch post-rename set up to track local branch b2.
$ git branch --set-upstream improve-bar master
Branch improve-bar set up to track local branch master.
As seen here, if it is given just a branch name, the current branch is set as its upstream. git branch can also change the upstream branch, for example, if the above tracking of b2 was a mistake:
$ git branch --set-upstream post-rename master
Branch post-rename set up to track local branch master.
Either way, we can now see this information in the git commands that do so, as well as use @{upstream}:
$ git branch -vv
  b1          599b3b6 [master] properly name the foo library
* b2          599b3b6 [master] properly name the foo library
  improve-bar e60c168 [master: ahead 1] improved bar
  master      599b3b6 [origin/master: ahead 2] properly name the foo @;
  post-rename d92fb0a [master: ahead 1] one more line
$ git checkout improve-bar
Switched to branch 'improve-bar'
Your branch is ahead of 'master' by 1 commit.
$ git log --oneline @{upstream}..
e60c168 improved bar
In addition, we can use git pull to get changes on the upstream branch merged or rebased on the current one:
$ git pull
From .
 * branch            master     -> FETCH_HEAD
Already up-to-date.
Nothing actually happened here, because the current branch (improve-bar) already contains all of the commits on the master branch. You can see that this is a local pull since git says From ., which stands for “our own repository”. You can also do a push now, which will make the current additional commit (listed with @{upstream}..) appear on the master branch:
$ git push
To .
   599b3b6..e60c168  improve-bar -> master
Since the improve-bar line of development is unrelated to the one in post-rename, it is now one commit behind the master branch, and cannot be pushed as is:
$ git checkout post-rename
Switched to branch 'post-rename'
Your branch and 'master' have diverged,
and have 1 and 1 different commit(s) each, respectively.
$ git branch -vv
  improve-bar     e60c168 [master] improved bar
  master          e60c168 [origin/master: ahead 3] improved bar
* post-rename     d92fb0a [master: ahead 1, behind 1] one more line
$ git push
To .
 ! [rejected]        post-rename -> master (non-fast-forward)
error: failed to push some refs to '.'
To prevent you from losing history, non-fast-forward updates were rejected
Dealing with this is similar to dealing with updates on the remote server — for example, we can rebase the branch before pushing it:
$ git pull --rebase
From .
 * branch            master     -> FETCH_HEAD
First, rewinding head to replay your work on top of it...
Applying: one more line
$ git push
To .
   e60c168..7bdec0c  post-rename -> master
When you use git push to push changes when you have no upstream branch set, or when you push to a different branch than the one set, you can use --set-upstream to make git remember the push target as the upstream. Therefore, an easy way to create a new branch that tracks a possibly new remote branch by the same name is:
$ git checkout -b my-branch
Switched to a new branch 'my-branch'
$ git push origin my-branch --set-upstream
To pltgit:eli/foo
 * [new branch]      my-branch -> my-branch
Branch my-branch set up to track remote branch my-branch from origin.
And when you deal with remote branches this way, you might want to have a local branch that tracks a remote one with a different name. To do this, you use a syntax for the branch to push that specifies the local branch to push and the remote one to push to:
$ git push origin my-branch:different-branch --set-upstream
To pltgit:eli/foo
 * [new branch]      my-branch -> different-branch
Branch my-branch set up to track remote branch different-branch from origin.
Finally, note that git stores the upstream information in the repository-local configuration file. If we look at it now, we will see the various upstreams that we have set:
$ cat .git/config
[remote "origin"]
        fetch = +refs/heads/*:refs/remotes/origin/*
        url = pltgit:eli/foo
[branch "master"]
        remote = origin
        merge = refs/heads/master
this is the upstream that was made by default when we first checked out our clone, together with the information of where the origin repository is. Following that are the ones we've setup later:
[branch "b1"]
        remote = .
        merge = refs/heads/master
[branch "b2"]
        remote = .
        merge = refs/heads/master
[branch "post-rename"]
        remote = .
        merge = refs/heads/master
[branch "improve-bar"]
        remote = .
        merge = refs/heads/master
[branch "my-branch"]
        remote = origin
        merge = refs/heads/different-branch
Note that there are branches that track local branches (ones with a remote = . setting), and ones that track remote ones; and also note that the my-branch branch tracks a remote branch with a different name. Since the settings are stored as configurations, it is possible to inspect and change them using git config, or even edit the config file directly.
$ git config
$ git config

Working with git: Managing remotes

The distributed nature of git means that you can interact with multiple remote repositories. You could have work done with other people done locally, where people push/pull from each other's clones (possibly by sending around patches, as described below), and eventually when the changes are ready push them back to the main repository. You can even have your repository track multiple unrelated remote repositories, essentially giving you branches that have unrelated histories.
By default, when you clone a remote repository git names it origin — and that name appears in many places, most notably in remote branch names. As seen in the above config, git remember where the origin repository is via a configuration:
$ git config remote.origin.url
$ git config remote.origin.fetch
The first one is the url of the remote repository, and the second one is which branches we want to get from it. As with branches, you can use git config to change this information, or you can edit the file directly, but there is a command that does this more conveniently, keeping things consistent:
$ git remote                  # lists all known remotes
$ git remote -v               # remotes and push/pull specs
origin  pltgit:eli/foo (fetch)
origin  pltgit:eli/foo (push)
$ git remote show origin      # see a detailed description
* remote origin
  Fetch URL: pltgit:eli/foo
  Push  URL: pltgit:eli/foo
  HEAD branch: master
  Remote branches:
    different-branch tracked
    master           tracked
    my-branch        tracked
  Local branches configured for 'git pull':
    master    merges with remote master
    my-branch merges with remote different-branch
  Local refs configured for 'git push':
    master    pushes to master    (fast-forwardable)
    my-branch pushes to my-branch (up to date)
The git remote show variant will actually query the remote repository for its state (using git ls-remote) by default, and tell you when a local branch that tracks a remote one is out-of-date.
There are a few more sub-verbs for the git remote command which you can see on the git-remote man page, the most important one is for adding a remote: git remote add short-name url. This is especially convenient if you want to have a fork of the plt repository, with most interaction happening against it, but occasionally pull/push updates from/to the main repository.
Of course, remember that you don't need to add remotes to push and pull from them. You could do the same by explicitly specifying a url for the repository you want to interact with. For example, you could have repositories in different accounts on different machines, and synchronize your work between them by pushing and pulling directly from one repository to another. (Reminder: if you do this, then you're likely to have “checkpoint commits” — when you're done with the work, you can do an interactive rebase, and squash these checkpoint changes back into logical commit.) But if you do this often enough, you will likely find it more convenient to add a named remote.

Working with git: Using private repositories

A particularly useful use-case for adding a new remote is when you want to have private work done in your own fork of the plt repository. Such a mode of work is not strictly necessary — you could just do your work in your repository in a long-lived branch, but there are certain cases where working with a private repository on the server might be more convenient. For example, you might want to collaborate with someone else (that has access) via the server, or you might use a private fork of the plt repository as a central point for synchronizing work from clones on different filesystems as described at the end of the last section (the difference from that is that you basically use the PLT server as your synchronization point). Other than having the main repository reside on the PLT server, working with a private repository is not different than working with any other repository.
There are two facts that are worth reminding when you deal with a private repository. First, remember that creating a private fork is cheap: creating a new git clone of a repository will use hard links to the repository store object, most of which will be contained in large packed files. The cost in terms of space and time for creating a new clone is therefore minimal when done on the same filesystem — and using the gitolite fork command is doing just that. Please use the fork command to create a private clone — gitolite has a feature where it creates any repository that you refer to (as long as it has a name that you're allowed to create — starts with your username); this means that you could clone the main plt repository and push from it into a private repository that doesn't exist: it will be created, but such a copy will not share storage with the main repository — it will require a new copy, and it will be slow to create.
The second thing to remember is that due to the nature of the git store, any object, including commits, is stored exactly once. Since commits contain their parents, having a specific commit means that you have its complete history — therefore, pulling in any branch from any repository will always require getting only commits that you don't already have. As a result, pulling and pushing to/from any repository will be efficient and move around only those commits that are missing on the other side.
You can choose one of two basic approaches to working with a private fork: you can have the public repository cloned but have branches pushed to your private one, or you can have your private fork cloned and occasionally push updates to the public one. A way to use these two approaches are described and explained now. These examples use the play repository as an example (which you are encouraged to experiment with). Note that you can use a hybrid approach: you can think about a repository as a container for commit histories, pushing and pulling from any other repository, including the copy you're working with, the main plt repository, or a private fork (your own or another). Note also that since forks are cheap, you can keep several of them around, for example, you can have a fork for each long-lived branch — it's up to you to settle on a layout that is convenient for your work.

Using a clone of the public repository, pushing branches to your private one:

  1. Setting up:
    • Create a fork:
      ssh pltgit fork play $user/play
    • Get a local copy of the main repository:
      git clone pltgit:play
      (or continue working in an existing one)
    • Set up a convenient name for your private repository:
      cd play
      git remote add my-fork pltgit:$user/play
  2. To start working on a private branch, create one, and push it to your private repository:
    git checkout -b my-branch
    Then use push to create this branch in your private repository, with --set-upstream so git will remember this setting:
    git push --set-upstream my-fork my-branch
    You can also have your branch named differently in your fork, for example:
    git push --set-upstream my-fork my-branch:master
    will save your branch as the master branch in your fork. This might be convenient if you want to clone your private repository elsewhere and work only on this branch.
  3. You can now work as usual in your repository, pushing/pulling changes to/from the master branch will go to the public repository, and doing so from my-branch will go to your private fork. You can merge changes on the master branch to your private one, or rebase your branch onto it. However, note that the server will not allow pushing a rebased history to your clone. (More details at the end of this section.) You can bypass that by pushing to a new branch while keeping your local branch name:
    git push --set-upstream my-fork my-branch:my-branch-2
  4. When you're done merge your branch (possibly rebasing it first) to the master branch, and push as usual.
  5. If you want to delete branches on your fork (either because you pushed a rebased version under a new name, or because you're done with that line of work), use
    git push my-fork :my-branch
    Using an empty branch name for the local branch that you push is the way to delete remote branches. (As with local branches, this might lead to losing commits, so be careful. As always, git has a few safety mechanisms in place, so even if did this by mistake, it is very likely recoverable.)

Using a clone of your private repository, pushing changes to the public one:

  1. Setting up:
    • Create a fork:
      ssh pltgit fork play $user/play
    • Get a local copy:
      git clone pltgit:$user/play
    • Set up a convenient name for the main repository:
      cd play
      git remote add -t master main pltgit:play
      (-t master tells git to have only the master branch retrieved.)
  2. Now you can work in this repository as usual — edit, commit, push, etc.
  3. To push changes to the main repository, first make sure that you're on the branch with the changes that you want to push, and then:
    • Get the recent tree from the main repository
      git fetch main
    • Rebase or merge your changes with this:
      git rebase main/master
      git merge main/master
    • Push the changes back:
      git push main
    Note that rebasing your branch on top of main/master means that it will be rewritten, which means that you will not be able to push your branch back to your clone. This is because rewritten histories are currently forbidden by the configuration, but this will probably change in the future. Still, even if the server would allow pushing a rebased history it (you will need to use -f to force such a push), you would need to deal with the rebased branch in other clones you might have. Because of this, a rebase is fine if you're done with the work that you're pushing, otherwise, a merge is more convenient.

Collaborating with others

Git makes it very easy to collaborate with anyone, anywhere. You should think about repositories as being parts a network which can be synced in any topology that is convenient for you. In the case of the PLT repository, the main repository on the git server is the central point where the official repository lives, and people who can push are directly syncing content into it. People who cannot push directly do so through someone who can, by sending out patches or “pull requests”. The same applies for any repository, of course, including private repositories, even ones that you maintain yourself independently of the PLT server.
In the case of a patch-based workflow, the two sides that are involved are the patch author, and the receiver that integrates it into his/her own clone (and from there it goes to the main repository as usual). The work that each side does is described in the next two subsections.
Following that there is a description for making your repository public, which you will need if you're working on a private repository, but it is also useful for your collaborator to do so you can use a pull-request workflow. In this mode there is no need to email patches; instead, both people make their repositories readable to each other, and when some work is ready on one person's repository, the other pulls the commits. This is described in the last subsection.

Collaborating with others: Patch-based workflow
— instructions for the patch sender side

Executive summary:

  1. Work in a plt repository clone (possibly in your own branch)
  2. $ git send-email origin/master
  3. You're done — thanks!
  4. When the patch is applied, you will get the changes through origin/master, so if you worked on your master branch, make sure to use git pull --rebase which will notice that your changes were applied; if you worked on a branch, then you can now delete it (the commit objects will be different from the ones you've made).

Longer version:

  1. Work & commit as usual. In general, it is a good idea to use Signed-off-by: Your Name <your@email> in commit messages, which is a conventional way to declare that you agree for your work to be released as part of the PLT project, under the terms of the LGPL. git commit will add that for you if you use the -s flag. You can also make git do this later, when you send the patches out.
  2. Make sure that you're working with a relatively recent clone, and that you're on the branch where you did your work. In most cases, this would be the master branch, but you can do your work in your own branch too, of course.
  3. Verify that your commits are all in your history. You can see the commits that you have over the plt history with
    git log --oneline origin/master..
    these are the commits that you're going to send over now. (You can use the usual git toolset to tweak them further, or specify only some commits, etc.)
    A relevant point to consider here is that git takes the first paragraph of each commit message as a subject line. When sending out a patch, this is made concrete by actually using it as the emails's subject — so it is a good idea to make sure that this log looks fine, since the --oneline option will make it show those subjects.
    Obviously, you should also make sure that the commits have clear descriptions of your work. People who in the core group often have some general context that they are aware of, so some commit messages can be cryptic or even worse (eg, you might find . as a commit message) — don't mimic this... As a more occasional contributor, you should explain your work in more details. (There's no policy on commit messages, but you do need to go through some person on the team.)
  4. At this point you should decide how to send your patches. Emailing them is be the most convenient way to do this — to do this, you would use the send-email command:
    git send-email origin/master..
    or if you send only some commits, use a different specification. To make things even easier, a single commit specification is considered as the starting point and all of the following commits (up to your branch's tip) will be included in the emails (in contrast to other git commands like log, where a single commit name is considered as the set of commits leading up to it) — so you can do this:
    git send-email origin/master
    This will ask you a bunch of questions — it's easy to answer but you can also specify them as command-line options. if you intend to do this frequently it might be a good idea to make it easier with some settings in your global .gitconfig file. For example, I have these settings:
      from = Eli Barzilay <>
      bcc =
      suppresscc = self
    and you can see more in the git-config and git-send-email man pages. The address to send the patches to is also configurable — you can use something like
    to =
    to =
    depending on who you send your patches to — but this is better done as a repository-local configuration option (or just use the --to flag).
    You can add a -s flag to the command, to make git add Signed-off-by lines to commit messages. (See above for what this means.)
    If you want to send the files in some other way (eg, send them all packaged in an archive as attachment[*]), then just use format-patch instead of send-email — git will create a number of patch files in your current directory, which will be named NNNN-text.patch where the text is made out of the subject lines of the commit messages (the first line). You can even run
    git format-patch origin/master --stdout > my-patch
    to concatenate them all and send the resulting file over.
    [*] Note that doing this means that it is not as easy to read your patch, so avoid doing this if you want to make it easier to read and accept it. On the other hand, if you're working with someone specific, they might prefer attachments (for example, it's easier to save the attached file from gmail).
  5. Once the commits have been pushed to the main repository, you would get them when you pull to update. The commits will now be different objects than the ones you have — since the information changed (at least the committer information will be different, the log message might have been edited, etc). If you made your commits on your master branch which is set to track the plt master branch (the usual setup), then make sure that you run git pull --rebase to update — this will identify the commits as already included and will not include them in the rebased master. But if you made your commits in a private branch, and assuming that you didn't do any additional work there, then you can now just delete that branch. (If you did do more work there, then you should rebase it, to avoid resending the same patches again.)

Collaborating with others: Patch-based workflow
— instructions for the patch receiver side

Accepting patches that were sent via email (on any other way), is also simple. The command to do this is git am, which expects an argument that is a mailbox file holding the patch emails, or you could run it and pipe a patch email into it.

Executive summary:

  1. $ git checkout master
  2. Save (unmodified) patch emails into a mail folder file.
  3. $ git am -3 the-mail-folder
  4. Push the changes up to the server

Longer version:

  1. While you will not be author of the commits, you will be their committed, so you should of course be aware of the changes, and be willing to maintain the new code and other work that is implied. So the first step that you should do is review the patch and make sure that you are willing to accept responsibility for it.
  2. Save the patch emails to a mail folder (usually a file). You must take care to save the emails as is, including the date, author, and subject headers, and avoiding text that could have been butchered by your email client. For example, if you're using gmail, then use the “show original” option to view the raw email text, and save that to a file (even in this format gmail will have a first line with a bunch of spaces — it's best to remove that). Otherwise, gmail does things like wrap lines, replaces spaces by non-breaking spaces, or remove spaces. Alternatively, extract patch files from an archive if that's what you received, or save a single attachment file etc.
  3. In your repository clone, make sure that you're on the branch that you want to integrate the changes into. You could do this in your master branch, or in a new topic branch (especially if there is more than one patch).
  4. Run git am -3 mail-folder (am stands for “apply-mail”) with the mail folder that you've created above. It will apply the patches and commit them one by one. Like git rebase, if there are conflicts the process will stop so you can resolve it — and then run the am command with --continue, or --skip this commit and continue with the rest, or --abort to go back to the start. The -3 flags tells git that if a conflicting patch comes from the above format-patch, and it specifies files that we have, then try a 3-way merge — this will make things generally better (and it can identify more patches that were applied, instead of showing them as conflicts).
    You can also use an -i flag to the command to get an interactive version — for each commit it will ask you what to do with it, and let you edit the log message.
  5. Finally push the commits as usual.

Collaborating with others: Making a private repository publicly available

If you're working with “outside people” (people with no accounts on the PLT server, and no direct file-system access etc) on a private repository, you will need to find some way to make your repository available for cloning. An easy way to do so is to put it on a filesystem that those people can access — eg, if you're all in the same department. Another easy way to make a repository available is to find a hosting service like github and others — there are many options here, some are free but limited, and some cost money; if you prefer this easy solution, keep in mind that you can pay for the duration of the collaboration and at the end you can simply keep your repository clone to yourself (eg, if you're working on a paper then there's no need to pay once all work is done).
But if you want to do it yourself, the quickest and most convenient way to make a repository public is to put it in a directory that is available on the web. Such repositories can be cloned directly from the URL the repository is available at — there's no need to setup a server in a special way, and no need to run cgi scripts.

Executive summary:

  1. $ git clone --bare your-repo ~/public_html/repo.git
  2. $ cd ~/public_html/repo.git/hooks;
    mv post-update.sample post-update;
    chmod +x post-update
  3. $ git remote add public ~/public_html/repo.git
  4. Tell people to clone from http://some.where/~you/repo.git
  5. Work, apply email patches, and: git push public

Longer version:

  1. Make a “bare” repository — this is a repository that has no working directory:
    git clone --bare your-repo repo.git
    This will create a repo.git directory holding the bare repository. You should use some path in a directory where you have web pages published.
  2. The URL where the directory is found at is what other people should use when cloning.
  3. You can now push to this repository, and other people will see it too. To make things easier, you can set a remote name for this repository, so it's easy for you to push changes to it.
    git remote add public ~/public_html/repo.git
    And now you can use git push public. (You can also pull from it, but since you're going to be the only one who pushes into it, that will not be necessary.)
  4. One thing to be aware of is that while a repository can be published through HTTP this way, git considers that a “dumb protocol” (because there is no proper interaction between the two sides). To still make cloning possible, you will need to maintain some meta-files that hold entry points to the objects in your repository — to get this, run:
    git update-server-info
    You need to run this after each update to the repository — and to automate this you can have a hook do it for you. In the bare repository you will find a hooks directory with a file called post-update.sample — simply rename this file to post-update, and make it executable with chmod +x post-update. From now on every push to this repository will run the hook and keep the meta files updated.

Collaborating with others: Pull-request workflow

A possibly easier way for people to contribute work is to make their repositories available somehow. In the case of a private repository, the two sides can be in a shared file system, with read permissions for each other; or achieved as described in the previous subsection. In the case of contributing to the plt repository, the contributor can maintain a public fork of the plt repository (eg, by forking the plt github mirror at directly on github).
In this workflow there is no need to mail patches — instead, the receiver simply pulls them directly from the sender's repository. For example, someone tells you that they have some new commits in a foo branch of their repository. Since this is a repository that you can access, and since it shares history with yours, you can just pull that branch in, for example:
git checkout -b someones-work
git pull someones-repository-url
Note that the pull will merge the changes, creating a merge commit if your master branch cannot be fast-forwarded. To avoid this, you can use fetch instead:
git checkout -b someones-work
git fetch someones-repository-url
Either way, this fetches the remote repository's HEAD. You can create the branch in a single fetch command by specifying the remote branch name, and the local branch to fetch into, for example:
git fetch someones-repository-url master:someone
If you expect to do this often (eg, you're going to suggest fixes for the work and get new work in), then you can add a someone remote to be used more conveniently:
git remote add someone someones-repository-url
git fetch someone
git checkout -b some-branch someone/some-branch
possibly using -t to make the branch track the remote one:
git checkout -tb some-branch someone/some-branch
Note that there is no need to create a branch before the fetch, since it will be fetched to a remotes/someone/master branch.
Once you pulled in the branch, you can inspect the changes, merge them, rebase them, etc. The important point here is that you have a copy of the contributed line of work, which you can use with the usual git toolset.
When/if you're happy with the changes, you can simply integrate them to your master branch, and if this is in a clone of the plt repository, then at this point you can simply push these commits to the main server. Once that happens, the contributor can update their own clone, and continue working as usual.
Git has a tool that makes this mode of work a little more organized and robust for the contributor: git request-pull. This simple command (surprisingly, it has no flags) expects a commit that marks the start of the new work (actually, the last one before it, eg, origin/master), and the url of the repository. For example:
git request-pull origin git://
Of course, the contributor doesn't have to work directly in the available repository — in the case of github or with an over-the-web setup like the one described in the previous subsection the public repository is a bare one, and no work can be done directly on it. So what actually happens is: the contributor works on his/her own repository, pushes changes to the public one, and then requests a pull.
The request-pull command will therefore check that the new commits are indeed available at that location, and find out the branch that they're on (in case it's different than the branch that someone is working on). It then prints out a “pull request” text with a description of the changes, the url that was specified, the branch name with the new work, and a summary of the files that were changed. In short, all the relevant information is there, and it even verified that the commits are indeed available — merging them in is now easy.
(As a sidenote, you can use . as the url: git request-pull origin ., and get a condensed summary of your changes.)

Collaborating with others: Pull-request workflow
— recipe for the sender side

  1. Clone the plt repository and work with it as usual, commit your work
  2. Make your repository publicly available
  3. $ git request-pull origin your-repository-url
  4. Send the resulting text to
  5. You're done — thanks!

Alternatively, you can fork the plt repository on github:, commit, then do a pull request. Note: it is better to send a note about your pull request to, or you can do the pull request directly with git as listed above (using github to have a public repository).

Collaborating with others: Pull-request workflow
— recipe for the receiver side

This recipe is for getting some remote work in as a one-time job. If you need to cooperate more closely with someone, you will want to add the remote repository with git remote as shown above.

  1. Get a plt clone, or use your own (it's safe to do the latter, no need for a new clone unless you're paranoid):
    git clone pltgit:plt
    cd plt
  2. Get the foreign repository's master branch (or any other branch) into a local branch:
    git fetch remote-repository-url master:foo
    This pulls the master branch of the remote repository into a local foo branch (you can use other names, of course).
  3. Inspect the changes as usual
    git log    # new commits
    git diff  # changes
    git log -p # both
    (See above for more details on these.)
  4. If you're happy with the change and want to get it as-is, you can simply merge the branch:
    git merge foo
    But unless the remote work was done from the point your master points at (i.e., there were no new commits), this will generate a merge commit that might not be desired. To avoid it, you can rebase the branch against your master and then do the merge (which will now be a fast-forward) merge:
    git checkout foo
    git rebase master
    git checkout master
    git merge foo
  5. You no longer need the foo branch, so delete it with:
    git branch -d foo
  6. Push things back as usual

Collaborating with others: Merging github pull-requests

Github is popular enough that some people prefer to work with a github fork of the PLT repository, and then send a pull request. Merging these pull requests can be done as with any other repository, as explained in the previous section. However, with github there is an easy way to deal with it.
A pull request has a URL like which you can use in your browser to inspect the changes. To apply the changes locally, a convenient feature is that you can add a .patch suffix to every pull request URL which will have a text version of the patch. This means that applying the patch is particularly easy on the command line, for example:
curl | git am
will fetch the patch text and apply it (and you can now push as usual, or locally inspect the ptach and possibly edit it in the usual ways).

Additional Resources

Quick and short:
Basic description of what makes a git repository
Cheat sheets:
Quick reference thing, with links to the git man pages and the progit book
Really short
Explains some more
Short, intended for printing
subversion->git crash course
Nice summary of a few things, but too verbose or too advanced in some places, and also a little outdated.
The git community book. Also, there are a bunch of videos linked, and some tutorial links in the “Welcome” part.
A frequently recommended book. (Also some good blog entries.)
Another good book (a bit more verbose than the previous one)
The git tutorial, also available as the gittutorial man page.
Some github guides, well-organized by levels.
A kind of a collection of small tips; didn't change in a while though.
This is a short visual document about git. But it goes a little fast, so it would be useful after you're comfortable with the basics.