Version control

Version control is software that tracks modifications to a collection of files over time. The goal is not only to determine exactly who made which modifications, but also to have access to the full history of the project at any time.

This page is not meant to be a tutorial on any particular version control system; there are excellent tutorials available online, and your system's man pages are also worth consulting.

There are many version control systems (VCS) available today, and they can largely be categorized as either centralized or distributed version control systems. We will focus our attention on one particular distributed version control system (DVCS), Git.

General terminology

Several basic terms and operations come up when working with Git.

Repository

A collection of files under version control. You can think of a repository as a project.

Cloning

Making a (local) copy of a (remote) repository. If you see a Git repository online that you’d like to modify, you would clone it to your local machine first.

Checkout

Moving your repository contents to a point in the project history. If you want to go back in your project’s history or switch to another branch, you perform a checkout.

Commit (noun)

A checkpoint in the revision history: a record of the state of the repository at one point in time.

Commit (verb)

Creating a checkpoint in the revision history. If you have created a set of changes to a collection of files and want to mark this work in the history, you commit your changes.

Branch (noun)

A line of development within the repository. There can be many branches in a single repository. For example, a new feature might be developed on a feature branch, while a stable copy of the working project might live on the master branch.

Branch (verb)

Creating a new branch in the repository.

Pulling

Fetching changes from a remote repository and merging them into your local repository. Perform this action whenever you want to get changes from another repository.

Pushing

Publishing changes from your local repository to a remote repository.

Merging

Bringing changes from one repository or branch into a local branch.
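To make a few of these terms concrete, here is a minimal sketch of a toy in-memory repository in Python. It only illustrates how commit, branch, and checkout relate to one another; the class ToyRepository, its fields, and the commit ids c1, c2 are hypothetical names invented for this example and do not reflect how Git stores data.

```python
from dataclasses import dataclass, field


@dataclass
class ToyRepository:
    """Toy model only: hypothetical names, not Git's real data structures."""

    # Each commit id maps to (snapshot of files, list of parent ids, message).
    commits: dict = field(default_factory=dict)
    # Each branch name maps to the id of its most recent commit.
    branches: dict = field(default_factory=lambda: {"master": None})
    # Name of the branch that is currently checked out.
    current: str = "master"
    # The files you are editing: filename -> contents.
    files: dict = field(default_factory=dict)

    def commit(self, message):
        """Commit (verb): record a checkpoint of the current files."""
        parent = self.branches[self.current]
        commit_id = f"c{len(self.commits) + 1}"
        parents = [] if parent is None else [parent]
        self.commits[commit_id] = (dict(self.files), parents, message)
        self.branches[self.current] = commit_id
        return commit_id

    def branch(self, name):
        """Branch (verb): start a new line of development at the current commit."""
        self.branches[name] = self.branches[self.current]

    def checkout(self, name):
        """Checkout: switch branches and restore that branch's latest snapshot."""
        self.current = name
        tip = self.branches[name]
        self.files = dict(self.commits[tip][0]) if tip is not None else {}


repo = ToyRepository()
repo.files["README"] = "hello"
repo.commit("Initial commit.")

repo.branch("feature")
repo.checkout("feature")
repo.files["feature.txt"] = "work in progress"
repo.commit("Start feature.")

# Switching back restores the files as they were on master.
repo.checkout("master")
print(repo.files)
print(repo.branches)
```

After the final checkout, the feature file is no longer among the current files, but it is still recorded in the commit on the feature branch.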

Version control as a graph

Version control systems have a very nice representation as a directed acyclic graph. In this graph, each node is a commit representing the full state of the repository at some point in the history. There is a root node representing the start of the project. Every other node is connected to each of its parents by an arc, and a node can have more than one parent (for example, a commit that merges two branches). An arc between two nodes indicates that the head of the arc is a parent of the tail of the arc. In this model, a branch is simply a selected node along with all nodes reachable from this node back to the root.
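The graph model can be sketched in a few lines of Python. The commit names below (root, a, b, c, merge) and the helper reachable are made up for illustration; the point is only that a branch, in this model, is whatever you can reach from a chosen node by following arcs to parents.

```python
# Hypothetical commit graph: each commit maps to the list of its parents.
# An arc runs from a commit (the tail) to each of its parents (the head).
parents = {
    "root": [],            # start of the project, no parents
    "a": ["root"],
    "b": ["a"],            # a commit on a feature line of development
    "c": ["a"],            # a commit on the main line of development
    "merge": ["c", "b"],   # a merge commit has two parents
}


def reachable(graph, start):
    """Return every commit reachable from start by following parent arcs."""
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node not in seen:
            seen.add(node)
            stack.extend(graph[node])
    return seen


# A branch is a selected node plus everything reachable back to the root.
feature_branch = reachable(parents, "b")
main_branch = reachable(parents, "merge")

print(sorted(feature_branch))  # ['a', 'b', 'root']
print(sorted(main_branch))     # ['a', 'b', 'c', 'merge', 'root']
```

Because the merge commit has both c and b as parents, everything on the feature line becomes part of the main branch's history once the merge exists.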

In Git, you can see your revision history as a graph with the command git log --graph. At the time of this writing, the command git log --graph --oneline produces the following output:

* 773d5b0 Updated ignored files
* cf70bf9 Missed the SConstruct file.
* 42b8781 Ignoring the build directory.
* 50ef5a7 Set up sphinx.
* 13602ed Initial commit.

(If you are interested, the full hash of the most recent commit in this history is 773d5b0a010d390d95cf4b22eafb5d1a5b5e0ad2; you can see the first 7 characters of this hash in the first line of the log.)
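As a small check of the relationship between the short and full hashes, the following Python sketch parses the log output shown above and compares its first entry against the full hash. The helper parse_oneline is a hypothetical name, and the parsing assumes a linear history like this one, where every line starts with a single "*" graph marker.

```python
# The log output and full hash are copied from the text above.
log_output = """\
* 773d5b0 Updated ignored files
* cf70bf9 Missed the SConstruct file.
* 42b8781 Ignoring the build directory.
* 50ef5a7 Set up sphinx.
* 13602ed Initial commit.
"""
full_hash = "773d5b0a010d390d95cf4b22eafb5d1a5b5e0ad2"


def parse_oneline(text):
    """Split each log line into (short hash, commit message)."""
    entries = []
    for line in text.splitlines():
        # Drop the leading graph marker ("* "), then split off the short hash.
        short_hash, message = line.lstrip("* ").split(" ", 1)
        entries.append((short_hash, message))
    return entries


entries = parse_oneline(log_output)
newest_short_hash = entries[0][0]

print(newest_short_hash)                        # 773d5b0
print(full_hash.startswith(newest_short_hash))  # True
```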

Workflows

There are many ways to use version control effectively. You can choose any one you like, but I would suggest the “feature branch workflow” as it works nicely for collaboration in small groups.