Build systems

At a high level, a build system is a tool that automates the process of constructing a program. More generally, build systems can be used to carry out any defined set of tasks with precedence constraints.

We can think of a build system as a program that computes a topological ordering of a dependency graph and then constructs the nodes of the graph in that order. Each node of the graph is some kind of resource, and each edge is a dependency or a build instruction. The goal is to ensure that all nodes in the graph are constructed properly. Whenever a node is modified, all nodes that depend on it (as defined by the dependency edges) must be reconstructed according to the rules (the build instruction edges).
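As a rough illustration of the ordering step, the following Python sketch uses the standard library's graphlib module (Python 3.9 or later) to order a small dependency graph. The node names (app, app.o, util.o, and so on) are hypothetical placeholders, and a real build system does considerably more than this.

```python
from graphlib import TopologicalSorter

# Map each node to the set of nodes it depends on (its prerequisites).
# All names here are hypothetical placeholders.
dependencies = {
    "app": {"app.o", "util.o"},
    "app.o": {"app.c", "util.h"},
    "util.o": {"util.c", "util.h"},
}


def build_order(graph):
    """Return the nodes so that every prerequisite comes before its dependents."""
    return list(TopologicalSorter(graph).static_order())


order = build_order(dependencies)
print(order)
```

The exact order of independent nodes (for example, app.c and util.c) is not significant; the only guarantee is that a node never appears before something it depends on.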

For example, consider a simple C program to compute the factorial of a number.

fact.h
#ifndef FACT_H
#define FACT_H

long long fact(int);

#endif
fact.c
#include "fact.h"

long long fact(int n)
{
  long long result = 1;
  while (n > 0) {
    result *= n;
    n--;
  }

  return result;
}
main.c
#include <stdio.h>
#include <stdlib.h>
#include "fact.h"

int main(int argc, char *argv[])
{
  if (argc == 1) {
    printf("usage: %s n1 [n2 [...]]", argv[0]);
    printf("\n\nCompute the factorial of the provided numbers\n");
    return 0;
  }

  for (int i = 1; i < argc; i++) {
    int n = atoi(argv[i]);
    printf("%lld ", fact(n));
  }
  printf("\n");
}

The steps to compile this program into an executable named main are:

gcc -c fact.c
gcc -c main.c
gcc -o main main.o fact.o

If we update any of the files, we need to re-run any commands that involve that file, or involve a file otherwise affected by that change.

Figure (process.png): The dependencies for our factorial program. An arrow from one node to another indicates that the head node must be updated if the tail node is updated. Round nodes represent files or programs, while square nodes represent commands used to create the nodes following them.
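To make the "what must be re-run" question concrete, here is a minimal Python sketch that walks the dependency graph of the factorial example and collects everything affected by a change to one file. The function name affected_by and the dictionary layout are assumptions made for illustration, not part of any build tool.

```python
from collections import deque

# Direct prerequisites of each generated file in the factorial example.
prerequisites = {
    "main.o": ["main.c", "fact.h"],
    "fact.o": ["fact.c", "fact.h"],
    "main": ["main.o", "fact.o"],
}


def affected_by(changed, prerequisites):
    """Return every target that directly or indirectly depends on `changed`."""
    # Invert the mapping: file -> targets that list it as a prerequisite.
    dependents = {}
    for target, deps in prerequisites.items():
        for dep in deps:
            dependents.setdefault(dep, []).append(target)

    # Breadth-first walk from the changed file through its dependents.
    affected = set()
    queue = deque([changed])
    while queue:
        node = queue.popleft()
        for target in dependents.get(node, []):
            if target not in affected:
                affected.add(target)
                queue.append(target)
    return affected


print(sorted(affected_by("fact.h", prerequisites)))
print(sorted(affected_by("main.c", prerequisites)))
```

Changing fact.h affects both object files and the executable, while changing main.c affects only main.o and main, so fact.o does not need to be recompiled in that case.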

This is a very simple program, but there are already many dependencies to remember. We could use a build system to automate the process of checking for updates and rebuilding the program.

Motivation

Build systems can save you time. A large C or C++ program might have a build process that involves compiling and linking dozens of source files; typing each of these compilation commands by hand would be tedious and error-prone. Using a build system avoids these problems.

Build systems help ensure your program is correctly built, and offer a blueprint to future users on how to properly assemble your program.

Build systems work exceptionally well with version control systems. If you check out an older version of your code (or merge in new changes from a contributor), you do not need to remember (or learn) the correct build steps to create the program; you simply run your build system.

Automating common tasks

In addition to helping compile programs, build systems can be used to automate nearly any task. They can be used to set up a testing environment, generate documentation, run tests, deploy changes to a server, or create plots. For example, this document is prepared using SCons.

A common use case for a build system that has very little to do with creating a program is generating a \(\LaTeX\) document. Such a document might contain a number of figures that are generated by code you have written. If you update the code that creates the figures, in general you need to re-create those figures and re-compile your document.

Make

Make is actually a family of programs that share a common philosophy. This document assumes GNU Make, although most concepts and much of the syntax should also apply to Microsoft's nmake program on Windows. GNU Make is a popular, widely used build system found on most Unix-style operating systems.

Make builds a program or programs by parsing a file (traditionally named Makefile) containing a number of rules. A rule consists of three parts: a target, which is the file to be built; zero or more dependencies, which are other files or targets that must be up to date before this one is built; and zero or more commands, which are instructions on how to create the target. A group of commands is called a recipe.
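The three parts of a rule can be pictured as a small data structure. The Python sketch below parses a tiny subset of Makefile syntax (a "target: dependencies" line followed by tab-indented commands) into such a structure. The Rule class, the parse_rules function, and the hello.* file names are all hypothetical, and GNU Make's real parser handles far more than this.

```python
from dataclasses import dataclass, field


@dataclass
class Rule:
    target: str
    dependencies: list = field(default_factory=list)
    recipe: list = field(default_factory=list)


def parse_rules(text):
    """Parse 'target: deps' lines, each followed by tab-indented commands."""
    rules = []
    for line in text.splitlines():
        if line.startswith("\t"):
            # A tab-indented line is a command belonging to the latest rule.
            if rules:
                rules[-1].recipe.append(line.strip())
        elif ":" in line:
            target, _, deps = line.partition(":")
            rules.append(Rule(target.strip(), deps.split()))
    return rules


# Hypothetical one-rule Makefile; "\t" is the tab that starts a command line.
makefile_text = "hello.o: hello.c hello.h\n\tcc -c hello.c\n"
rules = parse_rules(makefile_text)
print(rules[0])
```

Note that the leading tab is what marks a line as a command; this mirrors Make's own requirement that recipe lines begin with a tab character.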

A Makefile for the example from above might look like the following.

all: main

main.o: main.c fact.h
	gcc -c main.c

fact.o: fact.c fact.h
	gcc -c fact.c

main: main.o fact.o
	gcc -o main main.o fact.o

There are four targets in the Makefile: all, main.o, fact.o, and main. The all target has no recipe but lists main as a dependency. By convention, all is placed first because Make builds the first target in the Makefile when none is specified on the command line; the name itself has no special meaning to Make. The main.o target has two dependencies, main.c and fact.h, indicating that if either of these files is updated, the target needs to be recreated; when that happens, the recipe gcc -c main.c is executed to construct it. The fact.o and main targets are defined similarly.
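Make decides whether a target "needs to be recreated" by comparing file modification times. The Python sketch below is a simplified model of that check, not GNU Make's actual implementation (which also deals with phony targets, missing prerequisites, implicit rules, and more). The function name needs_rebuild is an assumption made for illustration.

```python
import os
import tempfile


def needs_rebuild(target, prerequisites):
    """Rebuild if the target is missing or older than any prerequisite."""
    if not os.path.exists(target):
        return True
    target_mtime = os.path.getmtime(target)
    return any(os.path.getmtime(p) > target_mtime for p in prerequisites)


# Demonstration in a throwaway directory with explicitly set timestamps.
with tempfile.TemporaryDirectory() as tmp:
    src = os.path.join(tmp, "main.c")
    obj = os.path.join(tmp, "main.o")

    open(src, "w").close()
    print(needs_rebuild(obj, [src]))  # main.o does not exist yet

    open(obj, "w").close()
    os.utime(src, (1000, 1000))
    os.utime(obj, (2000, 2000))
    print(needs_rebuild(obj, [src]))  # main.o is newer than main.c

    os.utime(src, (3000, 3000))
    print(needs_rebuild(obj, [src]))  # main.c was modified after main.o
```

The three calls print True, False, and True: the object file is rebuilt when it is missing or when its source is newer, and left alone otherwise.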

Makefiles support basic scripting, which can be used to reduce boilerplate code. We could rewrite the above Makefile as follows.

CC = gcc
CFLAGS = -Wall

all: main

main.o: main.c fact.h
	$(CC) $(CFLAGS) -c main.c

fact.o: fact.c fact.h
	$(CC) $(CFLAGS) -c fact.c

main: main.o fact.o
	$(CC) $(CFLAGS) -o main main.o fact.o

Defining the Makefile this way gives us a single place to add compiler flags for all generated code; here we have turned on compiler warnings. Aside from the added -Wall compiler option, the two Makefiles are equivalent.
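To see what a recipe line looks like after variable substitution, here is a small Python sketch that mimics the $(NAME) expansion in a simplified way. It performs a single, non-recursive pass, whereas Make's own expansion rules are richer; the expand function is a hypothetical helper, not part of Make.

```python
import re

# Variables as defined at the top of the Makefile.
variables = {"CC": "gcc", "CFLAGS": "-Wall"}


def expand(text, variables):
    """Replace each $(NAME) with its value; undefined names become empty."""
    return re.sub(
        r"\$\((\w+)\)",
        lambda match: variables.get(match.group(1), ""),
        text,
    )


print(expand("$(CC) $(CFLAGS) -c main.c", variables))
```

This prints gcc -Wall -c main.c, which is the command Make would run for the main.o target. Changing the value of CFLAGS in one place changes every command that references it.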

SCons

SCons is a build system that uses Python as a scripting language. Instead of Makefiles, SCons uses SConstruct files; these files are actually valid Python programs. SCons comes with many useful build instructions for creating programs in common languages, such as C, C++, or \(\LaTeX\), and lets you easily create your own. We could replicate the previous Makefile with

env = Environment(CC='gcc',
                  CCFLAGS='-Wall')

env.Program('main', ['main.c', 'fact.c'])

The lecture materials for this course are prepared using SCons; check out the source code using Git!