SCons Taskmaster

The Taskmaster is responsible for taking the directed acyclic graph (DAG) created during the parse phase, evaluating it, and arranging for the execution of any commands needed to bring the project up to date. As part of its processing, it scans source files for implicit dependencies and adds them to the DAG. To determine if something needs to be built or rebuilt, it compares the signatures of the antecedents; if they have changed from the last run, the Taskmaster runs a command and then generates new signatures.

Processing

The Taskmaster is called from the Jobs module. The Jobs module pulls a batch of tasks (a subclass of Task wraps a runnable Node) and runs the batch to completion. The Jobs module repeats this process until the Taskmaster tells it that there are no more runnable tasks. If the Jobs module is in parallel mode, the batch size is set by the -j parameter on the command line. If the number of jobs is one (the default) or if the Python implementation does not support parallel execution, the Jobs module operates in serial mode and the batch size is one.

When the Taskmaster is instantiated, it is given the specific class to use to wrap the runnable nodes. This class is normally a subclass of the Task class (also defined in the Taskmaster module) and contains the logic to process a node. Subclasses are used to cause the Taskmaster to operate in different modes; the subclass provides the behavior that is specific to the mode. Different subclasses are used for a standard build (BuildTask), for cleaning targets (CleanTask), for determining if a build is up-to-date (QuestionTask), and for configuring (SConfBuildTask).

Running the DAG

The classic technique for processing elements in a DAG is called a topological sort. A topological sort of a DAG is a linear listing of its nodes such that each node comes before any other node that depends on it. For greater detail about topological sorting, look at the Wikipedia entry for topological sort. There are two classic algorithms: if the graph is pre-built, a recursive descent (also called graph coloring) approach may be used; if only the relationship information is available (pairs of nodes where one must precede the other), a traversal tree approach can be used, which evaluates a work queue of nodes whose predecessors have been evaluated (each evaluation potentially releases additional nodes into the work queue).

For the recursive descent approach, each node is visited exactly once, so the run time is linear in the number of nodes (i.e., O(n)). For the traversal tree approach, each relationship link must be used to build the traversal tree and then each node is visited exactly once, so the run time is linear in the number of nodes plus the number of dependencies (i.e., O(n+d)).

In general, the parse phase creates a larger DAG than is used by any specific build. Because of this, and because of the need to handle generated header files and scanned implicit dependencies, the SCons Taskmaster does not use a topological sort directly. Instead of sorting all of the Nodes in the DAG, the Taskmaster starts walking the DAG at a given Node (one of the targets specified on the command line) and looks for child Nodes (sources or dependencies, implicit or explicit) that are ready to be built. In essence, it's a recursive-descent sort that stops when it finds the first unevaluated Node.

The Taskmaster keeps a list of candidates (analogous to the "work queue" of a traversal-tree topological sort), the Nodes that the Taskmaster is ready to examine either because each was an explicitly-specified top-level target, or because they're somewhere down the walk of the tree from those top-level targets--that is, children (or children of children of children...) of a target that SCons has been requested to make sure is up-to-date (either because it needs to be built or because it already was up-to-date). This optimization so that the full recursive descent does not need to be done means that often only a small subset of the full DAG needs to be visited on each call.

The Taskmaster only puts Nodes back on the candidates list after the children for which it is waiting complete their builds. Putting a Node back on the candidate list will cause it (and/or its sources) to be re-scanned, if necessary. This is how implicit dependencies within generated header files get detected, because generated files can get re-scanned as necessary. (For efficiency, source files that haven't changed--been built--since the last time they were scanned can just return their previous results, but that's an optimization handled within the Node layer itself.)

The Taskmaster is only concerned with the order in which Nodes are evaluated. It does not itself look at any signature information to decide whether a Node being evaluated is up to date or needs to be rebuilt; that is delegated to the Task class, once the Taskmaster has decided that all of its dependencies have been satisfied.

If a build fails and the -k flag causes evaluations to continue, the Node is marked as poisoned. When a subsequent Node is evaluated and finds an antecedent that is poisoned, that Node is also marked poisoned and the execution of its command is skipped.

Before a Node is selected for execution, it is checked to see if any of its side-effects conflict with the side-effects of any Tasks currently being built. If there are any conflicts, the Node's execution is delayed until all side-effects are cleared.

The Taskmaster algorithm only runs in one thread and is not itself parallelized. That means child detection and reference counts do not need to be thread-safe. What is parallelized (that is, handled by separate threads when the -j option is used) is the evaluation of whether the Node is up-to-date and the execution of its build, if necessary.

It's worth noting that the Jobs module calls the Taskmaster once for each node to be processed (i.e., it's O(n)) and the Taskmaster has an amortized performance of O(n) each time it's called. Thus, the overall time is O(n^2). It takes a pathological DAG to achieve the worst-case performance, but it occasionally happens in practice.

Scanning for implicit dependencies

Running scanners; adding to the graph when a scanner reports a dependency.

(Future?) Extending the graph based on the result of executing a command

The restrictions imposed so that this can work.

Dealing with signatures

Signature types, calculating signatures, comparing signatures.


Taskmaster Module

The Taskmaster class

Here are the steps performed by the Taskmaster to select the next job to run:

The Task class

(((TODO: Tasks and their entry points.)))

The Stats class

(((TODO: Statistics accumulated by the Taskmaster)))


Jobs Module

The Jobs class

(((redirects to either Serial or Parallel)))

The Serial class

(((does synchronous execution)))

The Parallel class

(((does parallel execution)))


Script Module (partial)

(((entry points for SCons)))

Initialization

(((setup)))

Main Sub-Module

(((contains definitions of three subclasses of Task)))

(((instantiates Jobs and Taskmaster; runs them)))

SConscript Sub-Module

(((API support for SConscripts)))


This is a work in progress. Click on the "Edit(Text)" link below, and let's start discussing what needs to go in this section.

DeveloperGuide/TaskMaster (last edited 2008-03-12 02:47:10 by localhost)