doc: Adding documentation for forthcoming changes in 0.4.0, updating local sphinx style, and reorganizing main index
jbohren committed Dec 16, 2015
1 parent 8d5e87f commit 3c76637
Showing 17 changed files with 10,748 additions and 350 deletions.
1,185 changes: 1,185 additions & 0 deletions docs/advanced/executor_events.dia


475 changes: 475 additions & 0 deletions docs/advanced/executor_events.svg
319 changes: 319 additions & 0 deletions docs/advanced/executor_job_lifecycle.svg
472 changes: 472 additions & 0 deletions docs/advanced/executor_job_resources.svg
196 changes: 196 additions & 0 deletions docs/advanced/job_executor.rst
@@ -0,0 +1,196 @@
The Catkin Execution Engine
===========================

One of the core modules in ``catkin_tools`` is the **job executor**. The
executor performs the jobs required to complete a task in a way that maximizes
(or achieves a specified level of) resource utilization, subject to job
dependency constraints.
The executor is closely integrated with logging and job output capture. This
page details the design and implementation of the executor.

Execution Model
^^^^^^^^^^^^^^^

The execution model is fairly simple. The executor executes a single **task**
for a given command (e.g. ``build`` or ``clean``). A **task** is a set of
**jobs** related by an acyclic dependency graph. Each **job** is
given a unique identifier and is composed of a set of dependencies and a
sequence of executable **stages**: arbitrary functions or subprocess calls
which require one or more **workers** to execute. The allocation of
workers is managed by the **job server**. Throughout execution, synchronization
with the user-facing interface and output formatting are mediated by a simple
**event queue**.

The executor is single-threaded and uses an asynchronous loop to execute jobs
as futures. If a job contains blocking stages, it can use a normal thread
pool for execution, but it is still only guaranteed one worker by the main loop of
the executor. See the following section for more information on workers and the
job server.

The input to the executor is a list of topologically-sorted jobs with no
circular dependencies, along with some parameters which control the job server's behavior.
These behavior parameters are explained in detail in the following section.

Each job is in one of the following lifecycle states at any time:

- ``PENDING`` Not ready to be executed (dependencies not yet completed)
- ``QUEUED`` Ready to be executed once workers are available
- ``ACTIVE`` Being executed by one or more workers
- ``FINISHED`` Has been executed and either succeeded or failed (terminal)
- ``ABANDONED`` Was not built because a prerequisite was not met (terminal)

.. figure:: executor_job_lifecycle.svg
:scale: 50 %
:alt: Executor Job Lifecycle

   **Executor Job Lifecycle**

All jobs begin in the ``PENDING`` state. Any jobs with unsatisfiable
dependencies are immediately set to ``ABANDONED``, and any jobs without
dependencies are immediately set to ``QUEUED``. After this state
initialization, the executor processes jobs in a main loop until all of them
are in one of the two terminal states (``FINISHED`` or ``ABANDONED``).

Each main loop iteration does the following:

- While job server tokens are available, create futures for ``QUEUED`` jobs
and make them ``ACTIVE``
- Report status of all jobs to the event queue
- Retrieve ``ACTIVE`` job futures which have completed and set them to
  ``FINISHED``
- Check for any ``PENDING`` jobs which need to be ``ABANDONED`` due to failed
jobs
- Change all ``PENDING`` jobs whose dependencies are satisfied to ``QUEUED``

Once every job is in one of the terminal states, the executor pushes a final
status event and returns.
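
The iteration above can be summarized with the following simplified sketch,
reusing the ``JobState`` enumeration sketched earlier; ``jobserver``,
``schedule``, and the job attributes are hypothetical names, not the exact
``catkin_tools`` API:

.. code-block:: python

   # A simplified sketch of one main-loop iteration (names are illustrative)
   def spin_once(jobs, jobserver, event_queue):
       # 1. Start QUEUED jobs while job server tokens are available
       for job in jobs:
           if job.state is JobState.QUEUED and jobserver.try_acquire():
               job.future = schedule(job)  # create an asyncio future
               job.state = JobState.ACTIVE

       # 2. Report the status of all jobs to the event queue
       event_queue.put(('JOB_STATUS', {job.jid: job.state for job in jobs}))

       # 3. Collect completed futures and mark them FINISHED
       for job in jobs:
           if job.state is JobState.ACTIVE and job.future.done():
               job.state = JobState.FINISHED
               jobserver.release()

       # 4. Abandon PENDING jobs downstream of failures, queue the rest
       for job in jobs:
           if job.state is JobState.PENDING:
               if any(dep.failed for dep in job.deps):
                   job.state = JobState.ABANDONED
               elif all(dep.state is JobState.FINISHED for dep in job.deps):
                   job.state = JobState.QUEUED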


Job Server Resource Model
^^^^^^^^^^^^^^^^^^^^^^^^^

As mentioned in the previous section, each task includes a set of jobs which
are activated by the **job server**. In order to start a queued job, at least
one worker needs to be available. Once a job is started, it is assigned a
single worker from the job server. These are considered **top-level jobs**
since they are managed directly by the catkin executor. The number of top-level
jobs can be configured for a given task.

In addition to this top-level parallelism, some job stages are themselves
capable of running in parallel. In such cases, the job server can interface directly
with the underlying stage's low-level job allocation. This enables multi-level
parallelism without exceeding a fixed total number of workers.


.. figure:: executor_job_resources.svg
:scale: 50 %
:alt: Executor job resources

   **Executor Job Flow and Resource Utilization** -- In this snapshot of the job pipeline, the executor is executing four of six possible top-level jobs, each with three stages, and using seven of eight total workers. Two jobs are executing subprocesses, which have side-channel communication with the job server.

One such parallel-capable stage is the GNU Make build stage. In this case, the
job server implements the GNU Make job server protocol, which involves reading
and writing tokens through file handles passed to the Make command via its
build flags.
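
As a sketch of this interface, the token pool can be a pre-filled pipe whose
file descriptors are advertised to ``make`` (in practice the flags are
propagated through the ``MAKEFLAGS`` environment variable, and the exact flag
syntax varies between Make versions; the class below is illustrative):

.. code-block:: python

   # A sketch of a GNU Make compatible job server: worker tokens are
   # single bytes in a pipe, shared with `make` through its jobserver flags.
   import os

   class MakeJobServer(object):
       def __init__(self, num_jobs):
           self.read_fd, self.write_fd = os.pipe()
           os.write(self.write_fd, b'+' * num_jobs)  # pre-load worker tokens

       def make_args(self):
           # Older Make versions use --jobserver-fds; newer ones use
           # --jobserver-auth. The subprocess must inherit the pipe fds.
           return ['-j', '--jobserver-fds=%d,%d' % (self.read_fd, self.write_fd)]

       def acquire(self):
           return os.read(self.read_fd, 1)  # blocks until a token is free

       def release(self, token=b'+'):
           os.write(self.write_fd, token)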

For top-level jobs, resources beyond the number of available workers are also
monitored. Both system load and memory utilization checks can be enabled to
prevent overloading a system.
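
A sketch of such a check, assuming user-configured thresholds and the
third-party ``psutil`` package for the memory reading (both assumptions for
illustration):

.. code-block:: python

   # A sketch of the optional top-level resource checks (illustrative)
   import os
   import psutil  # assumption: third-party package for memory stats

   def system_has_capacity(max_load, max_mem_percent):
       load_ok = os.getloadavg()[0] < max_load                     # 1-min load
       mem_ok = psutil.virtual_memory().percent < max_mem_percent  # used RAM %
       return load_ok and mem_ok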

Executor Job Failure Behavior
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

The executor's behavior when a job fails can be modified with the following two
parameters:

- ``continue_on_failure`` Continue executing jobs even if one job fails. If
this is set to ``false`` (the default), it will cause the executor to
abandon all pending and queued jobs and stop after the first failure. Note
that active jobs will still be allowed to complete before the executor
returns.
- ``continue_without_deps`` Continue executing jobs even if one
or more of their dependencies have failed. If this is set to ``false`` (the
default), it will cause the executor to abandon only the jobs which depend
on the failed job. If it is set to ``true``, then it will build dependent
jobs regardless.
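
A sketch of how these two parameters could gate the abandon decision in the
main loop (the names and structure are illustrative, not the actual
``catkin_tools`` code):

.. code-block:: python

   # A sketch of the failure-handling logic (illustrative only)
   def should_abandon(job, failed_jobs, continue_on_failure, continue_without_deps):
       if failed_jobs and not continue_on_failure:
           return True  # abandon everything after the first failure
       if not continue_without_deps and any(dep in failed_jobs for dep in job.deps):
           return True  # abandon only jobs downstream of a failure
       return False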


Jobs and Job Stages
^^^^^^^^^^^^^^^^^^^

As mentioned above, a **job** is a set of dependencies and a sequence of **job
stages**. Jobs and stages are constructed before a given task starts executing,
and hold only specifications of what needs to be done to complete them. Each
stage is given a label for user introspection and a logger interface, and may
or may not require the allocation of a worker from the job server.

Stage execution is performed asynchronously by Python's ``asyncio`` module.
This means that exceptions raised in job stages are handled directly by the
executor. It also means job stages can be interrupted easily through Python's
normal signal handling mechanism.

Stages can be either **command stages** (subprocess commands) or **function
stages** (Python functions). In either case, the loggers used by stages support
segmentation of ``stdout`` and ``stderr`` from job stages for both real-time
introspection and logging.
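
The structure described above might be sketched as follows; the class and
attribute names are illustrative, not the exact ``catkin_tools`` classes:

.. code-block:: python

   # A sketch of the job/stage specification structure (illustrative)
   class Stage(object):
       def __init__(self, label, occupy_job=True):
           self.label = label            # shown to the user during execution
           self.occupy_job = occupy_job  # whether a job server worker is needed

   class Job(object):
       def __init__(self, jid, deps, stages):
           self.jid = jid        # unique job identifier
           self.deps = deps      # identifiers of jobs this one depends on
           self.stages = stages  # Stage objects, executed in sequence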


Command Stages
~~~~~~~~~~~~~~~

In addition to the basic arguments mentioned above, command stages are
parameterized by the standard subprocess command arguments, including the
following:

- The command itself and its arguments
- The working directory for the command
- Any additional environment variables
- Whether to use a shell interpreter
- Whether to emulate a TTY
- Whether to partition ``stdout`` and ``stderr``

When executed, command stages use ``asyncio``'s asynchronous process executor
with a custom I/O protocol.
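
The basic shape of command stage execution might look like the following
sketch (the 2015-era implementation used generator-based coroutines; modern
``async``/``await`` syntax is shown here for clarity, and the ``logger``
interface is an assumption):

.. code-block:: python

   # A sketch of running a command stage as an asyncio subprocess
   import asyncio

   async def run_command_stage(cmd, cwd, env, logger):
       proc = await asyncio.create_subprocess_exec(
           *cmd, cwd=cwd, env=env,
           stdout=asyncio.subprocess.PIPE,
           stderr=asyncio.subprocess.PIPE)
       stdout, stderr = await proc.communicate()
       logger.out(stdout.decode())  # assumed logger interface
       logger.err(stderr.decode())
       return proc.returncode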

Function Stages
~~~~~~~~~~~~~~~

In addition to the basic arguments mentioned above, function stages are
parameterized by a function handle and a set of function-specific Python
arguments and keyword arguments. When executed, they use the thread pool
mentioned above.

Since function stages aren't subprocesses, their I/O isn't piped or redirected.
Instead, a custom I/O logger is passed to the function for output. Functions
used as function stages should use this logger to write to ``stdout`` and
``stderr`` instead of using normal system calls.
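
For example, a function stage that creates a directory might look like the
following sketch, where the ``logger`` interface is an assumption:

.. code-block:: python

   # A sketch of a function used as a function stage (illustrative)
   import os

   def make_directory_stage(logger, event_queue, path):
       if not os.path.exists(path):
           os.makedirs(path)
           logger.out('Created directory: %s' % path)  # instead of print()
       else:
           logger.out('Directory already exists: %s' % path)
       return 0  # stage return code, 0 indicates success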

Introspection via Executor Events
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Introspection into the different asynchronously-executed components of a task
is performed by a simple event queue. Events are created by the executor,
loggers, and stages, and they are consumed by an output controller. Events are
defined by an event identifier and a data payload, which is an arbitrary
dictionary.

There are numerous events which correspond to changes in job states, but events are also used for transporting captured I/O from job stages.
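
An event can therefore be as simple as the following sketch (the class name
and usage are illustrative):

.. code-block:: python

   # A sketch of the event structure: an identifier plus a payload dict
   class ExecutionEvent(object):
       def __init__(self, event_id, **data):
           self.event_id = event_id  # e.g. 'STARTED_JOB' or 'STDOUT'
           self.data = data          # arbitrary payload dictionary

   # e.g. a stage reporting captured output:
   # event_queue.put(ExecutionEvent('STDOUT', job_id='foo', data=output))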

.. figure:: executor_events.svg
:scale: 50 %
:alt: Executor Event Pipeline

**Executor Event Pipeline** -- Above, the executor writes events to the event queue, and the I/O loggers used by function and command stages write output events as well. All of these events are handled by the output controller, which writes to the real ``stdout`` and ``stderr``.

The modeled events include the following:

- ``JOB_STATUS`` A report of running job states
- ``QUEUED_JOB`` A job has been queued to be executed
- ``STARTED_JOB`` A job has started to be executed
- ``FINISHED_JOB`` A job has finished executing (succeeded or failed)
- ``ABANDONED_JOB`` A job has been abandoned for some reason
- ``STARTED_STAGE`` A job stage has started to be executed
- ``FINISHED_STAGE`` A job stage has finished executing (succeeded or failed)
- ``STAGE_PROGRESS`` A job stage has executed partially
- ``STDOUT`` A status message from a job
- ``STDERR`` A warning or error message from a job
- ``SUBPROCESS`` A subprocess has been created
- ``MESSAGE`` An arbitrary string message
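
On the consuming side, the output controller drains the queue and dispatches
on the event identifier, roughly as in this sketch (the handler bodies and the
sentinel convention are assumptions):

.. code-block:: python

   # A sketch of the output controller's consume loop (illustrative)
   import sys

   def run_output_controller(event_queue):
       while True:
           event = event_queue.get()
           if event is None:  # assumed sentinel: executor is done
               break
           if event.event_id == 'STDOUT':
               sys.stdout.write(event.data['data'])
           elif event.event_id == 'FINISHED_JOB':
               sys.stdout.write('Finished job: %s\n' % event.data['job_id'])
           # ... remaining event types are handled similarly
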
92 changes: 92 additions & 0 deletions docs/build_types.rst
@@ -0,0 +1,92 @@
Supported Build Types
=====================

The current release of ``catkin_tools`` supports building two types of packages:

- **Catkin** -- CMake packages that use the Catkin CMake macros
- **CMake** -- "Plain" CMake packages

There is currently limited support for adding other build types. For information
on extending ``catkin_tools`` to be able to build other types of packages, see
:doc:`Adding New Build Types <development/adding_build_types>`. Below are
details on the stages involved in building a given package for each of
the currently-supported build types.

Catkin
^^^^^^

Catkin packages are CMake packages which utilize the Catkin CMake macros for
finding packages and defining configuration files.

Configuration Arguments
-----------------------

- ``--cmake-args``
- ``--make-args``
- ``--catkin-make-args``

Build Stages
------------

============== ============ ==================================================
First          Subsequent   Description
============== ============ ==================================================
``mkdir``                   Create package build space if it doesn't exist.
--------------------------- --------------------------------------------------
``envgen``                  Generate environment setup file for building.
--------------------------- --------------------------------------------------
``cmake``      ``check``    Run the CMake configure step **once** for the
                            first build and the ``cmake_check_build_system``
                            target for subsequent builds, unless the
                            ``--force-cmake`` argument is given.
-------------- ------------ --------------------------------------------------
``preclean``   `optional`   Run the ``clean`` target before building.
                            This is only done with the ``--pre-clean`` option.
--------------------------- --------------------------------------------------
``make``                    Build the default target with GNU Make.
--------------------------- --------------------------------------------------
``install``    `optional`   Run the ``install`` target after building.
                            This is only done with the ``--install`` option.
============== ============ ==================================================

CMake
^^^^^

Configuration Arguments
-----------------------

- ``--cmake-args``
- ``--make-args``

Build Stages
------------

============== ============ ==================================================
First          Subsequent   Description
============== ============ ==================================================
``mkdir``                   Create package build space if it doesn't exist.
--------------------------- --------------------------------------------------
``envgen``                  Generate environment setup file for building.
--------------------------- --------------------------------------------------
``cmake``      ``check``    Run the CMake configure step **once** for the
                            first build and the ``cmake_check_build_system``
                            target for subsequent builds, unless the
                            ``--force-cmake`` argument is given.
-------------- ------------ --------------------------------------------------
``preclean``   `optional`   Run the ``clean`` target before building.
                            This is only done with the ``--pre-clean`` option.
--------------------------- --------------------------------------------------
``make``                    Build the default target with GNU Make.
--------------------------- --------------------------------------------------
``install``                 Run the ``install`` target after building, and
                            install products to the **devel space**. If the
                            ``--install`` option is given, products are
                            installed to the **install space** instead.
--------------------------- --------------------------------------------------
``setupgen``                Generate a ``setup.sh`` file if necessary.
============== ============ ==================================================

