execo_engine

Overview

The execo_engine module provides tools for the development of experiments.

Parameter sweeping

sweep

execo_engine.sweep.sweep(parameters)

Generates all combinations of parameters.

The aim of this function is, given a list of experiment parameters (named factors) and, for each parameter (factor), the list of its possible values (named levels), to generate the cartesian product of all parameter values, for a full factorial experimental design (R. Jain, The Art of Computer Systems Performance Analysis, Wiley, 1991).

More formally: given a dict associating factors (keys) with the list of their possible levels (values), this function returns a list of dicts corresponding to the cartesian product of all level combinations. Each dict in the returned list associates each factor (key) with one of its possible levels (value).

In the given factors dict, if the value associated with a factor X (key) is a dict instead of a list of levels, then the keys of this sub-dict are used as the levels of factor X, and each value of the sub-dict must itself be a dict of factor / levels combinations, which will be explored only for the corresponding level of factor X. This is a kind of recursive sub-sweep: it allows exploring some factor / level combinations only for some levels of a given factor.
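The expansion described above, including the recursive sub-sweep, can be sketched in plain Python. This is an illustration only, not the implementation of execo_engine.sweep.sweep: the function name expand is hypothetical, it returns plain dicts, and the order of the returned combinations may differ from the library's.

```python
import itertools


def expand(factors):
    """Return a list of dicts, one per combination of factor levels.

    Hypothetical helper: a value that is a dict is treated as a
    sub-sweep whose keys are the levels of the factor.
    """
    per_factor = []
    for factor, levels in factors.items():
        if isinstance(levels, dict):
            # Sub-sweep: explore the nested factors only for this level.
            partials = []
            for level, sub_factors in levels.items():
                for sub_comb in expand(sub_factors):
                    comb = {factor: level}
                    comb.update(sub_comb)
                    partials.append(comb)
        else:
            # Plain list of levels: one partial combination per level.
            partials = [{factor: level} for level in levels]
        per_factor.append(partials)

    # Cartesian product of the partial combinations of every factor.
    combinations = []
    for choice in itertools.product(*per_factor):
        merged = {}
        for partial in choice:
            merged.update(partial)
        combinations.append(merged)
    return combinations


flat = expand({"param 1": ["a", "b"], "param 2": [1, 2]})
nested = expand({
    "param 1": ["a", "b"],
    "param 2": {
        1: {"param 1 1": ["x", "y"], "param 1 2": [0.0, 1.0]},
        2: {"param 2 1": [-10, 10]},
    },
})
```

With these inputs, flat holds the 4 combinations of a 2 x 2 design, and nested holds 12 combinations: for each of the 2 levels of "param 1", 4 combinations under level 1 of "param 2" and 2 under level 2.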

The returned list contains execo_engine.sweep.HashableDict instances instead of plain dicts. HashableDict is a simple subclass of dict, so that parameter combinations can be used as dict keys (but do not modify them in such cases).

Examples:

>>> sweep({
...     "param 1": ["a", "b"],
...     "param 2": [1, 2]
...     })
[{'param 1': 'a', 'param 2': 1}, {'param 1': 'a', 'param 2': 2}, {'param 1': 'b', 'param 2': 1}, {'param 1': 'b', 'param 2': 2}]
>>> sweep({
...     "param 1": ["a", "b"],
...     "param 2": {
...         1: {
...             "param 1 1": [ "x", "y" ],
...             "param 1 2": [ 0.0, 1.0 ]
...             },
...         2: {
...             "param 2 1": [ -10, 10 ]
...             }
...         }
...     })
[{'param 1 2': 0.0, 'param 1 1': 'x', 'param 1': 'a', 'param 2': 1}, {'param 1 2': 0.0, 'param 1 1': 'y', 'param 1': 'a', 'param 2': 1}, {'param 1 2': 1.0, 'param 1 1': 'x', 'param 1': 'a', 'param 2': 1}, {'param 1 2': 1.0, 'param 1 1': 'y', 'param 1': 'a', 'param 2': 1}, {'param 2 1': -10, 'param 1': 'a', 'param 2': 2}, {'param 2 1': 10, 'param 1': 'a', 'param 2': 2}, {'param 1 2': 0.0, 'param 1 1': 'x', 'param 1': 'b', 'param 2': 1}, {'param 1 2': 0.0, 'param 1 1': 'y', 'param 1': 'b', 'param 2': 1}, {'param 1 2': 1.0, 'param 1 1': 'x', 'param 1': 'b', 'param 2': 1}, {'param 1 2': 1.0, 'param 1 1': 'y', 'param 1': 'b', 'param 2': 1}, {'param 2 1': -10, 'param 1': 'b', 'param 2': 2}, {'param 2 1': 10, 'param 1': 'b', 'param 2': 2}]

ParamSweeper

class execo_engine.sweep.ParamSweeper(persistence_dir, sweeps=None, save_sweeps=False, name=None)

Multi-process-safe, thread-safe and persistent iterable container to iterate over a list of experiment parameters (or whatever, actually).

The aim of this class is to provide a convenient way to iterate over several experiment configurations (or anything else). It is an iterable container with the following characteristics:

  • each element of the iterable is in one of four states:
    • todo
    • inprogress
    • done
    • skipped
  • at the beginning, each element is in state todo
  • client code can mark any element done or skipped
  • when iterating over it, you always get the next item in todo state
  • this container automatically persists the element states done and inprogress (but not the state skipped) to disk: if you later instantiate a container with the same persistence directory (whose path is given to the constructor), then the elements of the container are taken from the constructor argument, but the states done and inprogress are loaded from the persisted state.
  • this container is thread-safe and multi-process-safe. Multiple threads can concurrently use a single ParamSweeper object. Multiple threads or processes on the same or different hosts can concurrently use several ParamSweeper instances sharing the same persistence directory. With sufficiently recent Linux kernels and NFS servers / clients, it will work on shared NFS storage (the current implementation uses Python flock, which should work since kernel 2.6.12; see http://nfs.sourceforge.net/#faq_d10). All threads sharing a ParamSweeper instance synchronize through in-process locks, and all threads / processes with different instances of ParamSweeper sharing the same persistence directory synchronize through the persisted state.
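The inter-process synchronization mentioned in the last item relies on flock. As a minimal sketch of that locking primitive (not of ParamSweeper's actual code), the following uses the standard fcntl.flock call to serialize read-modify-write access to a file. The helper names exclusive_lock and increment_counter and the counter file are hypothetical, and the example only runs on Unix-like systems.

```python
import fcntl
import os
import tempfile
from contextlib import contextmanager


@contextmanager
def exclusive_lock(path):
    """Open `path` (creating it if needed) and hold an exclusive flock on it."""
    fd = os.open(path, os.O_RDWR | os.O_CREAT)
    with os.fdopen(fd, "r+") as handle:
        fcntl.flock(handle, fcntl.LOCK_EX)  # blocks until the lock is granted
        try:
            yield handle
        finally:
            # Flush buffered writes before other lock holders can read the file.
            handle.flush()
            fcntl.flock(handle, fcntl.LOCK_UN)


def increment_counter(path):
    """Read-modify-write a counter file while holding the lock."""
    with exclusive_lock(path) as handle:
        text = handle.read().strip()
        value = int(text) + 1 if text else 1
        handle.seek(0)
        handle.truncate()
        handle.write(str(value))
        return value


counter_path = os.path.join(tempfile.mkdtemp(), "counter")
first = increment_counter(counter_path)
second = increment_counter(counter_path)
```

Because flock locks are advisory, this only protects against other processes that take the same lock before touching the file.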

This container is intended to be used in the following way: at the beginning of the experiment, you initialize a ParamSweeper with the list of experiment configurations (which can result from a call to execo_engine.sweep.sweep, but not necessarily) and a directory for the persistence. During execution, you request (possibly from several concurrent threads or processes) new experiment configurations with execo_engine.sweep.ParamSweeper.get_next, and mark them done or skipped with execo_engine.sweep.ParamSweeper.done and execo_engine.sweep.ParamSweeper.skip. At a later date, you can relaunch the same script: it will continue from where it left off, also retrying the skipped configurations. This works well when used with execo_engine.engine.Engine startup option -c (continue experiment in a given directory).

execo_engine.sweep.ParamSweeper.skip is intended to be used when you have obtained a combination with execo_engine.sweep.ParamSweeper.get_next, the processing of this combination has failed, and you don’t want to retry it later in the same run of your script.

The skipped state is not written to disk, however: if you relaunch your script later, the previously skipped combinations will be iterated over again.

This is intended for situations where some combinations cannot be processed because, for example, there are not enough available resources to process them. You may then want to relaunch your script later (after making sure more resources are available) to process these skipped combinations.

If there are some combinations that you never want to retry, even in subsequent runs of your script, then mark them as done with execo_engine.sweep.ParamSweeper.done.

If there is a combination that you failed to process but you want to retry it in the current run of your script, mark it as canceled with execo_engine.sweep.ParamSweeper.cancel.
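The choice between done and skip described above can be sketched as a processing loop. The loop only calls get_next, done and skip, which are documented in this section. So that the sketch runs without execo installed, it uses a hypothetical in-memory stand-in (InMemorySweeper) that has no persistence and no locking; the ResourcesUnavailable exception and the process callback are also assumptions made for illustration.

```python
class InMemorySweeper:
    """Hypothetical stand-in exposing get_next / done / skip (no persistence)."""

    def __init__(self, sweeps):
        self.todo = list(sweeps)
        self.inprogress = []
        self.finished = []
        self.skipped = []

    def get_next(self):
        # Return the next todo element, or None when nothing is left.
        if not self.todo:
            return None
        combination = self.todo.pop(0)
        self.inprogress.append(combination)
        return combination

    def done(self, combination):
        self.inprogress.remove(combination)
        self.finished.append(combination)

    def skip(self, combination):
        self.inprogress.remove(combination)
        self.skipped.append(combination)


class ResourcesUnavailable(Exception):
    """Hypothetical error: the combination cannot be processed right now."""


def run_all(sweeper, process):
    """Process every todo combination, marking each one done or skipped."""
    while True:
        combination = sweeper.get_next()
        if combination is None:
            break
        try:
            process(combination)
        except ResourcesUnavailable:
            # Not retried in this run, but retried when the script is relaunched.
            sweeper.skip(combination)
        else:
            sweeper.done(combination)


def process(combination):
    if combination["nodes"] > 4:
        raise ResourcesUnavailable(combination)


sweeper = InMemorySweeper([{"nodes": 2}, {"nodes": 8}, {"nodes": 4}])
run_all(sweeper, process)
```

After the loop, the stand-in holds the combinations with 2 and 4 nodes as finished and the one with 8 nodes as skipped.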

State inprogress is stored on disk to prevent concurrent processes from getting the same elements from different ParamSweeper instances (to avoid duplicating work). In some cases (for example if a process has crashed without marking an element done or skipped or canceling it), you may want to reset the inprogress state. This can be done by removing the file inprogress in the persistence directory (this can even be done while some ParamSweeper instances are instantiated and using it).

ParamSweeper handles crashes in the following way: if it crashes (or is killed) while synchronizing to disk, in the worst case, the current element marked done can be lost (i.e. other ParamSweeper instances or later instantiations will not see it marked done), or the whole list of inprogress elements can be lost.

The ParamSweeper code assumes that in typical usage, there may be a huge number of elements to iterate on (and accordingly, the list of done elements will grow huge too), and that the number of inprogress elements will stay reasonably low. The whole iterable of elements is (optionally) written to disk only once, at ParamSweeper construction. The set of done elements can only grow and is incrementally appended. The set of inprogress elements is fully read from and written to disk at each operation, so this may become a bottleneck to ParamSweeper performance if the set of inprogress elements is big.
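The difference between the two write patterns can be sketched with the standard pickle module. This is a toy model under assumptions: the actual on-disk layout of the done and inprogress files is not described here, and the helper names append_done, load_done and save_inprogress are hypothetical.

```python
import os
import pickle
import tempfile


def append_done(path, element):
    """Append one pickled element: cost is independent of the file size."""
    with open(path, "ab") as handle:
        pickle.dump(element, handle)


def load_done(path):
    """Read back every pickled element appended so far."""
    elements = []
    try:
        with open(path, "rb") as handle:
            while True:
                try:
                    elements.append(pickle.load(handle))
                except EOFError:
                    break
    except FileNotFoundError:
        pass  # nothing has been marked done yet
    return elements


def save_inprogress(path, elements):
    """Rewrite the whole collection: cost grows with the number of elements."""
    with open(path, "wb") as handle:
        pickle.dump(list(elements), handle)


def load_inprogress(path):
    try:
        with open(path, "rb") as handle:
            return pickle.load(handle)
    except FileNotFoundError:
        return []


state_dir = tempfile.mkdtemp()
done_path = os.path.join(state_dir, "done")
inprogress_path = os.path.join(state_dir, "inprogress")

append_done(done_path, {"size": 1})
append_done(done_path, {"size": 2})
save_inprogress(inprogress_path, [{"size": 3}])
```

Appending keeps each update cheap however large the done set becomes, whereas the full rewrite is only reasonable while the inprogress set stays small.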

Parameters:
  • persistence_dir – path to the persistence directory. Two Python pickle files are created in this directory: done and inprogress. These files can be erased if needed.
  • sweeps – An iterable, what to iterate on. If None (default), try to load it from persistence_dir
  • save_sweeps – boolean. default False. If True, the sweeps are written to disk during initialization (this may take some time but occurs only once)
  • name – a convenient name to identify an instance in logs. If None, compute one from persistence_dir.
cancel(combination)

Cancel processing of the given combination without marking it as skipped: it goes back into the todo queue.

cancel_batch(combinations)

Cancel processing of the given combination(s) without marking them as skipped: they go back into the todo queue.

done(combination)

mark the given element done

done_batch(combinations)

mark the given element(s) done

full_update()

Completely reload the ParamSweeper state from disk (may take some time).

get_done()

returns an iterable of currently done elements

The returned iterable is a copy (safe to use without fearing concurrent mutations by another thread).

get_inprogress()

returns an iterable of elements currently processed (which were obtained by a call to execo_engine.sweep.ParamSweeper.get_next, not yet marked done or skipped)

The returned iterable is a copy (safe to use without fearing concurrent mutations by another thread).

get_next(filtr=None)

Return the next element which is todo. Returns None if the end has been reached.

Parameters: filtr – a filter function. If not None, this filter takes the iterable of remaining elements and returns a filtered iterable. It can be used to filter out some combinations and / or control the order of iteration.
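As an illustration of the filtr contract (an iterable of remaining elements in, a filtered and possibly reordered iterable out), here is a hypothetical filter function. The factor name "size" and the threshold are assumptions made for the example; such a function would be passed as the filtr argument of get_next.

```python
def small_sizes_first(remaining):
    """Keep only combinations with size <= 100, smallest size first."""
    eligible = (comb for comb in remaining if comb["size"] <= 100)
    return sorted(eligible, key=lambda comb: comb["size"])


# Applied directly to a sample of remaining combinations:
remaining = [{"size": 500}, {"size": 100}, {"size": 10}]
filtered = small_sizes_first(remaining)
```

Here filtered contains the combinations of size 10 and 100, in that order, and the combination of size 500 is left out.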
get_next_batch(num_combs, filtr=None)

Return the next elements which are todo.

Parameters:
  • num_combs – how many combinations to get. An array of combinations is returned. The size of the array is <= num_combs and is limited by the number of remaining combinations available
  • filtr – a filter function. If not None, this filter takes the iterable of remaining elements and returns a filtered iterable. It can be used to filter out some combinations and / or control the order of iteration.
get_remaining()

returns an iterable of current remaining todo elements

The returned iterable is a copy (safe to use without fearing concurrent mutations by another thread).

get_skipped()

returns an iterable of current skipped elements

The returned iterable is a copy (safe to use without fearing concurrent mutations by another thread).

get_sweeps()

Returns the iterable of what to iterate on

reset(reset_inprogress=False)

Reset the container: iteration will start again from the beginning. The skipped states are forgotten, the done states are kept.

Parameters: reset_inprogress – default False. If True, state inprogress is also reset.
set_sweeps(sweeps=None, save_sweeps=False)

Change the list of what to iterate on.

Parameters:
  • sweeps – iterable
  • save_sweeps – boolean. default False. If True, the sweeps are written to disk.
skip(combination)

mark the given element skipped

skip_batch(combinations)

mark the given element(s) skipped

stats()

Atomically return the tuple (sweeps, remaining, skipped, inprogress, done)

update()

Incrementally update the ParamSweeper state from disk.

Fast, except if the done file has been truncated or deleted. In this case, a full_update is triggered.

sweep_stats

execo_engine.sweep.sweep_stats(stats)

Takes the stats tuple returned by execo_engine.sweep.ParamSweeper.stats and, if the ParamSweeper sweeps are in the format output by execo_engine.sweep.sweep, returns a dict detailing the number and ratios of remaining, skipped, done and inprogress combinations per combination parameter value.

geom

execo_engine.sweep.geom(range_min, range_max, num_steps)

Return a geometric progression from range_min to range_max in num_steps steps.

igeom

execo_engine.sweep.igeom(range_min, range_max, num_steps)

Return an integer geometric progression from range_min to range_max in num_steps steps.
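For readers unfamiliar with geometric progressions as factor levels, the following sketch shows one way to compute them. It is not the code of geom or igeom: the function names are hypothetical, and it assumes that both bounds are included, that range_min is nonzero, and that the integer variant rounds the values and drops duplicates.

```python
import math


def geometric_steps(range_min, range_max, num_steps):
    """Return num_steps values from range_min to range_max with a constant ratio."""
    if num_steps <= 0:
        return []
    if num_steps == 1:
        return [float(range_min)]
    # Constant ratio between two consecutive values.
    ratio = (float(range_max) / float(range_min)) ** (1.0 / (num_steps - 1))
    return [range_min * ratio ** i for i in range(num_steps)]


def integer_geometric_steps(range_min, range_max, num_steps):
    """Rounded variant; duplicates created by rounding are removed."""
    values = geometric_steps(range_min, range_max, num_steps)
    return sorted({int(round(value)) for value in values})


levels = geometric_steps(1, 1000, 4)
int_levels = integer_geometric_steps(1, 1000, 4)
```

With these arguments the ratio is 10, so the levels are approximately 1, 10, 100 and 1000. Such a progression is convenient for parameters whose effect is better explored on a logarithmic scale, such as message or file sizes.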

Engine

The execo_engine.engine.Engine class hierarchy is the base for reusable experiment engines: execo_engine.engine.Engine is the base class for all classes which act as experiment engines.

Engine

class execo_engine.engine.Engine

Bases: object

Basic class for execo Engine.

Subclass it to develop your own engines, possibly reusable.

This class offers basic facilities:

  • central handling of options and arguments
  • automatic experiment directory creation
  • various ways to handle stdout / stderr
  • support for continuing a previously stopped experiment
  • log level selection

A subclass of Engine can access the following member variables, which are automatically defined and initialized at the right time by the base class execo_engine.engine.Engine: args, args_parser, engine_dir, result_dir and run_name.

A subclass of Engine can override the following methods: init, run, setup_run_name and setup_result_dir.

A typical, non-reusable engine would start by adding options / arguments to execo_engine.engine.Engine.args_parser in __init__(), then override execo_engine.engine.Engine.init to perform further initialization if needed. It would then implement all the experiment code by overriding execo_engine.engine.Engine.run. This ensures that all initialization steps are performed by the engine before the experiment runs: results directory is initialized and created (possibly reusing a previous results directory, to restart from a previously stopped experiment), log level is set, stdout / stderr are redirected as needed, and options and arguments are in execo_engine.engine.Engine.args.

Example engine with custom command-line arguments:

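import execo_engine

class MyEngine(execo_engine.Engine):
    def __init__(self):
        super(MyEngine, self).__init__()
        # Register a custom command-line option on the engine's parser.
        self.args_parser.add_argument('--myoption', default='foo',
                                      help='An option to control how the experiment is done')
    def run(self):
        # Parsed options are available in self.args once run() is called.
        if self.args.myoption == "foo":
            ...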

A typical use of execo_engine.sweep.ParamSweeper in an engine is to initialize an instance at the beginning of execo_engine.engine.Engine.run, using a persistence directory located in the results directory. Example code:

def run(self):
    sweeps = sweep({<parameters and their values>})
    # persistence_dir is the first constructor argument, sweeps the second
    sweeper = ParamSweeper(os.path.join(self.result_dir, "sweeps"), sweeps)
    [...]
args = None

Arguments and options given on the command line. Available after the command line has been parsed, in execo_engine.engine.Engine.run (not in execo_engine.engine.Engine.init).

args_parser = None

Subclasses of execo_engine.engine.Engine can register options and args to this options parser in __init__().

engine_dir = None

Full path of the engine directory, if available. May not be available if engine code is run interactively.

init()

Experiment init method

Override this method with the experiment init code. Default implementation does nothing.

The base class execo_engine.engine.Engine takes care that all execo_engine.engine.Engine.init methods of its subclass hierarchy are called, in the order ancestor method before subclass method. This order is chosen so that generic engines inheriting from execo_engine.engine.Engine can easily implement common functionalities. For example, a generic engine can declare its own options and arguments in init, which will be executed before the init method of a particular experiment subclass.
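The ancestor-before-subclass call order can be illustrated with a small standalone sketch based on the method resolution order. It does not reproduce how execo_engine.engine.Engine is implemented: the classes BaseEngine, GenericEngine and MyExperiment and the method _call_init_chain are hypothetical.

```python
class BaseEngine:
    """Hypothetical base class calling every init() of the hierarchy."""

    def __init__(self):
        self.calls = []

    def init(self):
        pass  # default implementation does nothing

    def _call_init_chain(self):
        # Walk the method resolution order from the most generic class down
        # to the concrete class, calling each init defined at that level.
        for cls in reversed(type(self).__mro__):
            method = cls.__dict__.get("init")
            if method is not None:
                method(self)


class GenericEngine(BaseEngine):
    def init(self):
        self.calls.append("generic")


class MyExperiment(GenericEngine):
    def init(self):
        self.calls.append("experiment")


engine = MyExperiment()
engine._call_init_chain()
```

After the call, engine.calls is ["generic", "experiment"]: the subclasses do not need to call super().init() themselves for the generic engine's code to run first.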

result_dir = None

Full path to the current engine’s execution results directory, where results should be written, where stdout / stderr are output, where execo_engine.sweep.ParamSweeper persistence files should be written, and more generally where any file pertaining to a particular execution of the experiment should be located.

run()

Experiment run method

Override this method with the experiment code. Default implementation does nothing.

The base class execo_engine.engine.Engine takes care that all execo_engine.engine.Engine.run methods of its subclass hierarchy are called, in the order ancestor method before subclass method. This order is chosen so that generic engines inheriting from execo_engine.engine.Engine can easily implement common functionalities. For example, a generic engine can prepare an experiment environment in its run method, which will be executed before the run method of a particular experiment subclass.

run_name = None

Name of the current experiment. If you want to modify it, override execo_engine.engine.Engine.setup_run_name.

setup_result_dir()

Set the experiment run result directory name.

Default implementation: subdirectory with the experiment run name in the current directory. Override this method to change the name of the result directory. This method is called before execo_engine.engine.Engine.run and execo_engine.engine.Engine.init. Note that if option -c is given to the engine, the name given on command line will take precedence, and this method won’t be called.

setup_run_name()

Set the experiment run name.

Default implementation: concatenation of class name and date. Override this method to change the name of the experiment. This method is called before execo_engine.engine.Engine.run and execo_engine.engine.Engine.init.

start(engineargs=['build_doc'])

Start the engine.

Properly initialize the experiment Engine instance, then call the init() method of all subclasses, then pass the control to the overridden run() method of the requested experiment Engine.

Misc

slugify

execo_engine.utils.slugify(value)

Normalizes string representation, converts to lowercase, removes non-alpha characters, and converts spaces to hyphens.

Intended to convert any object having a relevant string representation to a valid filename.

More or less inspired by / copied from Django (see http://stackoverflow.com/questions/295135/turn-a-string-into-a-valid-filename-in-python).
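The normalization steps listed above can be sketched with the standard library, in the spirit of the Django-style recipe. The function name to_slug is hypothetical and the exact character rules of execo_engine.utils.slugify may differ.

```python
import re
import unicodedata


def to_slug(value):
    """Turn the string representation of `value` into a filename-friendly slug."""
    text = str(value)
    # Decompose accented characters, then drop anything that is not ASCII.
    text = unicodedata.normalize("NFKD", text)
    text = text.encode("ascii", "ignore").decode("ascii")
    # Remove characters that are neither word characters, whitespace nor hyphens.
    text = re.sub(r"[^\w\s-]", "", text).strip().lower()
    # Collapse runs of whitespace and hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", text)


title_slug = to_slug("Hello World!")
comb_slug = to_slug({"a": 1})
```

Here title_slug is "hello-world" and comb_slug is "a-1", which shows how a parameter combination can be turned into a name usable for a per-combination output file.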

logger

Default and convenient logger for engines. Inherits its properties from the execo logger.

execo_engine.log.logger

HashableDict

class execo_engine.sweep.HashableDict

Hashable dictionary. Beware: it must not be mutated after its first use as a key.
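The idea of a hashable dict, and the reason for the warning about mutation, can be sketched as follows. This is an illustration, not the code of execo_engine.sweep.HashableDict: the class name FrozenKeyDict is hypothetical, and it assumes that all values are themselves hashable.

```python
class FrozenKeyDict(dict):
    """Hypothetical dict subclass usable as a dict key or set member."""

    def __hash__(self):
        # The hash depends only on the items, not on their insertion order.
        return hash(frozenset(self.items()))


results = {}
results[FrozenKeyDict({"param 1": "a", "param 2": 1})] = 0.42

# An equal combination, built in a different order, finds the same entry.
lookup = results[FrozenKeyDict({"param 2": 1, "param 1": "a"})]
```

Mutating a key after insertion would change its hash, so the entry could no longer be found in the containing dict. That is why combinations used as keys must be left unmodified.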

redirect_outputs

execo_engine.utils.redirect_outputs(stdout_filename, stderr_filename)

Redirects, and optionally merges, stdout and stderr to files.

copy_outputs

execo_engine.utils.copy_outputs(stdout_filename, stderr_filename)

Copies, and optionally merges, stdout and stderr to file(s).