execo_engine¶
Overview¶
The execo_engine module provides tools for the development of experiments.
Parameter sweeping¶
sweep¶
-
execo_engine.sweep.
sweep
(parameters)¶ Generates all combinations of parameters.
The aim of this function is, given a list of experiment parameters (named factors), and for each parameter (factor), the list of their possible values (named levels), to generate the cartesian product of all parameter values, for a full factorial experimental design (The Art Of Computer Systems Performance Analysis, R. Jain, Wiley 1991).
More formally: given a dict associating factors (keys) with the lists of their possible levels (values), this function returns a list of dicts corresponding to the cartesian product of all level combinations. Each dict in the returned list associates each factor (key) with one of its possible levels (value).
In the given factors dict, if the value associated with a factor X (key) is a dict instead of a list of levels, then the keys of this sub-dict are used as the levels of factor X, and the values of the sub-dict must themselves be dicts of factor / levels combinations, which are explored only for the corresponding level of factor X. This is a kind of recursive sub-sweep: it allows exploring some factor / level combinations only for some levels of a given factor.
The returned list contains execo_engine.sweep.HashableDict instances instead of plain dicts. HashableDict is a simple subclass of dict, so that parameter combinations can be used as dict keys (but don’t modify them in such cases).
Examples:
>>> sweep({
...     "param 1": ["a", "b"],
...     "param 2": [1, 2]
...     })
[{'param 1': 'a', 'param 2': 1},
 {'param 1': 'a', 'param 2': 2},
 {'param 1': 'b', 'param 2': 1},
 {'param 1': 'b', 'param 2': 2}]
>>> sweep({
...     "param 1": ["a", "b"],
...     "param 2": {
...         1: {
...             "param 1 1": [ "x", "y" ],
...             "param 1 2": [ 0.0, 1.0 ]
...             },
...         2: {
...             "param 2 1": [ -10, 10 ]
...             }
...         }
...     })
[{'param 1 2': 0.0, 'param 1 1': 'x', 'param 1': 'a', 'param 2': 1},
 {'param 1 2': 0.0, 'param 1 1': 'y', 'param 1': 'a', 'param 2': 1},
 {'param 1 2': 1.0, 'param 1 1': 'x', 'param 1': 'a', 'param 2': 1},
 {'param 1 2': 1.0, 'param 1 1': 'y', 'param 1': 'a', 'param 2': 1},
 {'param 2 1': -10, 'param 1': 'a', 'param 2': 2},
 {'param 2 1': 10, 'param 1': 'a', 'param 2': 2},
 {'param 1 2': 0.0, 'param 1 1': 'x', 'param 1': 'b', 'param 2': 1},
 {'param 1 2': 0.0, 'param 1 1': 'y', 'param 1': 'b', 'param 2': 1},
 {'param 1 2': 1.0, 'param 1 1': 'x', 'param 1': 'b', 'param 2': 1},
 {'param 1 2': 1.0, 'param 1 1': 'y', 'param 1': 'b', 'param 2': 1},
 {'param 2 1': -10, 'param 1': 'b', 'param 2': 2},
 {'param 2 1': 10, 'param 1': 'b', 'param 2': 2}]
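To make the recursive sub-sweep rule concrete, the following is a minimal standard-library sketch of the behaviour described above. It is not execo's implementation: the name sketch_sweep is hypothetical, it returns plain dicts rather than HashableDict instances, and the order of the generated combinations is an assumption.

```python
import itertools


def sketch_sweep(parameters):
    """Illustrative re-implementation of the documented behaviour (not execo's code)."""
    per_factor = []
    for factor, levels in parameters.items():
        if isinstance(levels, dict):
            # Sub-sweep: each key is a level of `factor`; its value describes
            # extra factors that are explored only for that level.
            choices = []
            for level, sub_parameters in levels.items():
                for sub_combination in sketch_sweep(sub_parameters):
                    choices.append({factor: level, **sub_combination})
        else:
            # Plain list of levels for this factor.
            choices = [{factor: level} for level in levels]
        per_factor.append(choices)
    combinations = []
    # Cartesian product of the per-factor choices, merged into one dict each.
    for parts in itertools.product(*per_factor):
        merged = {}
        for part in parts:
            merged.update(part)
        combinations.append(merged)
    return combinations


flat = sketch_sweep({"param 1": ["a", "b"], "param 2": [1, 2]})
nested = sketch_sweep({
    "param 1": ["a", "b"],
    "param 2": {
        1: {"param 1 1": ["x", "y"], "param 1 2": [0.0, 1.0]},
        2: {"param 2 1": [-10, 10]},
    },
})
print(len(flat), len(nested))
```

In the nested case, level 1 of "param 2" contributes 2 x 2 sub-combinations and level 2 contributes 2, so each level of "param 1" is paired with 6 combinations.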
ParamSweeper¶
-
class
execo_engine.sweep.
ParamSweeper
(persistence_dir, sweeps=None, save_sweeps=False, name=None)¶ Multi-process-safe, thread-safe and persistent iterable container to iterate over a list of experiment parameters (or whatever, actually).
The aim of this class is to provide a convenient way to iterate over several experiment configurations (or anything else). It is an iterable container with the following characteristics:
- each element of the iterable has four states:
  - todo
  - inprogress
  - done
  - skipped
- at the beginning, each element is in state todo
- client code can mark any element done or skipped
- when iterating over it, you always get the next item in todo state
- this container automatically persists the element states done and inprogress (but not the state skipped) to disk: if you later instantiate a container with the same persistence directory (whose path is given to the constructor), the elements of the container are taken from the constructor argument, but the states done and inprogress are loaded from the persisted state.
- this container is thread-safe and multi-process-safe. Multiple threads can concurrently use a single ParamSweeper object. Multiple threads or processes on the same or different hosts can concurrently use several ParamSweeper instances sharing the same persistence directory. With sufficiently recent Linux kernels and NFS servers / clients, it will work on shared NFS storage (the current implementation uses Python flock, which should work since kernel 2.6.12; see http://nfs.sourceforge.net/#faq_d10). All threads sharing a ParamSweeper instance synchronize through in-process locks, and all threads / processes with different instances of ParamSweeper sharing the same persistence directory synchronize through the persisted state.
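The state transitions listed above can be modeled with a small in-memory class. This is only a sketch under stated assumptions: SketchSweeper is a hypothetical name, it has no persistence and no cross-process locking, and it requires hashable elements. It also includes the cancel operation documented in the method list further down.

```python
import threading


class SketchSweeper:
    """In-memory model of the todo / inprogress / done / skipped states."""

    def __init__(self, sweeps):
        self._sweeps = list(sweeps)
        self._inprogress = set()
        self._done = set()
        self._skipped = set()
        self._lock = threading.Lock()  # protects the three sets

    def _remaining(self):
        # An element is todo when it is in none of the other three states.
        busy = self._inprogress | self._done | self._skipped
        return [s for s in self._sweeps if s not in busy]

    def get_next(self):
        with self._lock:
            remaining = self._remaining()
            if not remaining:
                return None
            self._inprogress.add(remaining[0])
            return remaining[0]

    def done(self, combination):
        with self._lock:
            self._inprogress.discard(combination)
            self._done.add(combination)

    def skip(self, combination):
        with self._lock:
            self._inprogress.discard(combination)
            self._skipped.add(combination)

    def cancel(self, combination):
        # Back to todo: the element will be handed out again by get_next.
        with self._lock:
            self._inprogress.discard(combination)


sweeper = SketchSweeper(["a", "b", "c"])
sweeper.done(sweeper.get_next())    # "a" is done
sweeper.skip(sweeper.get_next())    # "b" is skipped for this run
sweeper.cancel(sweeper.get_next())  # "c" goes back to todo
print(sweeper.get_next())
```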
This container is intended to be used in the following way: at the beginning of the experiment, you initialize a ParamSweeper with the list of experiment configurations (which can result from a call to execo_engine.sweep.sweep, but not necessarily) and a directory for the persistence. During execution, you request (possibly from several concurrent threads or processes) new experiment configurations with execo_engine.sweep.ParamSweeper.get_next, and mark them done or skipped with execo_engine.sweep.ParamSweeper.done and execo_engine.sweep.ParamSweeper.skip. At a later date, you can relaunch the same script: it will continue from where it left off, also retrying the skipped configurations. This works well when used with the execo_engine.engine.Engine startup option -c (continue experiment in a given directory).
execo_engine.sweep.ParamSweeper.skip is intended to be used when you got a combination with execo_engine.sweep.ParamSweeper.get_next but the processing of this combination has failed and you don’t want to retry it later in the same run of your script. However, the skipped state is not written to disk: if you relaunch your script later, the previously skipped combinations will be iterated over again.
This is intended for situations where some combinations cannot be processed because, for example, there are not enough available resources to process them. You may then want to relaunch your script later (after making sure more resources are available) to process these skipped combinations.
If there are some combinations that you never want to retry, even in subsequent runs of your script, then mark them as done with execo_engine.sweep.ParamSweeper.done.
If there is a combination that you failed to process but want to retry in the current run of your script, cancel it with execo_engine.sweep.ParamSweeper.cancel.
State inprogress is stored on disk to prevent concurrent processes from getting the same elements from different ParamSweeper instances (to avoid duplicating work). In some cases (for example if a process has crashed without marking an element done or skipped, or canceling it), you may want to reset the inprogress state. This can be done by removing the file inprogress in the persistence directory (this can even be done while some ParamSweeper instances exist and are using it).
ParamSweeper handles crashes in the following way: if it crashes (or is killed) while synchronizing to disk, in the worst case, the current element marked done can be lost (i.e. other ParamSweepers or later instantiations will not see it marked done), or the whole list of inprogress elements can be lost.
The ParamSweeper code assumes that in typical usage, there may be a huge number of elements to iterate on (and accordingly, the list of done elements will grow huge too), and that the number of inprogress elements will stay reasonably low. The whole iterable of elements is (optionally) written to disk only once, at ParamSweeper construction. The set of done elements can only grow and is incrementally appended. The set of inprogress elements is fully read from and written to disk at each operation, so this may become a bottleneck for ParamSweeper performance if the set of inprogress elements is big.
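The append-only handling of done elements can be illustrated with a short sketch. The exact on-disk layout used by execo is not described here, so the following is an assumption: one pickle record appended per done element, and an incremental reload that resumes from a remembered file offset. The helper names append_done and load_done are hypothetical, and the file locking mentioned above is deliberately left out.

```python
import os
import pickle
import tempfile


def append_done(path, element):
    """Append one pickled element at the end of the file (the file only grows)."""
    with open(path, "ab") as f:
        pickle.dump(element, f)


def load_done(path, offset=0):
    """Read every pickled element located after `offset`.

    Returns the elements read and the new offset, so that a later call
    only has to read what was appended in the meantime.
    """
    elements = []
    with open(path, "rb") as f:
        f.seek(offset)
        while True:
            try:
                elements.append(pickle.load(f))
            except EOFError:
                break
        return elements, f.tell()


done_path = os.path.join(tempfile.mkdtemp(), "done")
append_done(done_path, {"param 1": "a", "param 2": 1})
append_done(done_path, {"param 1": "a", "param 2": 2})
known, offset = load_done(done_path)             # full read
append_done(done_path, {"param 1": "b", "param 2": 1})
fresh, offset = load_done(done_path, offset)     # incremental read
print(len(known), len(fresh))
```

With this layout the cost of an incremental read depends on the number of newly appended elements only, which matches the performance assumption stated above for the done set.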
Parameters: - persistence_dir – path to the persistence directory. Two Python pickle files will be created in this directory: done and inprogress. These files can be erased if needed.
- sweeps – an iterable, what to iterate on. If None (default), try to load it from persistence_dir.
- save_sweeps – boolean, default False. If True, the sweeps are written to disk during initialization (this may take some time but occurs only once).
- name – a convenient name to identify an instance in logs. If None, one is computed from persistence_dir.
-
cancel
(combination)¶ Cancel processing of the given combination without marking it as skipped: it comes back in the todo queue.
-
cancel_batch
(combinations)¶ Cancel processing of the given combination(s) without marking them as skipped: they come back in the todo queue.
-
done
(combination)¶ mark the given element done
-
done_batch
(combinations)¶ mark the given element(s) done
-
full_update
()¶ Reload completely the ParamSweeper state from disk (may take some time).
-
get_done
()¶ returns an iterable of currently done elements
The returned iterable is a copy (safe to use without fearing concurrent mutations by another thread).
-
get_inprogress
()¶ returns an iterable of elements currently being processed (which were obtained by a call to execo_engine.sweep.ParamSweeper.get_next and are not yet marked done or skipped).
The returned iterable is a copy (safe to use without fearing concurrent mutations by another thread).
-
get_next
(filtr=None)¶ Return the next element which is todo. Returns None if reached end.
Parameters: filtr – a filter function. If not None, this filter takes the iterable of remaining elements and returns a filtered iterable. It can be used to filter out some combinations and / or control the order of iteration.
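As an illustration of the filter contract (it takes the iterable of remaining elements and returns a filtered iterable), here is a hypothetical filter function. The "nodes" and "size" factors and the threshold of 8 are invented for the example; the function is exercised on a plain list so that the sketch runs without a ParamSweeper.

```python
def small_runs_first(remaining):
    """Hypothetical filter: drop combinations needing more than 8 nodes,
    and hand out the smallest ones first."""
    eligible = [comb for comb in remaining if comb["nodes"] <= 8]
    return sorted(eligible, key=lambda comb: comb["nodes"])


remaining = [
    {"nodes": 16, "size": "big"},
    {"nodes": 4, "size": "small"},
    {"nodes": 8, "size": "medium"},
]
ordered = small_runs_first(remaining)
print([comb["nodes"] for comb in ordered])
```

Such a function would then be passed as the filtr argument, for example get_next(filtr=small_runs_first). Combinations filtered out this way are not marked skipped by the sketch; they simply are not offered for that call.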
-
get_next_batch
(num_combs, filtr=None)¶ Return the next elements which are todo.
Parameters: - num_combs – how many combinations to get. An array of combinations is returned. Its size is <= num_combs and is limited by the number of remaining combinations.
- filtr – a filter function. If not None, this filter takes the iterable of remaining elements and returns a filtered iterable. It can be used to filter out some combinations and / or control the order of iteration.
-
get_remaining
()¶ returns an iterable of current remaining todo elements
The returned iterable is a copy (safe to use without fearing concurrent mutations by another thread).
-
get_skipped
()¶ returns an iterable of current skipped elements
The returned iterable is a copy (safe to use without fearing concurrent mutations by another thread).
-
get_sweeps
()¶ Returns the iterable of what to iterate on
-
reset
(reset_inprogress=False)¶ Reset the container: iteration will start from the beginning, skipped states are forgotten, done states are not forgotten.
Parameters: reset_inprogress – default False. If True, state inprogress is also reset.
-
set_sweeps
(sweeps=None, save_sweeps=False)¶ Change the list of what to iterate on.
Parameters: - sweeps – iterable
- save_sweeps – boolean. default False. If True, the sweeps are written to disk.
-
skip
(combination)¶ mark the given element skipped
-
skip_batch
(combinations)¶ mark the given element(s) skipped
-
stats
()¶ Atomically return the tuple (sweeps, remaining, skipped, inprogress, done)
-
update
()¶ Incrementally update the ParamSweeper state from disk.
This is fast, except if the done file has been truncated or deleted. In this case, a full reload is triggered.
sweep_stats¶
-
execo_engine.sweep.
sweep_stats
(stats)¶ Takes the stats tuple returned by execo_engine.sweep.ParamSweeper.stats and, if the ParamSweeper sweeps are in the format output by execo_engine.sweep.sweep, returns a dict detailing the number and ratios of remaining, skipped, done and inprogress combinations per combination parameter value.
Engine¶
The execo_engine.engine.Engine class hierarchy is the base for reusable experiment engines: the class execo_engine.engine.Engine is the base class for classes which act as experiment engines.
Engine¶
-
class
execo_engine.engine.
Engine
¶ Bases:
object
Basic class for execo Engine.
Subclass it to develop your own engines, possibly reusable.
This class offers basic facilities:
- central handling of options and arguments
- automatic experiment directory creation
- various ways to handle stdout / stderr
- support for continuing a previously stopped experiment
- log level selection
A subclass of Engine can access the following member variables, which are automatically defined and initialized at the right time by the base class execo_engine.engine.Engine:
- execo_engine.engine.Engine.engine_dir
- execo_engine.engine.Engine.result_dir
- execo_engine.engine.Engine.args_parser
- execo_engine.engine.Engine.args
- execo_engine.engine.Engine.run_name
A subclass of Engine can override the following methods:
- execo_engine.engine.Engine.init
- execo_engine.engine.Engine.run
- execo_engine.engine.Engine.setup_run_name
- execo_engine.engine.Engine.setup_result_dir
A typical, non-reusable engine would start by adding options / arguments to execo_engine.engine.Engine.args_parser in __init__(), then override execo_engine.engine.Engine.init to perform further initialization if needed. It would then implement all the experiment code by overriding execo_engine.engine.Engine.run. This ensures that all initialization steps are performed by the engine before the experiment runs: the results directory is initialized and created (possibly reusing a previous results directory, to restart from a previously stopped experiment), the log level is set, stdout / stderr are redirected as needed, and options and arguments are in execo_engine.engine.Engine.args.
Example engine with custom command-line arguments:
    import execo_engine

    class MyEngine(execo_engine.Engine):

        def __init__(self):
            super(MyEngine, self).__init__()
            # register a custom command-line option on the engine's parser
            self.args_parser.add_argument('--myoption', default='foo',
                                          help='An option to control how the experiment is done')

        def run(self):
            # parsed options are available in self.args once run() is called
            if self.args.myoption == "foo":
                ...
A typical usage of an execo_engine.sweep.ParamSweeper in an engine would be to initialize an instance at the beginning of execo_engine.engine.Engine.run, using a persistence directory inside the results directory. Example code:
    def run(self):
        sweeps = sweep({<parameters and their values>})
        # persistence_dir is the first constructor argument, sweeps the second
        sweeper = ParamSweeper(os.path.join(self.result_dir, "sweeps"), sweeps)
        [...]
-
args
= None¶ Arguments and options given on the command line. Available after the command line has been parsed, in execo_engine.engine.Engine.run (not in execo_engine.engine.Engine.init).
-
args_parser
= None¶ Subclasses of execo_engine.engine.Engine can register options and args to this options parser in __init__().
-
engine_dir
= None¶ Full path of the engine directory, if available. May not be available if engine code is run interactively.
-
init
()¶ Experiment init method
Override this method with the experiment init code. Default implementation does nothing.
The base class execo_engine.engine.Engine takes care that all execo_engine.engine.Engine.init methods of its subclass hierarchy are called, ancestor methods before subclass methods. This order is chosen so that generic engines inheriting from execo_engine.engine.Engine can easily implement common functionalities. For example, a generic engine can declare its own options and arguments in init, which will be executed before the init method of a particular experiment subclass.
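The ancestor-before-subclass calling order can be obtained in several ways. The sketch below shows one possible mechanism, walking the method resolution order from the most generic class to the most specific one. It is an assumption for illustration only, not necessarily how execo does it, and SketchEngine, GenericEngine and MyExperiment are hypothetical names.

```python
class SketchEngine:
    """Stand-in base class; not execo's Engine."""

    def init(self):
        pass  # default implementation does nothing

    def start(self):
        # Walk the class hierarchy from the most generic ancestor down to
        # the concrete class, calling each init defined along the way.
        for cls in reversed(type(self).__mro__):
            method = cls.__dict__.get("init")
            if method is not None:
                method(self)


calls = []


class GenericEngine(SketchEngine):
    def init(self):
        calls.append("generic")


class MyExperiment(GenericEngine):
    def init(self):
        calls.append("experiment")


MyExperiment().start()
print(calls)
```

Note that the subclasses do not call super().init() themselves: in this sketch the base class is responsible for invoking every level of the hierarchy exactly once.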
-
result_dir
= None¶ Full path to the current engine’s execution results directory, where results should be written, where stdout / stderr are output, where execo_engine.sweep.ParamSweeper persistence files should be written, and more generally where any file pertaining to a particular execution of the experiment should be located.
-
run
()¶ Experiment run method
Override this method with the experiment code. Default implementation does nothing.
The base class execo_engine.engine.Engine takes care that all execo_engine.engine.Engine.run methods of its subclass hierarchy are called, ancestor methods before subclass methods. This order is chosen so that generic engines inheriting from execo_engine.engine.Engine can easily implement common functionalities. For example, a generic engine can prepare an experiment environment in its run method, which will be executed before the run method of a particular experiment subclass.
-
run_name
= None¶ Name of the current experiment. If you want to modify it, override
execo_engine.engine.Engine.setup_run_name
-
setup_result_dir
()¶ Set the experiment run result directory name.
Default implementation: a subdirectory named after the experiment run name, in the current directory. Override this method to change the name of the result directory. This method is called before execo_engine.engine.Engine.run and execo_engine.engine.Engine.init. Note that if option -c is given to the engine, the name given on the command line takes precedence, and this method won’t be called.
-
setup_run_name
()¶ Set the experiment run name.
Default implementation: concatenation of class name and date. Override this method to change the name of the experiment. This method is called before execo_engine.engine.Engine.run and execo_engine.engine.Engine.init.
-
start
(engineargs=['build_doc'])¶ Start the engine.
Properly initialize the experiment Engine instance, then call the init() method of all subclasses, then pass the control to the overridden run() method of the requested experiment Engine.
Misc¶
slugify¶
-
execo_engine.utils.
slugify
(value)¶ Normalizes string representation, converts to lowercase, removes non-alpha characters, and converts spaces to hyphens.
Intended to convert any object having a relevant string representation to a valid filename.
More or less inspired by / copy-pasted from Django (see http://stackoverflow.com/questions/295135/turn-a-string-into-a-valid-filename-in-python).
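The transformation described above can be sketched with the standard library, in the style of the Django recipe referenced by the link. This is an approximation and not execo's code: sketch_slugify is a hypothetical name, and details such as the exact set of characters kept are assumptions.

```python
import re
import unicodedata


def sketch_slugify(value):
    """Turn the string representation of any object into a filename-friendly slug."""
    # Normalize and drop characters that have no ASCII equivalent.
    text = unicodedata.normalize("NFKD", str(value))
    text = text.encode("ascii", "ignore").decode("ascii")
    # Remove everything except word characters, whitespace and hyphens.
    text = re.sub(r"[^\w\s-]", "", text).strip().lower()
    # Collapse runs of whitespace / hyphens into a single hyphen.
    return re.sub(r"[-\s]+", "-", text)


print(sketch_slugify("Hello World!"))
print(sketch_slugify({"a": 1}))
```

Applied to a parameter combination, this kind of function yields a name that can be used for a per-combination result file or directory.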
logger¶
Default and convenient logger for engines. Inherits its properties from the execo logger.
-
execo_engine.log.
logger
¶
HashableDict¶
-
class
execo_engine.sweep.
HashableDict
¶ Hashable dictionary. Beware: it must not be mutated after its first use as a key.
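One simple way to make a dict subclass hashable is to hash a frozen view of its items. The sketch below illustrates the idea and why mutation after use as a key is dangerous. It is not execo's implementation: SketchHashableDict is a hypothetical name, and it assumes all values are themselves hashable.

```python
class SketchHashableDict(dict):
    """dict subclass usable as a dict key (assumes hashable values)."""

    def __hash__(self):
        # The hash depends on the current items, so mutating the dict after
        # it has been used as a key would make it unreachable in the
        # containing dict or set.
        return hash(frozenset(self.items()))


combination = SketchHashableDict({"param 1": "a", "param 2": 1})
results = {combination: 0.5}

# An equal combination, built in a different order, finds the same entry.
same = SketchHashableDict({"param 2": 1, "param 1": "a"})
print(results[same])
```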