alp.appcom package

A simple module to perform training and prediction of models

Using Celery, this module schedules the training of models when users submit many models in a short period of time.

class Experiment(model=None, metrics=None, verbose=0)[source]

Bases: object

An Experiment trains, predicts, saves and logs a model

Variables:
  • model (model) – the model used in the experiment
  • metrics (list) – a list of callables
data_id
fit(data, data_val, model=None, *args, **kwargs)[source]

Build and fit a model given data and hyperparameters

Parameters:
  • data (list(dict)) – a list of dictionaries mapping input and output names to numpy arrays for training.
  • data_val (list(dict)) – a list of dictionaries mapping input and output names to numpy arrays for validation.
  • model (model, optional) – a model from a supported backend
Returns:

the id of the model in the db, the id of the data in the db and a path to the parameters.

fit_async(data, data_val, model=None, *args, **kwargs)[source]

Build and fit a model asynchronously given data and hyperparameters

Parameters:
  • data (list(dict)) – a list of dictionaries mapping input and output names to numpy arrays for training.
  • data_val (list(dict)) – a list of dictionaries mapping input and output names to numpy arrays for validation.
  • model (model, optional) – a model from a supported backend
Returns:

the id of the model in the db, the id of the data in the db and a path to the parameters.

fit_gen(gen_train, data_val, model=None, *args, **kwargs)[source]

Build and fit a model given generator(s) and hyperparameters

Parameters:
  • gen_train (list(dict)) – a list of generators.
  • data_val (list(dict)) – a list of dictionaries mapping input and output names to numpy arrays or generators for validation.
  • model (model, optional) – a model from a supported backend
Returns:

the id of the model in the db, the id of the data in the db and a path to the parameters.

fit_gen_async(gen_train, data_val, model=None, *args, **kwargs)[source]

Build and fit a model asynchronously given generator(s) and hyperparameters.

Parameters:
  • gen_train (list(dict)) – a list of generators.
  • data_val (list(dict)) – a list of dictionaries mapping input and output names to numpy arrays or generators for validation.
  • model (model, optional) – a model from a supported backend
Returns:

the id of the model in the db, the id of the data in the db and a path to the parameters.

load_model(mod_id=None, data_id=None)[source]

Load a model from the database from its mod_id and data_id

Parameters:
  • mod_id (str) – the id of the model in the database
  • data_id (str) – the id of the data in the database
mod_id
model_dict
params_dump
predict(data, *args, **kwargs)[source]

Make predictions given data

Parameters: data (np.array)
Returns: an np.array of predictions
predict_async(data, *args, **kwargs)[source]

Make predictions asynchronously given data

Parameters: data (np.array)
Returns: an np.array of predictions

Ensembles module

class Ensemble(experiments)[source]

Bases: object

Base class for experiment containers able to execute batch sequences of actions. Subclasses must implement the fit, fit_gen, fit_async and fit_gen_async methods.

Parameters: experiments (dict or list) – experiments to be wrapped. If a dictionary is passed, it should map experiment names to experiments.
fit(data, data_val, *args, **kwargs)[source]
fit_async(data, data_val, *args, **kwargs)[source]
fit_gen(data, data_val, *args, **kwargs)[source]
fit_gen_async(data, data_val, *args, **kwargs)[source]
plt_summary()[source]
predict(data, data_val, *args, **kwargs)[source]
summary(metrics, verbose=False)[source]
class HParamsSearch(experiments, hyperparams=None, metric=None, op=None)[source]

Bases: alp.appcom.ensembles.Ensemble

Hyper parameters search class

Train several experiments with different hyperparameters and save results. Wraps the training process so that it’s possible to access results easily.

Parameters:
  • experiments (dict or list) – experiments to be wrapped. If a dictionary is passed, it should map experiment names to experiments
  • hyperparams (dict) – a dict of hyperparameters
  • metric (str) – the name of a metric used in the experiments
  • op (str) – an operator to select a model
fit(data, data_val, *args, **kwargs)[source]

Apply the fit method to all the experiments

Parameters: see `alp.appcom.core.Experiment.fit`
Returns: a list of results
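
The idea of applying one method to every wrapped experiment and collecting the results can be sketched as follows. This is only an illustration and not ALP's implementation: `DummyExperiment` and `apply_to_all` are hypothetical names, the "X" and "y" keys are assumed input and output names, and the returned tuple merely imitates the (model id, data id, parameters path) result described for Experiment.fit.

```python
class DummyExperiment:
    """Hypothetical stand-in for an Experiment; it trains nothing."""

    def __init__(self, name):
        self.name = name

    def fit(self, data, data_val):
        # Imitate the documented return value: model id, data id, params path.
        return ("mod-" + self.name, "data-0", "/tmp/" + self.name + ".params")


def apply_to_all(experiments, method_name, *args, **kwargs):
    """Call the named method on every experiment and collect the results."""
    # Accept either a list of experiments or a dict mapping names to experiments.
    if isinstance(experiments, dict):
        experiments = list(experiments.values())
    return [getattr(expe, method_name)(*args, **kwargs) for expe in experiments]


# Placeholder data: empty lists stand in for numpy arrays.
data = [{"X": [], "y": []}]
data_val = [{"X": [], "y": []}]

experiments = {"small": DummyExperiment("small"), "large": DummyExperiment("large")}
results = apply_to_all(experiments, "fit", data, data_val)
```

Each entry of `results` corresponds to one wrapped experiment, in the order the experiments were given.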
fit_async(data, data_val, *args, **kwargs)[source]

Apply the fit_async method to all the experiments

Parameters: see `alp.appcom.core.Experiment.fit_async`
Returns: a list of results
fit_gen(data, data_val, *args, **kwargs)[source]

Apply the fit_gen method to all the experiments

Parameters: see `alp.appcom.core.Experiment.fit_gen`
Returns: a list of results
fit_gen_async(data, data_val, *args, **kwargs)[source]

Apply the fit_gen_async method to all the experiments

Parameters: see `alp.appcom.core.Experiment.fit_gen_async`
Returns: a list of results
predict(data, metric=None, op=None, partial=False, *args, **kwargs)[source]

Apply the predict method to all the experiments

Parameters:
  • see `alp.appcom.core.Experiment.predict`
  • metric (str) – the name of the metric to use
  • op (function) – an operator returning the value to select an experiment
Returns:

an array of results

summary(metrics, verbose=False)[source]

Build a results table using individual results from models

Parameters:
  • verbose (bool) – if True, print a description of the results
  • metrics (dict) – a dictionary mapping metric names to ops.
Returns:

a pandas DataFrame of results

get_best(experiments, metric, op, partial=False)[source]

Helper function for manipulation of a list of experiments

In case of equality in the metric, the behaviour of op_arg determines the result.

Parameters:
  • experiments (list) – a list of experiments
  • metric (str) – the name of a metric used in the experiments
  • op (function) – operation to perform with the metric (optional)
  • partial (bool) – if True, skip experiments without results; otherwise raise an error.
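
A minimal sketch of this kind of selection, assuming each experiment's results are available as a plain dict. The `pick_best` name and the `{"name": ..., "metrics": ...}` layout are hypothetical and chosen only for illustration; the way ties are broken here (first match wins) is likewise an assumption.

```python
def pick_best(results, metric, op=max, partial=False):
    """Return the name of the entry whose metric value is selected by op.

    results is a hypothetical list of dicts such as
    {"name": "a", "metrics": {"loss": 0.3}}; an entry whose "metrics"
    is None stands for an experiment without result.
    """
    scored = []
    for res in results:
        if res["metrics"] is None:
            if partial:
                continue  # skip experiments that have no result
            raise ValueError("experiment %r has no result" % res["name"])
        scored.append((res["name"], res["metrics"][metric]))
    best_value = op(value for _, value in scored)
    # On ties, the first matching experiment wins in this sketch.
    for name, value in scored:
        if value == best_value:
            return name


runs = [
    {"name": "a", "metrics": {"loss": 0.3}},
    {"name": "b", "metrics": {"loss": 0.1}},
    {"name": "c", "metrics": None},
]
best = pick_best(runs, "loss", op=min, partial=True)
```

With `partial=False`, the same call raises a ValueError because experiment "c" has no result.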

Utility functions for the appcom module

background(f)[source]

A threading decorator. Use @background above the function you want to run in the background.
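
A threading decorator of this kind is commonly written as below. This is a sketch built on the standard library and not necessarily how ALP implements it; in particular, returning the Thread object so that the caller can join it is an assumption.

```python
import threading
from functools import wraps


def background(f):
    """Run the decorated function in a new thread and return that thread."""
    @wraps(f)
    def wrapper(*args, **kwargs):
        thread = threading.Thread(target=f, args=args, kwargs=kwargs)
        thread.start()
        return thread
    return wrapper


results = []


@background
def record(value):
    results.append(value)


worker = record(42)
worker.join()  # wait for the background call to finish
```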

check_gen(iterable)[source]

Check if the last object of the iterable is a generator

Parameters: iterable (list) – a list containing data.
Returns: True if the last object is a generator, False otherwise.
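
Such a check could look like the sketch below, which relies on `inspect.isgenerator` from the standard library. The `last_is_generator` name is hypothetical, and treating an empty list as False is an assumption.

```python
import inspect


def last_is_generator(iterable):
    """Return True if the last element of a list is a generator object."""
    return len(iterable) > 0 and inspect.isgenerator(iterable[-1])


with_gen = [{"X": [1, 2]}, (i for i in range(3))]
without_gen = [{"X": [1, 2]}, {"X": [3, 4]}]
```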
get_nb_chunks(generator)[source]

Get the number of chunks that a generator yields

Parameters: generator – a Fuel generator
Returns: number of chunks (int)
imports(packages=None)[source]

A decorator to import packages only once when a function is serialized

Parameters: packages (list or dict) – a list or dict of packages to import. If the object is a dict, each key is the name of the import and each value is the module. If the object is a list, it is transformed to a dict mapping the name of each module to the imported module.
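
One way such a decorator could work is sketched below. It is an assumption-laden illustration, not ALP's code: here the dict values are module names given as strings, and the modules are injected into the decorated function's globals the first time they are missing.

```python
import importlib
from functools import wraps


def imports(packages=None):
    """Make the listed modules available to the decorated function."""
    if packages is None:
        packages = {}
    elif isinstance(packages, list):
        # A list is turned into a dict mapping each name to itself.
        packages = {name: name for name in packages}

    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for alias, module_name in packages.items():
                # Import only if the alias is not already available.
                if alias not in func.__globals__:
                    func.__globals__[alias] = importlib.import_module(module_name)
            return func(*args, **kwargs)
        return wrapper
    return decorator


@imports({"m": "math"})
def circle_area(radius):
    return m.pi * radius ** 2


@imports(["json"])
def encode(obj):
    return json.dumps(obj)


area = circle_area(1.0)
encoded = encode([1, 2])
```

Because the import happens inside the wrapper, the function can be serialized and sent to a worker that has not imported these modules yet.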
init_backend(model)[source]

Initialization of the backend

Parameters: backend (str) – only ‘keras’ or ‘sklearn’ at the moment
Returns: the backend, the backend name and the backend version
list_to_dict(list_to_transform)[source]

Transform a list of objects to a dict

Parameters: list_to_transform (list) – the list to transform
Returns: a dictionary mapping names of the objects to objects
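
As a rough illustration, a name-to-object mapping can be built with a dict comprehension. How ALP derives the name of each object is not stated here, so the sketch assumes the `__name__` attribute and falls back to the type name; `objects_by_name` is a hypothetical helper.

```python
def objects_by_name(objects):
    """Map each object's name to the object itself."""
    return {getattr(obj, "__name__", type(obj).__name__): obj for obj in objects}


mapping = objects_by_name([len, sorted])
```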
max_v_len(iterable_to_check)[source]

Returns the max length of a list of iterables

norm_iterator(iterable)[source]

Returns a normalized iterable of tuples

pickle_gen(gen_train, data_val)[source]

Check and serialize the validation data object and serialize the training data generator.

Parameters:
  • gen_train (generator) – the training data generator
  • data_val (dict or generator) – the validation data object
Returns:

normalized datasets

switch_backend(backend_name)[source]

Switch the backend based on its name

Parameters: backend_name (str) – the name of the backend to import
Returns: the requested backend
to_fuel_h5(inputs, outputs, slices, names, file_name, file_path='')[source]

Transforms list of numpy arrays to a structured hdf5 file

Parameters:
  • inputs (list) – a list of inputs (numpy arrays)
  • outputs (list) – a list of outputs (numpy arrays)
  • slices (list) – a list of ints, each marking the end of one slice and the beginning of the next. The last slice is automatically added if missing (maximum length of the inputs).
  • names (list) – a list of names for the datasets
  • file_name (str) – the name of the file to save.
  • file_path (str) – the path where the file is located
Returns:

The file full path
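
The description of slices can be read as a list of boundaries that is turned into (start, stop) pairs. The sketch below shows that reading only; `slice_bounds` is a hypothetical helper, it does not write any HDF5 file, and the way ALP handles the boundaries internally may differ.

```python
def slice_bounds(slices, total_len):
    """Turn slice end points into (start, stop) pairs covering the data."""
    ends = list(slices)
    # Add the last slice if it is missing, up to the full length of the inputs.
    if not ends or ends[-1] != total_len:
        ends.append(total_len)
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))


bounds = slice_bounds([100, 150], total_len=200)
```

Here `bounds` is `[(0, 100), (100, 150), (150, 200)]`, which could correspond to three named datasets, for instance train, validation and test.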

window(seq, n=2)[source]

Returns a sliding window (of width n) over data from the iterable
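
A sliding window over an iterable is often written with `itertools.islice`, as in the sketch below. It follows the classic itertools recipe and is not necessarily identical to ALP's implementation.

```python
from itertools import islice


def window(seq, n=2):
    """Yield successive overlapping tuples of width n from seq."""
    it = iter(seq)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result
    for elem in it:
        # Drop the oldest element and append the new one.
        result = result[1:] + (elem,)
        yield result


pairs = list(window([1, 2, 3, 4], n=2))
```

In this sketch, a sequence shorter than n yields no window at all.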