alp.appcom package¶

A simple module to perform training and prediction of models¶

Using Celery, this module schedules the training of models when users submit many models in a short period of time.

class Experiment(model=None, metrics=None, verbose=0)[source]¶

Bases: object

An Experiment trains, predicts, saves and logs a model

Variables:

model (model) – the model used in the experiment
metrics (list) – a list of callables

data_id¶

fit(data, data_val, model=None, *args, **kwargs)[source]¶

Build and fit a model given data and hyperparameters

Parameters:

data (list(dict)) – a list of dictionaries mapping input and output names to numpy arrays for training.
data_val (list(dict)) – a list of dictionaries mapping input and output names to numpy arrays for validation.
model (model, optional) – a model from a supported backend

Returns:

the id of the model in the db, the id of the data in the db and the path to the parameters.

fit_async(data, data_val, model=None, *args, **kwargs)[source]¶

Build and fit a model asynchronously given data and hyperparameters

Parameters:

data (list(dict)) – a list of dictionaries mapping input and output names to numpy arrays for training.
data_val (list(dict)) – a list of dictionaries mapping input and output names to numpy arrays for validation.
model (model, optional) – a model from a supported backend

Returns:

the id of the model in the db, the id of the data in the db and a path to the parameters.

fit_gen(gen_train, data_val, model=None, *args, **kwargs)[source]¶

Build and fit a model given generator(s) and hyperparameters

Parameters:

gen_train (list(dict)) – a list of generators.
data_val (list(dict)) – a list of dictionaries mapping input and output names to numpy arrays or generators for validation.
model (model, optional) – a model from a supported backend

Returns:

the id of the model in the db, the id of the data in the db and a path to the parameters.

fit_gen_async(gen_train, data_val, model=None, *args, **kwargs)[source]¶

Build and fit a model asynchronously given generator(s) and hyperparameters.

Parameters:

gen_train (list(dict)) – a list of generators.
data_val (list(dict)) – a list of dictionaries mapping input and output names to numpy arrays or generators for validation.
model (model, optional) – a model from a supported backend

Returns:

the id of the model in the db, the id of the data in the db and a path to the parameters.

load_model(mod_id=None, data_id=None)[source]¶

Load a model from the database from its mod_id and data_id

Parameters:

mod_id (str) – the id of the model in the database
data_id (str) – the id of the data in the database

mod_id¶

model_dict¶

params_dump¶

predict(data, *args, **kwargs)[source]¶

Make predictions given data

Parameters:	data (np.array)
Returns:	an np.array of predictions

predict_async(data, *args, **kwargs)[source]¶

Make predictions asynchronously given data

Parameters:	data (np.array)
Returns:	an np.array of predictions

Ensembles module¶

class Ensemble(experiments)[source]¶

Bases: object

Base class for experiment containers able to execute batch sequences of actions. Subclasses must implement the fit, fit_gen, fit_async and fit_gen_async methods.

Parameters:	experiments (dict or list) – experiments to be wrapped. If a dictionary is passed, it should map experiment names to experiments.

fit(data, data_val, *args, **kwargs)[source]¶

fit_async(data, data_val, *args, **kwargs)[source]¶

fit_gen(data, data_val, *args, **kwargs)[source]¶

fit_gen_async(data, data_val, *args, **kwargs)[source]¶

plt_summary()[source]¶

predict(data, data_val, *args, **kwargs)[source]¶

summary(metrics, verbose=False)[source]¶

class HParamsSearch(experiments, hyperparams=None, metric=None, op=None)[source]¶

Bases: alp.appcom.ensembles.Ensemble

Hyper parameters search class

Train several experiments with different hyperparameters and save results. Wraps the training process so that it’s possible to access results easily.

Parameters:

experiments (dict or list) – experiments to be wrapped. If a dictionary is passed, it should map experiment names to experiments
hyperparams (dict) – a dict of hyperparameters
metric (str) – the name of a metric used in the experiments
op (str) – an operator to select a model

fit(data, data_val, *args, **kwargs)[source]¶

Apply the fit method to all the experiments

Parameters:	see :meth:`alp.appcom.core.Experiment.fit`
Returns:	a list of results
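
The "apply to all experiments" pattern can be sketched as follows. This is only an illustration under stated assumptions: `StubExperiment` and `fit_all` are hypothetical names, the stub is not the real Experiment class, and the shape of each result is invented for the example.

```python
class StubExperiment:
    """Hypothetical stand-in for an Experiment; not the real class."""

    def __init__(self, name):
        self.name = name

    def fit(self, data, data_val, *args, **kwargs):
        # The real method trains a model; this stub only reports sizes.
        return {"experiment": self.name, "n_train": len(data)}


def fit_all(experiments, data, data_val, *args, **kwargs):
    """Call fit on every wrapped experiment and collect the results."""
    if isinstance(experiments, dict):
        # A dict maps experiment names to experiments; keep the values.
        experiments = list(experiments.values())
    return [exp.fit(data, data_val, *args, **kwargs) for exp in experiments]


results = fit_all(
    {"small": StubExperiment("small"), "large": StubExperiment("large")},
    data=[1, 2, 3],
    data_val=[4],
)
```

The same loop applies to fit_async, fit_gen and fit_gen_async by swapping the method that is called.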

fit_async(data, data_val, *args, **kwargs)[source]¶

Apply the fit_async method to all the experiments

Parameters:	see :meth:`alp.appcom.core.Experiment.fit_async`
Returns:	a list of results

fit_gen(data, data_val, *args, **kwargs)[source]¶

Apply the fit_gen method to all the experiments

Parameters:	see :meth:`alp.appcom.core.Experiment.fit_gen`
Returns:	a list of results

fit_gen_async(data, data_val, *args, **kwargs)[source]¶

Apply the fit_gen_async method to all the experiments

Parameters:	see :meth:`alp.appcom.core.Experiment.fit_gen_async`
Returns:	a list of results

predict(data, metric=None, op=None, partial=False, *args, **kwargs)[source]¶

Apply the predict method to all the experiments

Parameters:

see :meth:`alp.appcom.core.Experiment.predict`
metric (str) – the name of the metric to use
op (function) – an operator returning the value to select an experiment

Returns:

an array of results

summary(metrics, verbose=False)[source]¶

Build a results table using individual results from models

Parameters:

verbose (bool) – if True, print a description of the results
metrics (dict) – a dictionary mapping metric names to ops.

Returns:

a pandas DataFrame of results

get_best(experiments, metric, op, partial=False)[source]¶

Helper function for manipulation of a list of experiments

In case of equality in the metric, the behaviour of op_arg determines the result.

Parameters:

experiments (list) – a list of experiments
metric (str) – the name of a metric used in the experiments
op (function) – operation to perform with the metric (optional)
partial (bool) – if True, skip experiments that have no result; otherwise raise an error.
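
A minimal sketch of this kind of selection, under assumptions: each experiment is represented by a plain dict with a hypothetical `metrics` key (the real Experiment objects store their results differently), `op` is a built-in such as `min` or `max`, and ties go to the first matching experiment. The name `get_best_sketch` is illustrative only.

```python
def get_best_sketch(experiments, metric, op=min, partial=False):
    """Return the experiment whose `metric` value is selected by `op`."""
    scored = []
    for exp in experiments:
        value = exp.get("metrics", {}).get(metric)
        if value is None:
            if partial:
                continue  # skip experiments without a result
            raise ValueError("experiment without a result for %r" % metric)
        scored.append((value, exp))
    if not scored:
        raise ValueError("no experiment has a result for %r" % metric)
    best_value = op(value for value, _ in scored)
    # Assumption of this sketch: the first experiment with the best value wins.
    return next(exp for value, exp in scored if value == best_value)


candidates = [
    {"name": "a", "metrics": {"loss": 0.3}},
    {"name": "b", "metrics": {"loss": 0.1}},
    {"name": "c", "metrics": {}},  # not trained yet
]
best = get_best_sketch(candidates, "loss", op=min, partial=True)
```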

Utility functions for the appcom module¶

background(f)[source]¶

A threading decorator. Use @background above the function you want to run in the background.
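
A threading decorator of this kind could look like the sketch below. It is not the library's source: returning the thread object so that callers can `join()` it is an assumption of this example, and `record` is a hypothetical function.

```python
import threading
from functools import wraps


def background(f):
    """Run the decorated function in a separate thread (sketch)."""
    @wraps(f)
    def wrapper(*args, **kwargs):
        thread = threading.Thread(target=f, args=args, kwargs=kwargs)
        thread.start()
        return thread  # assumption: lets the caller wait with join()
    return wrapper


results = []


@background
def record(value):
    results.append(value)


t = record(42)
t.join()  # wait for the background call to finish
```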

check_gen(iterable)[source]¶

Check if the last object of the iterable is an iterator

Parameters:	iterable (list) – a list containing data.
Returns:	True if the last object is a generator, False otherwise.
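
The check can be sketched with the standard library. This assumes the last element is a plain Python generator object; the actual function may accept other iterator types, and `check_gen_sketch` and `batches` are hypothetical names.

```python
import inspect


def check_gen_sketch(iterable):
    """Return True if the last element of `iterable` is a generator."""
    return len(iterable) > 0 and inspect.isgenerator(iterable[-1])


def batches():
    # Hypothetical data generator yielding one batch.
    yield {"X": [1, 2]}


mixed = [{"X": [1, 2]}, batches()]
arrays_only = [{"X": [1, 2]}]
```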

get_nb_chunks(generator)[source]¶

Get the number of chunks that a generator yields

Parameters:	generator – a Fuel generator
Returns:	number of chunks (int)

imports(packages=None)[source]¶

A decorator to import packages only once when a function is serialized

Parameters:	packages (list or dict) – a list or dict of packages to import. If the object is a dict, the name of the import is the key and the value is the module. If the object is a list, it’s transformed to a dict mapping the name of the module to the imported module.
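
One way such a decorator could work is sketched below. The mechanism is an assumption, not the library's implementation: modules are injected into the function's globals on first call, and dict values may be either module names or module objects. `root` is a hypothetical example function.

```python
import importlib
from functools import wraps


def imports(packages=None):
    """Make the listed packages available to the decorated function (sketch)."""
    if packages is None:
        packages = {}
    elif isinstance(packages, list):
        # A list is turned into a dict mapping each name to itself.
        packages = {name: name for name in packages}

    def decorator(f):
        @wraps(f)
        def wrapper(*args, **kwargs):
            for alias, target in packages.items():
                if alias not in f.__globals__:  # import only once
                    module = (importlib.import_module(target)
                              if isinstance(target, str) else target)
                    f.__globals__[alias] = module
            return f(*args, **kwargs)
        return wrapper
    return decorator


@imports({"m": "math"})
def root(x):
    return m.sqrt(x)  # `m` is injected by the decorator
```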

init_backend(model)[source]¶

Initialization of the backend

Parameters:	backend (str) – only ‘keras’ or ‘sklearn’ at the moment
Returns:	the backend, the backend name and the backend version

list_to_dict(list_to_transform)[source]¶

Transform a list of objects to a dict

Parameters:	list_to_transform (list) – the list to transform
Returns:	a dictionary mapping names of the objects to objects
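
As an illustration only, the mapping could be built as follows. Taking the name from each object's `__name__` attribute is an assumption of this sketch; the library may derive names differently.

```python
def list_to_dict_sketch(list_to_transform):
    """Map each object's name to the object itself (sketch)."""
    return {obj.__name__: obj for obj in list_to_transform}


by_name = list_to_dict_sketch([len, abs])
```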

max_v_len(iterable_to_check)[source]¶

Returns the max length of a list of iterables
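
A one-line sketch of the idea, assuming every element supports `len()` (generators would not); `max_v_len_sketch` is a hypothetical name.

```python
def max_v_len_sketch(iterable_to_check):
    """Return the length of the longest element (sketch)."""
    return max(len(item) for item in iterable_to_check)


longest = max_v_len_sketch([[1, 2, 3], [4], "ab"])
```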

norm_iterator(iterable)[source]¶

Returns a normalized iterable of tuples

pickle_gen(gen_train, data_val)[source]¶

Check and serialize the validation data object and serialize the training data generator.

Parameters:

gen_train (generator) – the training data generator
data_val (dict or generator) – the validation data object

Returns:

normalized datasets

switch_backend(backend_name)[source]¶

Switch the backend based on its name

Parameters:	backend_name (str) – the name of the backend to import
Returns:	the requested backend

to_fuel_h5(inputs, outputs, slices, names, file_name, file_path='')[source]¶

Transforms list of numpy arrays to a structured hdf5 file

Parameters:

inputs (list) – a list of inputs (numpy.arrays)
outputs (list) – a list of outputs (numpy.arrays)
slices (list) – a list of int representing the end of a slice and the beginning of another slice. The last slice is automatically added if missing (maximum length of the inputs).
names (list) – a list of names for the datasets
file_name (str) – the name of the file to save.
file_path (str) – the path where the file is located

Returns:

The file full path
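
The meaning of the slices argument can be illustrated in plain Python. The helper below is hypothetical and does not write an HDF5 file; it only shows how a list of end points maps to (start, stop) ranges, including the last slice being added when it is missing.

```python
def slice_ranges(slices, total_length):
    """Turn slice end points into (start, stop) pairs covering the data."""
    ends = list(slices)
    if not ends or ends[-1] != total_length:
        ends.append(total_length)  # add the missing last slice
    starts = [0] + ends[:-1]
    return list(zip(starts, ends))


# With 100 samples, slices=[60, 80] gives three ranges,
# which could be named e.g. "train", "valid" and "test".
ranges = slice_ranges([60, 80], 100)
```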

window(seq, n=2)[source]¶

Returns a sliding window (of width n) over data from the iterable
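
A sliding window is commonly written with `itertools.islice`. The sketch below follows that well-known pattern and is not necessarily identical to the library's source.

```python
from itertools import islice


def window(seq, n=2):
    """Yield successive overlapping tuples of width n from seq."""
    it = iter(seq)
    result = tuple(islice(it, n))
    if len(result) == n:
        yield result
    for elem in it:
        # Drop the oldest element and append the new one.
        result = result[1:] + (elem,)
        yield result


pairs = list(window([1, 2, 3, 4], n=2))
```

If the input is shorter than n, this version yields nothing.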