neuraxle.metaopt.auto_ml

Neuraxle’s AutoML Classes

Classes used to build Automatic Machine Learning strategies.

Functions

kfold_cross_validation_split(data_inputs, k_fold)

validation_split(test_size, data_inputs[, …])

Split data inputs and expected outputs into a training set and a validation set.

Classes

AutoML(pipeline, validation_splitter, …[, …])

A step that executes Automatic Machine Learning algorithms.

AutoMLContainer(trials, …)

Data object for auto ml.

BaseHyperparameterSelectionStrategy

BaseValidationSplitter

HyperparamsJSONRepository(…[, …])

Hyperparams repository that saves json files for every AutoML trial.

HyperparamsRepository([…])

Hyperparams repository that saves hyperparams and scores for every AutoML trial.

InMemoryHyperparamsRepository([…])

In memory hyperparams repository that can print information about trials.

KFoldCrossValidationSplitter(k_fold)

Create a function that splits data with K-Fold Cross-Validation resampling.

RandomSearchHyperparameterSelectionStrategy()

AutoML Hyperparameter Optimizer that randomly samples the space of random variables.

Trainer(epochs[, metrics, callbacks, …])

Train a pipeline for a given number of epochs, tracking metrics with callbacks.

ValidationSplitter(test_size)

Create a function that splits data into a training and a validation set.

class neuraxle.metaopt.auto_ml.AutoML(pipeline: neuraxle.base.BaseStep, validation_splitter: neuraxle.metaopt.auto_ml.BaseValidationSplitter, refit_trial: bool, scoring_callback: neuraxle.metaopt.callbacks.ScoringCallback, hyperparams_optimizer: neuraxle.metaopt.auto_ml.BaseHyperparameterSelectionStrategy = None, hyperparams_repository: neuraxle.metaopt.auto_ml.HyperparamsRepository = None, n_trials: int = 10, epochs: int = 1, callbacks: List[neuraxle.metaopt.callbacks.BaseCallback] = None, refit_scoring_function: Callable = None, print_func: Callable = None, cache_folder_when_no_handle=None)[source]

A step that executes Automatic Machine Learning algorithms.

Example usage:

auto_ml = AutoML(
    pipeline,
    n_trials=n_iter,
    validation_splitter=ValidationSplitter(test_size=0.2),
    hyperparams_optimizer=RandomSearchHyperparameterSelectionStrategy(),
    scoring_callback=ScoringCallback(mean_squared_error, higher_score_is_better=False),
    callbacks=[
        MetricCallback('mse', metric_function=mean_squared_error, higher_score_is_better=False)
    ],
    refit_trial=True,
    cache_folder_when_no_handle=str(tmpdir)
)

auto_ml = auto_ml.fit(data_inputs, expected_outputs)
get_best_model()[source]

Get best model using the hyperparams repository.

Returns

the best model found during the search, loaded from the hyperparams repository

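The search loop that AutoML runs can be sketched without Neuraxle, in plain Python. Everything below (the function names, the toy scoring) is hypothetical, not the library's actual API; lower scores are treated as better, as with mean squared error:

```python
import random

def auto_ml_search_sketch(train_fn, score_fn, sample_hyperparams, n_trials=10, seed=0):
    """Toy AutoML loop: sample hyperparams, train, score, keep the best."""
    rng = random.Random(seed)
    best_score, best_model, best_hp = None, None, None
    for _ in range(n_trials):
        hp = sample_hyperparams(rng)    # like find_next_best_hyperparams()
        model = train_fn(hp)            # like fitting the pipeline on the train split
        score = score_fn(model)         # like the ScoringCallback on the validation split
        if best_score is None or score < best_score:  # lower is better, as with MSE
            best_score, best_model, best_hp = score, model, hp
    return best_model, best_hp, best_score
```

The real AutoML step additionally delegates splitting to the validation splitter and persistence to the hyperparams repository; this sketch keeps only the sample, train, score, keep-best skeleton.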
class neuraxle.metaopt.auto_ml.AutoMLContainer(trials: neuraxle.metaopt.trial.Trials, hyperparameter_space: neuraxle.hyperparams.space.HyperparameterSpace, trial_number: int, main_scoring_metric_name: str)[source]

Data object for auto ml.

class neuraxle.metaopt.auto_ml.BaseHyperparameterSelectionStrategy[source]
find_next_best_hyperparams(auto_ml_container: neuraxle.metaopt.auto_ml.AutoMLContainer) → neuraxle.hyperparams.space.HyperparameterSamples[source]

Find the next best hyperparams using previous trials.

Parameters

auto_ml_container – trials data container

Returns

next best hyperparams

class neuraxle.metaopt.auto_ml.BaseValidationSplitter[source]
split(data_inputs, expected_outputs=None) → Tuple[List[T], List[T], List[T], List[T]][source]
split_data_container(data_container: neuraxle.data_container.DataContainer) → List[Tuple[neuraxle.data_container.DataContainer, neuraxle.data_container.DataContainer]][source]

Wrap a validation split function with a split data container function. A validation split function takes two arguments: data inputs and expected outputs.

Parameters

data_container – data container to split

Returns

a list of training and validation data container pairs, one per validation split.

class neuraxle.metaopt.auto_ml.HyperparamsJSONRepository(hyperparameter_selection_strategy: Optional[neuraxle.metaopt.auto_ml.BaseHyperparameterSelectionStrategy] = None, cache_folder=None, best_retrained_model_folder=None)[source]

Hyperparams repository that saves json files for every AutoML trial.

Example usage:

HyperparamsJSONRepository(
    hyperparameter_selection_strategy=RandomSearchHyperparameterSelectionStrategy(),
    cache_folder='cache',
    best_retrained_model_folder='best'
)
load_all_trials(status: Optional[neuraxle.metaopt.trial.TRIAL_STATUS] = None) → neuraxle.metaopt.trial.Trials[source]

Load all hyperparameter trials with their corresponding score. Reads all the saved trial json files, sorted by creation date.

Returns

(hyperparams, scores)

new_trial(auto_ml_container: neuraxle.metaopt.auto_ml.AutoMLContainer)[source]

Create a new hyperparams trial json file.

Parameters

auto_ml_container – auto ml container

Returns

save_trial(trial: neuraxle.metaopt.trial.Trial)[source]

Save trial json.

Parameters

trial – trial to save

Returns

class neuraxle.metaopt.auto_ml.HyperparamsRepository(hyperparameter_selection_strategy=None, cache_folder=None, best_retrained_model_folder=None)[source]

Hyperparams repository that saves hyperparams and scores for every AutoML trial.

get_best_hyperparams() → neuraxle.hyperparams.space.HyperparameterSamples[source]

Get best hyperparams from all of the saved trials.

Returns

best hyperparams.

get_best_model()[source]

Load the best model saved inside the best retrained model folder.

Returns

load_all_trials(status: neuraxle.metaopt.trial.TRIAL_STATUS) → neuraxle.metaopt.trial.Trials[source]

Load all hyperparameter trials with their corresponding score. Sorted by creation date.

Returns

Trials (hyperparams, scores)

new_trial(auto_ml_container: neuraxle.metaopt.auto_ml.AutoMLContainer)[source]

Create a new trial with the next best hyperparams.

Returns

new trial

save_best_model(step: neuraxle.base.BaseStep)[source]

Save the best model inside the best retrained model folder.

Parameters

step – step to save

Returns

saved step

save_trial(trial: neuraxle.metaopt.trial.Trial)[source]

Save trial.

Parameters

trial – trial to save.

Returns

set_strategy(hyperparameter_selection_strategy: neuraxle.metaopt.auto_ml.BaseHyperparameterSelectionStrategy)[source]

Set hyperparameter selection strategy.

Parameters

hyperparameter_selection_strategy – hyperparameter selection strategy.

Returns

class neuraxle.metaopt.auto_ml.InMemoryHyperparamsRepository(hyperparameter_selection_strategy=None, print_func: Callable = None, cache_folder: str = None, best_retrained_model_folder=None)[source]

In memory hyperparams repository that can print information about trials. Useful for debugging.

Example usage:

InMemoryHyperparamsRepository(
    hyperparameter_selection_strategy=RandomSearchHyperparameterSelectionStrategy(),
    print_func=print,
    cache_folder='cache',
    best_retrained_model_folder='best'
)
load_all_trials(status: Optional[neuraxle.metaopt.trial.TRIAL_STATUS] = None) → neuraxle.metaopt.trial.Trials[source]

Load all trials with the given status.

Parameters

status – trial status

Returns

list of trials

new_trial(auto_ml_container: neuraxle.metaopt.auto_ml.AutoMLContainer) → neuraxle.metaopt.trial.Trial[source]

Create a new trial with the next best hyperparams.

Parameters

auto_ml_container – auto ml data container

Returns

trial

save_trial(trial: neuraxle.metaopt.trial.Trial)[source]

Save trial.

Parameters

trial – trial to save

Returns

class neuraxle.metaopt.auto_ml.KFoldCrossValidationSplitter(k_fold: int)[source]

Create a function that splits data with K-Fold Cross-Validation resampling.

# create a k-fold cross-validation splitter with 2 folds
KFoldCrossValidationSplitter(k_fold=2)
Parameters

k_fold – number of folds.

Returns

split(data_inputs, expected_outputs=None) → Tuple[List[T], List[T], List[T], List[T]][source]
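
The resampling can be sketched in plain Python (the helper name is hypothetical): each of the k folds serves once as the validation set while the remaining folds form the training set.

```python
def kfold_split_sketch(data_inputs, k_fold):
    """Return (train, validation) pairs, one per fold; every sample is validated exactly once."""
    n = len(data_inputs)
    fold_size = n // k_fold
    splits = []
    for i in range(k_fold):
        start = i * fold_size
        stop = n if i == k_fold - 1 else start + fold_size  # last fold takes the remainder
        validation = data_inputs[start:stop]
        train = data_inputs[:start] + data_inputs[stop:]
        splits.append((train, validation))
    return splits
```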
class neuraxle.metaopt.auto_ml.RandomSearchHyperparameterSelectionStrategy[source]

AutoML Hyperparameter Optimizer that randomly samples the space of random variables. Please refer to AutoML for a usage example.

find_next_best_hyperparams(auto_ml_container: neuraxle.metaopt.auto_ml.AutoMLContainer) → neuraxle.hyperparams.space.HyperparameterSamples[source]

Randomly sample the next hyperparams to try.

Parameters

auto_ml_container – trials data container

Returns

next best hyperparams
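
The sampling can be sketched in plain Python (the function name and the `(low, high)` space format are hypothetical, not Neuraxle's HyperparameterSpace API): each hyperparameter is drawn independently from its range, and past trials are ignored, which is what makes this strategy random search rather than an adaptive optimizer.

```python
import random

def random_sample_hyperparams(space, rng=None):
    """Draw one independent value per hyperparameter from its (low, high) range."""
    rng = rng or random.Random()
    return {name: rng.uniform(low, high) for name, (low, high) in space.items()}
```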

class neuraxle.metaopt.auto_ml.Trainer(epochs, metrics=None, callbacks=None, print_metrics=True, print_func=None)[source]

Example usage:

trainer = Trainer(
    callbacks=[],
    epochs=10,
    print_func=print
)

repo_trial = trainer.fit(
    p=p,
    trial_repository=repo_trial,
    train_data_container=training_data_container,
    validation_data_container=validation_data_container,
    context=context
)

pipeline = trainer.refit(repo_trial.pipeline, data_container, context)
fit_trial_split(trial_split: neuraxle.metaopt.trial.TrialSplit, train_data_container: neuraxle.data_container.DataContainer, validation_data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext) → neuraxle.metaopt.trial.TrialSplit[source]

Train the pipeline using the training data container, tracking training and validation metrics at each epoch.

Parameters
  • train_data_container – train data container

  • validation_data_container – validation data container

  • trial_split – trial to execute

  • context – execution context

Returns

the executed trial split
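
The per-epoch loop described above can be sketched generically (all names here are hypothetical): fit on the training data each epoch, then record the metric on both the training and the validation data.

```python
def fit_epochs_sketch(model_step, train_fn, metric_fn, train_data, validation_data, epochs):
    """Per-epoch loop: fit on train data, then record the metric on both sets."""
    history = {"train": [], "validation": []}
    for _ in range(epochs):
        model_step = train_fn(model_step, train_data)
        history["train"].append(metric_fn(model_step, train_data))
        history["validation"].append(metric_fn(model_step, validation_data))
    return model_step, history
```

In the real Trainer, the metric tracking is done by the metric callbacks and the results are stored on the trial split.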

get_main_metric_name() → str[source]

Get main metric name.

Returns

the main metric name

refit(p: neuraxle.base.BaseStep, data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext) → neuraxle.base.BaseStep[source]

Refit the pipeline on the whole dataset (without any validation technique).

Parameters
  • p – pipeline to refit

  • data_container – data container

  • context – execution context

Returns

fitted pipeline

class neuraxle.metaopt.auto_ml.ValidationSplitter(test_size: float)[source]

Create a function that splits data into a training and a validation set.

# create a validation splitter with 80% train and 20% validation
ValidationSplitter(test_size=0.20)
Parameters

test_size – fraction of the data to reserve for validation, as a float between 0 and 1

Returns

split(data_inputs, expected_outputs=None) → Tuple[List[T], List[T], List[T], List[T]][source]
neuraxle.metaopt.auto_ml.kfold_cross_validation_split(data_inputs, k_fold)[source]
neuraxle.metaopt.auto_ml.validation_split(test_size: float, data_inputs, expected_outputs=None) → Tuple[List[T], List[T], List[T], List[T]][source]

Split data inputs and expected outputs into a training set and a validation set.

Parameters
  • test_size – fraction of the data to reserve for validation, as a float between 0 and 1

  • data_inputs – data inputs to split

  • expected_outputs – expected outputs to split

Returns

train_data_inputs, train_expected_outputs, validation_data_inputs, validation_expected_outputs
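
The split can be sketched in plain Python (the function name is hypothetical): the last `test_size` fraction of the data is held out as the validation set, matching the return order documented above.

```python
def validation_split_sketch(test_size, data_inputs, expected_outputs=None):
    """Hold out the last `test_size` fraction of the data as the validation set."""
    index_split = int(len(data_inputs) * (1 - test_size))
    train_di = data_inputs[:index_split]
    validation_di = data_inputs[index_split:]
    train_eo = expected_outputs[:index_split] if expected_outputs is not None else None
    validation_eo = expected_outputs[index_split:] if expected_outputs is not None else None
    return train_di, train_eo, validation_di, validation_eo
```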