neuraxle.metaopt.random

Module-level documentation for neuraxle.metaopt.random. Here is an inheritance diagram, including dependencies on other base modules of Neuraxle:

Inheritance diagram of neuraxle.metaopt.random

Random

Meta steps for hyperparameter tuning, such as random search.

Functions

average_kfold_scores(metric_function)

Classes

AnchoredWalkForwardTimeSeriesCrossValidationWrapper(…)

Perform an anchored walk forward cross validation by performing a forward rolling split.

BaseCrossValidationWrapper([wrapped, …])

BaseValidation([wrapped])

Base class for validation wrappers.

KFoldCrossValidationWrapper([…])

ValidationSplitWrapper(wrapped, test_size[, …])

Wrapper for validation split that calculates the score for the validation split.

WalkForwardTimeSeriesCrossValidationWrapper(…)

Perform a classic walk forward cross validation by performing a forward rolling split.


class neuraxle.metaopt.random.BaseValidation(wrapped=None, scoring_function: Callable = <function r2_score>)[source]

Bases: neuraxle.base.MetaStep, abc.ABC

Base class for validation wrappers. It has a scoring function to calculate the score for the validation split.

See also

:class:`neuraxle.metaopt.random.ValidationSplitWrapper`, :class:`neuraxle.metaopt.random.KFoldCrossValidationWrapper`, :class:`neuraxle.metaopt.random.AnchoredWalkForwardTimeSeriesCrossValidationWrapper`, :class:`neuraxle.metaopt.random.WalkForwardTimeSeriesCrossValidationWrapper`

__init__(wrapped=None, scoring_function: Callable = <function r2_score>)[source]

Base class for validation wrappers. It has a scoring function to calculate the score for the validation split.

Parameters

scoring_function (Callable) – scoring function with two arguments (y_true, y_pred)
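The scoring function follows the usual (y_true, y_pred) contract, as with the default r2_score. As a sketch, a custom scorer such as the mean_absolute_relative_error used in the example further below could look like this (a hypothetical implementation for illustration, not the library's):

```python
def mean_absolute_relative_error(y_true, y_pred):
    # Hypothetical custom scorer: same (y_true, y_pred) contract
    # as the default r2_score, so it can be passed as scoring_function.
    return sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)
```

Any callable with this two-argument shape can be passed as scoring_function.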

split_data_container(data_container) → Tuple[neuraxle.data_container.DataContainer, neuraxle.data_container.DataContainer][source]
class neuraxle.metaopt.random.BaseCrossValidationWrapper(wrapped=None, scoring_function=<function r2_score>, joiner=NumpyConcatenateOuterBatch(name='NumpyConcatenateOuterBatch', hyperparameters=HyperparameterSamples()), cache_folder_when_no_handle=None, split_data_container_during_fit=True, predict_after_fit=True)[source]

Bases: neuraxle.base.EvaluableStepMixin, neuraxle.base.ForceHandleOnlyMixin, neuraxle.metaopt.random.BaseValidation, abc.ABC

__init__(wrapped=None, scoring_function=<function r2_score>, joiner=NumpyConcatenateOuterBatch(name='NumpyConcatenateOuterBatch', hyperparameters=HyperparameterSamples()), cache_folder_when_no_handle=None, split_data_container_during_fit=True, predict_after_fit=True)[source]

Base class for validation wrappers. It has a scoring function to calculate the score for the validation split.

Parameters

scoring_function (Callable) – scoring function with two arguments (y_true, y_pred)

train(train_data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext)[source]
_fit_data_container(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext) → neuraxle.base.BaseStep[source]

Fit data container.

Return type

BaseStep

Parameters
  • data_container (DataContainer) – data container to fit on

  • context (ExecutionContext) – execution context

Returns

(fitted self, data container)

calculate_score(results)[source]
split_data_container(data_container: neuraxle.data_container.DataContainer) → Tuple[neuraxle.data_container.DataContainer, neuraxle.data_container.DataContainer][source]
get_score()[source]
get_scores_std()[source]
split(data_inputs, expected_outputs)[source]
class neuraxle.metaopt.random.ValidationSplitWrapper(wrapped: neuraxle.base.BaseStep = None, test_size: float = 0.2, scoring_function=<function r2_score>, run_validation_split_in_test_mode=True, cache_folder_when_no_handle=None)[source]

Bases: neuraxle.metaopt.random.BaseCrossValidationWrapper

Wrapper for validation split that calculates the score for the validation split.

random_search = Pipeline([
    RandomSearch(
        ValidationSplitWrapper(
            Identity(),
            test_size=0.1,
            scoring_function=mean_absolute_relative_error,
            run_validation_split_in_test_mode=False
        ),
        n_iter=10,
        higher_score_is_better=True,
        validation_technique=KFoldCrossValidationWrapper(),
        refit=True
    )
])

Note

The data is not shuffled before the split. Please refer to the :class:`DataShuffler` step for data shuffling.

See also

:class:`BaseValidation`, :class:`BaseCrossValidationWrapper`, :class:`neuraxle.metaopt.auto_ml.RandomSearch`, :class:`neuraxle.steps.data.DataShuffler`

__init__(wrapped: neuraxle.base.BaseStep = None, test_size: float = 0.2, scoring_function=<function r2_score>, run_validation_split_in_test_mode=True, cache_folder_when_no_handle=None)[source]
Parameters
  • wrapped (BaseStep) – wrapped step

  • test_size (float) – ratio for test size between 0 and 1

  • scoring_function – scoring function with two arguments (y_true, y_pred)

_fit_data_container(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext) -> ('ValidationSplitWrapper', <class 'neuraxle.data_container.DataContainer'>)[source]

Fit using the training split. Calculate the scores using the validation split.

Parameters
  • data_container (DataContainer) – data container to fit on

  • context (ExecutionContext) – execution context

Returns

fitted self

_fit_transform_data_container(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext) -> ('BaseStep', <class 'neuraxle.data_container.DataContainer'>)[source]

Fit Transform given data inputs without splitting.

Parameters
  • data_container (DataContainer) – data container to fit and transform

  • context (ExecutionContext) – execution context

Returns

outputs

_transform_data_container(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext)[source]

Transform given data inputs without splitting.

Parameters
  • data_container (DataContainer) – data container to transform

  • context (ExecutionContext) – execution context

Returns

outputs

_update_scores_validation(data_inputs, expected_outputs)[source]
_update_scores_train(data_inputs, expected_outputs)[source]
get_score()[source]
get_score_validation()[source]
get_score_train()[source]
split_data_container(data_container) → Tuple[neuraxle.data_container.DataContainer, neuraxle.data_container.DataContainer][source]

Split data container into a training set, and a validation set.

Parameters

data_container (DataContainer) – data container

Returns

train_data_container, validation_data_container

split(data_inputs, expected_outputs=None) → Tuple[List[T], List[T], List[T], List[T]][source]

Split data inputs and expected outputs into a training set and a validation set.

Parameters
  • data_inputs – data inputs to split

  • expected_outputs – expected outputs to split

Returns

train_data_inputs, train_expected_outputs, validation_data_inputs, validation_expected_outputs
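As the note above says, the data is not shuffled. The sketch below illustrates this kind of contiguous holdout split, assuming the validation set is taken from the end of the data; it is an illustration of the splitting logic, not the class's actual implementation:

```python
def validation_split_indices(n_samples, test_size=0.2):
    # Contiguous (non-shuffled) holdout, assuming the last `test_size`
    # fraction of samples is held out for validation.
    n_train = n_samples - int(n_samples * test_size)
    return list(range(n_train)), list(range(n_train, n_samples))

def holdout_split(data_inputs, expected_outputs, test_size=0.2):
    # Mirrors the documented return order:
    # train inputs, train outputs, validation inputs, validation outputs.
    train_idx, val_idx = validation_split_indices(len(data_inputs), test_size)
    return ([data_inputs[i] for i in train_idx],
            [expected_outputs[i] for i in train_idx],
            [data_inputs[i] for i in val_idx],
            [expected_outputs[i] for i in val_idx])
```

With 10 samples and test_size=0.2, the first 8 samples train and the last 2 validate.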

train_split(data_inputs) → List[T][source]

Split training set.

Parameters

data_inputs – data inputs to split

Returns

train_data_inputs

validation_split(data_inputs) → List[T][source]

Split validation set.

Parameters

data_inputs – data inputs to split

Returns

validation_data_inputs

disable_metrics()[source]
enable_metrics()[source]
_get_index_split(data_inputs)[source]
neuraxle.metaopt.random.average_kfold_scores(metric_function)[source]
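The entry above gives no description for average_kfold_scores. One plausible reading, sketched here as an assumption rather than the library's actual implementation, is that it wraps a (y_true, y_pred) metric so the score is averaged across the k folds:

```python
def average_kfold_scores(metric_function):
    # Hedged sketch: wrap a per-fold metric so that, given one list of
    # targets and one list of predictions per fold, the wrapped function
    # returns the mean metric across folds.
    def averaged(y_true_folds, y_pred_folds):
        scores = [metric_function(y_true, y_pred)
                  for y_true, y_pred in zip(y_true_folds, y_pred_folds)]
        return sum(scores) / len(scores)
    return averaged
```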
class neuraxle.metaopt.random.KFoldCrossValidationWrapper(scoring_function=<function r2_score>, k_fold=3, joiner=NumpyConcatenateOuterBatch(name='NumpyConcatenateOuterBatch', hyperparameters=HyperparameterSamples()), cache_folder_when_no_handle=None)[source]

Bases: neuraxle.metaopt.random.BaseCrossValidationWrapper

__init__(scoring_function=<function r2_score>, k_fold=3, joiner=NumpyConcatenateOuterBatch(name='NumpyConcatenateOuterBatch', hyperparameters=HyperparameterSamples()), cache_folder_when_no_handle=None)[source]

Base class for validation wrappers. It has a scoring function to calculate the score for the validation split.

Parameters

scoring_function (Callable) – scoring function with two arguments (y_true, y_pred)

split(data_inputs, expected_outputs)[source]
train_split(data_inputs, expected_outputs) -> (typing.List, typing.List)[source]
validation_split(data_inputs, expected_outputs=None) -> (typing.List, typing.List)[source]
_split(data_inputs)[source]
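To illustrate the k-fold splitting that KFoldCrossValidationWrapper performs, here is a plain-index sketch, assuming contiguous non-shuffled folds. It is illustrative only; the actual wrapper also recombines fold results with its joiner step:

```python
def kfold_indices(n_samples, k_fold=3):
    # Partition [0, n_samples) into k contiguous folds. Each fold serves
    # as the validation set once, with the remaining samples as training.
    fold_size = n_samples // k_fold
    splits = []
    for i in range(k_fold):
        start = i * fold_size
        stop = n_samples if i == k_fold - 1 else start + fold_size
        validation = list(range(start, stop))
        train = [j for j in range(n_samples) if j < start or j >= stop]
        splits.append((train, validation))
    return splits
```

For 6 samples and k_fold=3, this yields three (train, validation) index pairs of 4 and 2 samples each.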
class neuraxle.metaopt.random.AnchoredWalkForwardTimeSeriesCrossValidationWrapper(minimum_training_size, validation_window_size=None, padding_between_training_and_validation=0, drop_remainder=False, scoring_function=<function r2_score>, joiner=NumpyConcatenateInnerFeatures(name='NumpyConcatenateInnerFeatures', hyperparameters=HyperparameterSamples()))[source]

Bases: neuraxle.metaopt.random.BaseCrossValidationWrapper

Perform an anchored walk forward cross validation by performing a forward rolling split. All training splits start at the beginning of the time series but end at different times; the end time increases toward the end of the series at each forward split.

Each validation split starts after a certain time delay (if padding is set) following its corresponding training split.

Notes: The data supported by this cross validation is an ndarray of shape [batch_size, total_time_steps, n_features]. The array can have an arbitrary number of dimensions, but the time series axis is currently limited to axis=1.

__init__(minimum_training_size, validation_window_size=None, padding_between_training_and_validation=0, drop_remainder=False, scoring_function=<function r2_score>, joiner=NumpyConcatenateInnerFeatures(name='NumpyConcatenateInnerFeatures', hyperparameters=HyperparameterSamples()))[source]

Create an anchored walk forward time series cross validation object.

The size of the validation split is defined by validation_window_size. The difference in start position between two consecutive validation splits is also equal to validation_window_size.

Parameters
  • minimum_training_size – size of the smallest training split.

  • validation_window_size – size of each validation split and also the time step taken between each forward roll, by default None. If None, it takes the value of minimum_training_size.

  • padding_between_training_and_validation (int) – the size of the padding between the end of the training split and the start of the validation split, by default 0.

  • drop_remainder (bool) – drop the last split if the last validation split does not coincide with a full validation_window_size, by default False.

  • scoring_function – scoring function used to validate performance, by default r2_score.

  • joiner – the joiner callable that can join the different results together.

Returns

WalkForwardTimeSeriesCrossValidation instance.
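The anchored splitting scheme described above can be sketched with index arithmetic over the time axis (axis=1). This is an illustration of the scheme under the documented parameters, not the wrapper's actual implementation:

```python
def anchored_walk_forward_splits(total_time_steps, minimum_training_size,
                                 validation_window_size=None,
                                 padding_between_training_and_validation=0,
                                 drop_remainder=False):
    # Anchored: every training split starts at t=0; its end (and the
    # validation window after it) rolls forward by validation_window_size.
    # Returns (train_start, train_end), (val_start, val_end) index pairs.
    if validation_window_size is None:
        validation_window_size = minimum_training_size
    splits = []
    train_end = minimum_training_size
    while train_end + padding_between_training_and_validation < total_time_steps:
        val_start = train_end + padding_between_training_and_validation
        val_end = min(val_start + validation_window_size, total_time_steps)
        if drop_remainder and val_end - val_start < validation_window_size:
            break
        splits.append(((0, train_end), (val_start, val_end)))
        train_end += validation_window_size
    return splits
```

With 10 time steps, minimum_training_size=4 and validation_window_size=2, the training windows grow as (0, 4), (0, 6), (0, 8), each followed by a 2-step validation window.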

split(data_inputs, expected_outputs)[source]

Split the data into train inputs, train expected outputs, validation inputs, validation expected outputs.

Notes: The data supported by this cross validation is an ndarray of shape [batch_size, total_time_steps, n_features]. The array can have an arbitrary number of dimensions, but the time series axis is currently limited to axis=1.

Parameters
  • data_inputs – data to perform walk forward cross validation on.

  • expected_outputs – the expected target/label that will be used during walk forward cross validation.

Returns

train_data_inputs, train_expected_outputs, validation_data_inputs, validation_expected_outputs

train_split(data_inputs, expected_outputs=None) -> (typing.List, typing.List)[source]

Split the data into train inputs and train expected outputs.

Notes: The data supported by this cross validation is an ndarray of shape [batch_size, total_time_steps, n_features]. The array can have an arbitrary number of dimensions, but the time series axis is currently limited to axis=1.

Parameters
  • data_inputs – data to perform walk forward cross validation on.

  • expected_outputs – the expected target/label that will be used during walk forward cross validation.

Returns

train_data_inputs, train_expected_outputs

validation_split(data_inputs, expected_outputs=None) → List[T][source]

Split the data into validation inputs and validation expected outputs.

Notes: The data supported by this cross validation is an ndarray of shape [batch_size, total_time_steps, n_features]. The array can have an arbitrary number of dimensions, but the time series axis is currently limited to axis=1.

Parameters
  • data_inputs – data to perform walk forward cross validation on.

  • expected_outputs – the expected target/label that will be used during walk forward cross validation.

Returns

validation_data_inputs, validation_expected_outputs

_train_split(data_inputs)[source]
_validation_split(data_inputs)[source]
_get_number_fold(data_inputs)[source]
class neuraxle.metaopt.random.WalkForwardTimeSeriesCrossValidationWrapper(training_window_size, validation_window_size=None, padding_between_training_and_validation=0, drop_remainder=False, scoring_function=<function r2_score>, joiner=NumpyConcatenateOnAxis(name='NumpyConcatenateOnAxis', hyperparameters=HyperparameterSamples()))[source]

Bases: neuraxle.metaopt.random.AnchoredWalkForwardTimeSeriesCrossValidationWrapper

Perform a classic walk forward cross validation by performing a forward rolling split.

All training splits have the same size, given by training_window_size. The start time and end time of each training split increase identically toward the end of the series at each forward split. The same principle applies to the validation splits, whose start and end increase in the same manner. Each validation split starts after a certain time delay (if padding is set) following its corresponding training split.

Notes: The data supported by this cross validation is an ndarray of shape [batch_size, total_time_steps, n_features]. The array can have an arbitrary number of dimensions, but the time series axis is currently limited to axis=1.

__init__(training_window_size, validation_window_size=None, padding_between_training_and_validation=0, drop_remainder=False, scoring_function=<function r2_score>, joiner=NumpyConcatenateOnAxis(name='NumpyConcatenateOnAxis', hyperparameters=HyperparameterSamples()))[source]

Create a classic walk forward time series cross validation object.

The difference in start position between two consecutive validation splits is equal to one validation_window_size.

Parameters
  • training_window_size – the window size of the training split.

  • validation_window_size – the window size of each validation split and also the time step taken between each forward roll, by default None. If None, it takes the value of training_window_size.

  • padding_between_training_and_validation (int) – the size of the padding between the end of the training split and the start of the validation split, by default 0.

  • drop_remainder (bool) – drop the last split if the last validation split does not coincide with a full validation_window_size, by default False.

  • scoring_function – scoring function used to validate performance, by default r2_score.

  • joiner – the joiner callable that can join the different results together.

Returns

WalkForwardTimeSeriesCrossValidation instance.
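Analogously to the anchored variant, the classic scheme above can be sketched as a fixed-size training window rolling forward along the time axis. This is an illustration under the documented parameters, not the wrapper's actual implementation:

```python
def walk_forward_splits(total_time_steps, training_window_size,
                        validation_window_size=None,
                        padding_between_training_and_validation=0,
                        drop_remainder=False):
    # Classic walk forward: both the fixed-size training window and the
    # validation window roll forward by validation_window_size each split.
    # Returns (train_start, train_end), (val_start, val_end) index pairs.
    if validation_window_size is None:
        validation_window_size = training_window_size
    splits = []
    train_start = 0
    while (train_start + training_window_size
           + padding_between_training_and_validation) < total_time_steps:
        train_end = train_start + training_window_size
        val_start = train_end + padding_between_training_and_validation
        val_end = min(val_start + validation_window_size, total_time_steps)
        if drop_remainder and val_end - val_start < validation_window_size:
            break
        splits.append(((train_start, train_end), (val_start, val_end)))
        train_start += validation_window_size
    return splits
```

With 10 time steps, training_window_size=4 and validation_window_size=2, the 4-step training window slides from (0, 4) to (2, 6) to (4, 8), unlike the anchored variant where it only grows.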

_train_split(data_inputs)[source]