neuraxle.metaopt.random

Random

Meta steps for hyperparameter tuning, such as random search.

Functions

average_kfold_scores(metric_function)

Classes

AnchoredWalkForwardTimeSeriesCrossValidationWrapper(…)

Perform an anchored walk forward cross validation by performing a forward rolling split.

BaseCrossValidationWrapper([wrapped, …])

BaseValidation([wrapped])

Base class for validation wrappers.

KFoldCrossValidationWrapper([…])

ValidationSplitWrapper(wrapped, test_size[, …])

Wrapper for validation split that calculates the score for the validation split.

WalkForwardTimeSeriesCrossValidationWrapper(…)

Perform a classic walk forward cross validation by performing a forward rolling split.

class neuraxle.metaopt.random.AnchoredWalkForwardTimeSeriesCrossValidationWrapper(minimum_training_size, validation_window_size=None, padding_between_training_and_validation=0, drop_remainder=False, scoring_function=<function r2_score>, joiner=NumpyConcatenateOnCustomAxis( name=NumpyConcatenateOnCustomAxis, hyperparameters=HyperparameterSamples() ))[source]

Perform an anchored walk forward cross validation by performing a forward rolling split. All training splits start at the beginning of the time series but end at different times: the end time moves toward the end of the series at each forward split.

Each validation split starts after a certain time delay (if padding is set) following its corresponding training split.

Notes: The data supported by this cross validation is an ndarray of shape [batch_size, total_time_steps, n_features]. The array can have an arbitrary number of dimensions, but the time series axis is currently limited to axis=1.
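The anchored splitting scheme can be sketched as a minimal, standalone re-implementation of the index arithmetic (this is an illustration, not Neuraxle's actual code; the function name and `padding` parameter here are illustrative):

```python
def anchored_walk_forward_splits(total_time_steps, minimum_training_size,
                                 validation_window_size=None, padding=0):
    """Yield (train_slice, validation_slice) pairs along the time axis.

    Every training slice is anchored at t=0 and grows by one validation
    window per split; each validation window follows its training slice
    after `padding` time steps.
    """
    if validation_window_size is None:
        validation_window_size = minimum_training_size
    train_end = minimum_training_size
    while train_end + padding + validation_window_size <= total_time_steps:
        val_start = train_end + padding
        yield slice(0, train_end), slice(val_start, val_start + validation_window_size)
        train_end += validation_window_size

# 10 time steps, minimum training size of 4, validation windows of 2:
# training windows grow (0:4, 0:6, 0:8) while validation rolls forward.
splits = list(anchored_walk_forward_splits(10, minimum_training_size=4,
                                           validation_window_size=2))
```

These slices would then be applied on axis=1 of the [batch_size, total_time_steps, n_features] array described above.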

split(data_inputs, expected_outputs)[source]

Split the data into train inputs, train expected outputs, validation inputs, validation expected outputs.

Notes: The data supported by this cross validation is an ndarray of shape [batch_size, total_time_steps, n_features]. The array can have an arbitrary number of dimensions, but the time series axis is currently limited to axis=1.

Parameters
  • data_inputs – the data on which to perform walk forward cross validation.

  • expected_outputs – the expected targets/labels that will be used during walk forward cross validation.

Returns

train_data_inputs, train_expected_outputs, validation_data_inputs, validation_expected_outputs

train_split(data_inputs, expected_outputs=None) -> (typing.List, typing.List)[source]

Split the data into train inputs and train expected outputs.

Notes: The data supported by this cross validation is an ndarray of shape [batch_size, total_time_steps, n_features]. The array can have an arbitrary number of dimensions, but the time series axis is currently limited to axis=1.

Parameters
  • data_inputs – the data on which to perform walk forward cross validation.

  • expected_outputs – the expected targets/labels that will be used during walk forward cross validation.

Returns

train_data_inputs, train_expected_outputs

validation_split(data_inputs, expected_outputs=None) → List[T][source]

Split the data into validation inputs, validation expected outputs.

Notes: The data supported by this cross validation is an ndarray of shape [batch_size, total_time_steps, n_features]. The array can have an arbitrary number of dimensions, but the time series axis is currently limited to axis=1.

Parameters
  • data_inputs – the data on which to perform walk forward cross validation.

  • expected_outputs – the expected targets/labels that will be used during walk forward cross validation.

Returns

validation_data_inputs, validation_expected_outputs

class neuraxle.metaopt.random.BaseCrossValidationWrapper(wrapped=None, scoring_function=<function r2_score>, joiner=NumpyConcatenateOuterBatch( name=NumpyConcatenateOuterBatch, hyperparameters=HyperparameterSamples() ), cache_folder_when_no_handle=None, split_data_container_during_fit=True, predict_after_fit=True)[source]
calculate_score(results)[source]
get_score()[source]
get_scores_std()[source]
split(data_inputs, expected_outputs)[source]
split_data_container(data_container: neuraxle.data_container.DataContainer) → Tuple[neuraxle.data_container.DataContainer, neuraxle.data_container.DataContainer][source]
train(train_data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext)[source]
class neuraxle.metaopt.random.BaseValidation(wrapped=None, scoring_function: Callable = <function r2_score>)[source]

Base class for validation wrappers. It has a scoring function to calculate the score for the validation split.

See also

:class:`neuraxle.metaopt.random.ValidationSplitWrapper`, :class:`neuraxle.metaopt.random.KFoldCrossValidationWrapper`, :class:`neuraxle.metaopt.random.AnchoredWalkForwardTimeSeriesCrossValidationWrapper`, :class:`neuraxle.metaopt.random.WalkForwardTimeSeriesCrossValidationWrapper`

split_data_container(data_container) → Tuple[neuraxle.data_container.DataContainer, neuraxle.data_container.DataContainer][source]
class neuraxle.metaopt.random.KFoldCrossValidationWrapper(scoring_function=<function r2_score>, k_fold=3, joiner=NumpyConcatenateOuterBatch( name=NumpyConcatenateOuterBatch, hyperparameters=HyperparameterSamples() ), cache_folder_when_no_handle=None)[source]
split(data_inputs, expected_outputs)[source]
train_split(data_inputs, expected_outputs) -> (typing.List, typing.List)[source]
validation_split(data_inputs, expected_outputs=None) -> (typing.List, typing.List)[source]
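The k-fold splitting idea behind this wrapper can be sketched as a standalone re-implementation (illustrative only, not Neuraxle's actual code): the batch axis is cut into k contiguous folds without shuffling, and each fold takes a turn as the validation set while the rest form the training set.

```python
import numpy as np

def kfold_split(data_inputs, k_fold):
    """Yield (train, validation) pairs: fold i is held out as the
    validation set, the remaining folds are concatenated back into
    the training set. No shuffling is performed."""
    folds = np.array_split(np.asarray(data_inputs), k_fold)
    for i in range(k_fold):
        validation = folds[i]
        train = np.concatenate(folds[:i] + folds[i + 1:])
        yield train, validation

# 9 samples split into 3 folds of 3 contiguous samples each.
splits = list(kfold_split(np.arange(9), k_fold=3))
```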
class neuraxle.metaopt.random.ValidationSplitWrapper(wrapped: neuraxle.base.BaseStep = None, test_size: float = 0.2, scoring_function=<function r2_score>, run_validation_split_in_test_mode=True, cache_folder_when_no_handle=None)[source]

Wrapper for validation split that calculates the score for the validation split.

random_search = Pipeline([
    RandomSearch(
        ValidationSplitWrapper(
            Identity(),
            test_size=0.1,
            scoring_function=mean_absolute_relative_error,
            run_validation_split_in_test_mode=False
        ),
        n_iter=10,
        higher_score_is_better=True,
        validation_technique=KFoldCrossValidationWrapper(),
        refit=True
    )
])

Note

The data is not shuffled before the split. Please refer to the :class:`DataShuffler` step for data shuffling.

See also

:class:`BaseValidation`, :class:`BaseCrossValidationWrapper`, :class:`neuraxle.metaopt.auto_ml.RandomSearch`, :class:`neuraxle.steps.data.DataShuffler`

disable_metrics()[source]
enable_metrics()[source]
get_score()[source]
get_score_train()[source]
get_score_validation()[source]
split(data_inputs, expected_outputs=None) → Tuple[List[T], List[T], List[T], List[T]][source]

Split data inputs and expected outputs into a training set and a validation set.

Parameters
  • data_inputs – data inputs to split

  • expected_outputs – expected outputs to split

Returns

train_data_inputs, train_expected_outputs, validation_data_inputs, validation_expected_outputs
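Because the data is not shuffled, the validation set is always the tail of the inputs. The splitting logic can be sketched as a minimal standalone function (illustrative only; Neuraxle's actual rounding of the split index may differ slightly):

```python
def validation_split(data_inputs, expected_outputs, test_size=0.2):
    """Split off the last `test_size` fraction of the (unshuffled) data
    as the validation set; the leading fraction is the training set."""
    index_split = int(len(data_inputs) * (1 - test_size))
    return (data_inputs[:index_split], expected_outputs[:index_split],
            data_inputs[index_split:], expected_outputs[index_split:])

# With 10 samples and test_size=0.2, the last 2 samples become validation data.
train_x, train_y, val_x, val_y = validation_split(
    list(range(10)), list(range(10)), test_size=0.2)
```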

split_data_container(data_container) → Tuple[neuraxle.data_container.DataContainer, neuraxle.data_container.DataContainer][source]

Split the data container into a training set and a validation set.

Parameters

data_container (DataContainer) – data container

Returns

train_data_container, validation_data_container

train_split(data_inputs) → List[T][source]

Take the training split of the data inputs.

Parameters

data_inputs – data inputs to split

Returns

train_data_inputs

validation_split(data_inputs) → List[T][source]

Take the validation split of the data inputs.

Parameters

data_inputs – data inputs to split

Returns

validation_data_inputs

class neuraxle.metaopt.random.WalkForwardTimeSeriesCrossValidationWrapper(training_window_size, validation_window_size=None, padding_between_training_and_validation=0, drop_remainder=False, scoring_function=<function r2_score>, joiner=NumpyConcatenateOnCustomAxis( name=NumpyConcatenateOnCustomAxis, hyperparameters=HyperparameterSamples() ))[source]

Perform a classic walk forward cross validation by performing a forward rolling split.

All training splits have the same fixed window size. The start and end times of each training split shift identically toward the end of the series at each forward split. The same principle applies to the validation splits, whose start and end shift forward in the same manner. Each validation split starts after a certain time delay (if padding is set) following its corresponding training split.

Notes: The data supported by this cross validation is an ndarray of shape [batch_size, total_time_steps, n_features]. The array can have an arbitrary number of dimensions, but the time series axis is currently limited to axis=1.
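In contrast with the anchored variant above, here the training window has a fixed size and slides forward together with its validation window. A standalone sketch of the index arithmetic (illustrative only, not Neuraxle's actual code):

```python
def walk_forward_splits(total_time_steps, training_window_size,
                        validation_window_size=None, padding=0):
    """Yield (train_slice, validation_slice) pairs along the time axis.

    The training window keeps a fixed size and rolls forward by one
    validation window per split; each validation window follows its
    training window after `padding` time steps.
    """
    if validation_window_size is None:
        validation_window_size = training_window_size
    train_start = 0
    while (train_start + training_window_size + padding
           + validation_window_size) <= total_time_steps:
        train_end = train_start + training_window_size
        val_start = train_end + padding
        yield (slice(train_start, train_end),
               slice(val_start, val_start + validation_window_size))
        train_start += validation_window_size

# 10 time steps, fixed training windows of 4, validation windows of 2:
# training rolls through (0:4, 2:6, 4:8) while validation follows.
splits = list(walk_forward_splits(10, training_window_size=4,
                                  validation_window_size=2))
```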

neuraxle.metaopt.random.average_kfold_scores(metric_function)[source]