neuraxle.steps.data

Module-level documentation for neuraxle.steps.data. Here is an inheritance diagram, including dependencies to other base modules of Neuraxle:

Inheritance diagram of neuraxle.steps.data

Data Steps

This module contains steps that act on data.

Classes

DataShuffler([seed, …])

Data shuffling step that shuffles data inputs and expected outputs at the same time.

EpochRepeater(wrapped, epochs[, …])

Repeat the wrapped step's fit or transform for the number of epochs passed to the constructor.

InnerConcatenateDataContainer([…])

Concatenate inner features of sub data containers along axis=-1.

TrainShuffled(wrapped[, seed])

ZipBatchDataContainer([sub_data_container_names])

WARNING: Unexpected behaviour from this class.


class neuraxle.steps.data.DataShuffler(seed=None, increment_seed_after_each_fit=True)[source]

Bases: neuraxle.steps.output_handlers.InputAndOutputTransformerMixin, neuraxle.base.BaseTransformer

Data shuffling step that shuffles data inputs and expected outputs at the same time.

p = Pipeline([
    TrainOnlyWrapper(DataShuffler(seed=42, increment_seed_after_each_fit=True)),
    EpochRepeater(ForecastingPipeline(), epochs=EPOCHS, repeat_in_test_mode=False)
])

Warning

You will almost always want to wrap this step in a TrainOnlyWrapper, so that data is shuffled during training only.

__init__(seed=None, increment_seed_after_each_fit=True)[source]

Initialize self. See help(type(self)) for accurate signature.

transform(data_inputs)[source]

Shuffle data inputs, and expected outputs.

Parameters

data_inputs – (data inputs, expected outputs) tuple to shuffle

Returns

shuffled (data inputs, expected outputs) tuple

_abc_impl = <_abc_data object>

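To illustrate what shuffling data inputs and expected outputs "at the same time" means, here is a minimal standard-library sketch. It is not Neuraxle's implementation: the function name shuffle_in_unison and the use of random.Random are assumptions made for illustration. The point is that a single seeded permutation is applied to both sequences, so each input stays paired with its expected output.

```python
import random
from typing import List, Sequence, Tuple


def shuffle_in_unison(
    data_inputs: Sequence, expected_outputs: Sequence, seed: int
) -> Tuple[List, List]:
    """Hypothetical helper: apply one seeded permutation to both sequences."""
    if len(data_inputs) != len(expected_outputs):
        raise ValueError("data_inputs and expected_outputs must have the same length")
    indices = list(range(len(data_inputs)))
    # A dedicated Random instance keeps the shuffle reproducible for a given seed.
    random.Random(seed).shuffle(indices)
    return [data_inputs[i] for i in indices], [expected_outputs[i] for i in indices]


data_inputs = [0, 1, 2, 3, 4]
expected_outputs = [0, 10, 20, 30, 40]
shuffled_di, shuffled_eo = shuffle_in_unison(data_inputs, expected_outputs, seed=42)

# Every input is still paired with its own expected output after shuffling.
pairs_preserved = all(eo == di * 10 for di, eo in zip(shuffled_di, shuffled_eo))
```

Reusing the same seed reproduces the same order, which is presumably why the real step exposes a seed argument.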
class neuraxle.steps.data.EpochRepeater(wrapped, epochs, repeat_in_test_mode=False, cache_folder_when_no_handle=None)[source]

Bases: neuraxle.base.ForceHandleOnlyMixin, neuraxle.base.MetaStep

Repeat the wrapped step's fit or transform for the number of epochs passed to the constructor.

p = Pipeline([
    TrainOnlyWrapper(DataShuffler(seed=42, increment_seed_after_each_fit=True)),
    EpochRepeater(ForecastingPipeline(), epochs=EPOCHS, repeat_in_test_mode=False)
])
__init__(wrapped, epochs, repeat_in_test_mode=False, cache_folder_when_no_handle=None)[source]

Initialize self. See help(type(self)) for accurate signature.

_fit_transform_data_container(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext) -> ('BaseStep', <class 'neuraxle.data_container.DataContainer'>)[source]

Fit transform wrapped step self.epochs times using wrapped step handle fit transform method.

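Parameters
  • data_container (DataContainer) – data container to fit transform on

  • context (ExecutionContext) – execution context

Returns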

(fitted self, data container)

Return type

(BaseStep, DataContainer)

fit_transform(data_inputs, expected_outputs=None) -> ('BaseStep', typing.Iterable)[source]

Fit transform wrapped step self.epochs times.

Parameters
  • data_inputs – data inputs to fit on

  • expected_outputs – expected_outputs to fit on

Returns

(fitted self, outputs)

_fit_data_container(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext) → neuraxle.base.BaseStep[source]

Fit wrapped step self.epochs times using wrapped step handle fit method.

Parameters
  • data_container (DataContainer) – data container to fit on

  • context (ExecutionContext) – execution context

Returns

fitted self

Return type

BaseStep

fit(data_inputs, expected_outputs=None) → neuraxle.base.BaseStep[source]

Fit wrapped step self.epochs times.

Parameters
  • data_inputs – data inputs to fit on

  • expected_outputs – expected_outputs to fit on

Returns

fitted self
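The repetition described above can be pictured with a small stand-alone sketch. CountingStep and ToyEpochRepeater are hypothetical classes written for this illustration; they do not use Neuraxle and leave out handler methods, execution contexts, and the repeat_in_test_mode logic.

```python
from typing import List, Optional, Sequence


class CountingStep:
    """Hypothetical stand-in for a wrapped step; it records each fit call."""

    def __init__(self) -> None:
        self.fit_calls: List[int] = []

    def fit(
        self, data_inputs: Sequence, expected_outputs: Optional[Sequence] = None
    ) -> "CountingStep":
        # Store the batch size seen on each call so the repetition is observable.
        self.fit_calls.append(len(data_inputs))
        return self


class ToyEpochRepeater:
    """Illustrative only: fits the wrapped step `epochs` times in a row."""

    def __init__(self, wrapped: CountingStep, epochs: int) -> None:
        self.wrapped = wrapped
        self.epochs = epochs

    def fit(
        self, data_inputs: Sequence, expected_outputs: Optional[Sequence] = None
    ) -> "ToyEpochRepeater":
        for _ in range(self.epochs):
            self.wrapped = self.wrapped.fit(data_inputs, expected_outputs)
        return self


repeater = ToyEpochRepeater(CountingStep(), epochs=3)
fitted = repeater.fit([1, 2, 3, 4], [2, 4, 6, 8])
```

After the call, repeater.wrapped.fit_calls holds one entry per epoch, and fit returns the repeater itself, mirroring the "fitted self" return value.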

_should_repeat()[source]
_transform_data_container(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext) → neuraxle.data_container.DataContainer[source]

Transform data container.

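Parameters
  • data_container (DataContainer) – data container to transform

  • context (ExecutionContext) – execution context

Returns

data container

Return type

DataContainer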

_get_epochs()[source]
_should_repeat_fit()[source]
_abc_impl = <_abc_data object>
class neuraxle.steps.data.TrainShuffled(wrapped, seed=None)[source]

Bases: neuraxle.pipeline.Pipeline

__init__(wrapped, seed=None)[source]

Initialize self. See help(type(self)) for accurate signature.

_abc_impl = <_abc_data object>
class neuraxle.steps.data.InnerConcatenateDataContainer(sub_data_container_names=None)[source]

Bases: neuraxle.base.ForceHandleOnlyMixin, neuraxle.base.BaseTransformer

Concatenate inner features of sub data containers along axis=-1.

Code example:

data_container = DataContainer(data_inputs=data_inputs_3d, expected_outputs=expected_outputs_3d)
data_container.add_sub_data_container(name='1d_data_source', data_container=data_container_1d)
data_container.add_sub_data_container(name='2d_data_source', data_container=data_container_2d)

# data container with sub data containers :
# DataContainer(data_inputs=data_inputs_3d, expected_outputs=expected_outputs_3d, sub_data_containers=[('1d_data_source', data_container_1d), ('2d_data_source', data_container_2d)])

p = Pipeline([
    InnerConcatenateDataContainer()
    # is equivalent to InnerConcatenateDataContainer(sub_data_container_names=['1d_data_source', '2d_data_source'])
])

data_container = p.handle_transform(data_container, ExecutionContext())

# new_shape: (batch_size, time_steps, n_features + batch_features + 1)
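The resulting shape can be reproduced with plain nested lists. The sketch below is not the library's code: inner_concatenate is a hypothetical helper, and the order in which the features are appended (3D, then 2D, then 1D) is an assumption taken from the order of the terms in the new_shape comment. Lower-dimensional features are repeated at every time step, then appended along the last axis.

```python
from typing import List

# Toy shapes: batch_size=2, time_steps=3, n_features=2, batch_features=2.
data_inputs_3d = [
    [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]],
    [[7.0, 8.0], [9.0, 10.0], [11.0, 12.0]],
]
data_inputs_2d = [[0.1, 0.2], [0.3, 0.4]]  # (batch_size, batch_features)
data_inputs_1d = [100.0, 200.0]            # (batch_size,)


def inner_concatenate(
    di_3d: List[List[List[float]]], di_2d: List[List[float]], di_1d: List[float]
) -> List[List[List[float]]]:
    """Hypothetical helper: append per-sample features to every time step."""
    merged = []
    for sample_3d, sample_2d, value_1d in zip(di_3d, di_2d, di_1d):
        # Repeat the 2D features and the 1D scalar at each time step.
        merged.append([list(step) + list(sample_2d) + [value_1d] for step in sample_3d])
    return merged


merged = inner_concatenate(data_inputs_3d, data_inputs_2d, data_inputs_1d)
new_shape = (len(merged), len(merged[0]), len(merged[0][0]))
```

With these toy inputs the last axis has n_features + batch_features + 1 = 5 entries, matching the shape formula in the comment above.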
__init__(sub_data_container_names=None)[source]

Initialize self. See help(type(self)) for accurate signature.

_fit_transform_data_container(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext) -> ('BaseTransformer', <class 'neuraxle.data_container.DataContainer'>)[source]

Merge sub data containers into the current data container.

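Parameters
  • data_container (DataContainer) – data container to fit transform on

  • context (ExecutionContext) – execution context

Returns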

base step, data container

Return type

Tuple[BaseTransformer, DataContainer]

_transform_data_container(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext) → neuraxle.data_container.DataContainer[source]

Merge sub data containers into the current data container.

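Parameters
  • data_container (DataContainer) – data container to transform

  • context (ExecutionContext) – execution context

Returns

data container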

Return type

DataContainer

_concatenate_sub_data_containers(data_container: neuraxle.data_container.DataContainer) → neuraxle.data_container.DataContainer[source]

Merge sub data containers into the current data container.

Parameters

data_container (DataContainer) – data container to zip

Returns

data container

Return type

DataContainer

_concatenate_sub_data_container(data_container_to_zip: List[neuraxle.data_container.DataContainer]) → neuraxle.data_container.DataContainer[source]

Zip a data container into another data container with a higher dimension.

Parameters

data_container_to_zip (List[DataContainer]) – data containers to concatenate

Returns

concatenated data containers

Return type

DataContainer

_abc_impl = <_abc_data object>
class neuraxle.steps.data.ZipBatchDataContainer(sub_data_container_names=None)[source]

Bases: neuraxle.base.ForceHandleOnlyMixin, neuraxle.base.BaseTransformer

WARNING: Unexpected behaviour from this class. It is not up to date.

Concatenate outer batch of sub data containers along axis=0.

Code example:

data_container = DataContainer(data_inputs=data_inputs_3d, expected_outputs=expected_outputs_3d)
data_container.add_sub_data_container(name='1d_data_source', data_container=data_container_1d)
data_container.add_sub_data_container(name='2d_data_source', data_container=data_container_2d)

# data container with sub data containers :
# DataContainer(data_inputs=data_inputs_3d, expected_outputs=expected_outputs_3d, sub_data_containers=[('1d_data_source', data_container_1d), ('2d_data_source', data_container_2d)])

p = Pipeline([
    ZipBatchDataContainer()
    # is equivalent to ZipBatchDataContainer(sub_data_container_names=['2d_data_source'])
])

data_container = p.handle_transform(data_container, ExecutionContext())

# new_shape: (batch_size, ((time_steps, n_features_3d), n_features_2d))
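The nested shape in the comment above can be read as "one tuple per batch item, pairing the 3D sample with the 2D sample". Given the warning on this class, the following is only a conceptual sketch with plain Python lists and the built-in zip. It assumes that only the 2D source is zipped, as the comment in the example suggests, and it does not reflect the class's actual behaviour.

```python
# Toy shapes: batch_size=2, time_steps=3, n_features_3d=2, n_features_2d=3.
data_inputs_3d = [
    [[1, 2], [3, 4], [5, 6]],
    [[7, 8], [9, 10], [11, 12]],
]
data_inputs_2d = [[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]]

# Pair the i-th 3D sample with the i-th 2D sample along the batch dimension.
zipped = list(zip(data_inputs_3d, data_inputs_2d))

# Each batch item is now a (3D sample, 2D sample) tuple.
first_sample_3d, first_sample_2d = zipped[0]
```

The batch size is unchanged; only the content of each batch item becomes a pair.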
__init__(sub_data_container_names=None)[source]

Initialize self. See help(type(self)) for accurate signature.

_transform_data_container(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext) → neuraxle.data_container.DataContainer[source]

Merge sub data containers into the current data container.

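Parameters
  • data_container (DataContainer) – data container to transform

  • context (ExecutionContext) – execution context

Returns

data container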

Return type

DataContainer

_batch_zip_sub_data_containers(data_container: neuraxle.data_container.DataContainer)[source]

Zip sub data containers on the batch dimension.

Parameters

data_container (DataContainer) – data container to zip

Returns

data container

Return type

DataContainer

_batch_zip_sub_data_container(data_container, data_container_to_zip) → neuraxle.data_container.DataContainer[source]

Zip sub data container on the batch dimension.

Parameters
  • data_container – data container to zip into

  • data_container_to_zip – sub data container to zip

Returns

concatenated data containers

Return type

DataContainer

_abc_impl = <_abc_data object>