neuraxle.steps.data

Data Steps

Here you can find steps that act on data.

Classes

DataShuffler([seed, …])

Data Shuffling step that shuffles data inputs and expected_outputs at the same time.

EpochRepeater(wrapped, epochs[, …])

Repeat the wrapped step's fit or transform for the number of epochs passed in the constructor.

InnerConcatenateDataContainer([…])

Concatenate inner features of sub data containers along axis=-1.

TrainShuffled(wrapped[, seed])

ZipBatchDataContainer([sub_data_container_names])

Concatenate outer batch of sub data containers along axis=0.

class neuraxle.steps.data.DataShuffler(seed=None, increment_seed_after_each_fit=True)[source]

Data Shuffling step that shuffles data inputs and expected_outputs at the same time.

p = Pipeline([
    TrainOnlyWrapper(DataShuffler(seed=42, increment_seed_after_each_fit=True)),
    EpochRepeater(ForecastingPipeline(), epochs=EPOCHS, repeat_in_test_mode=False)
])

Warning

You probably always want to wrap this step with a TrainOnlyWrapper.

transform(data_inputs)[source]

Shuffle data inputs and expected outputs.

Parameters

data_inputs – (data inputs, expected outputs) tuple to shuffle

Returns

shuffled (data inputs, expected outputs) tuple

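The shuffling semantics can be pictured with a short NumPy sketch (illustrative only; the toy arrays and shapes below are assumptions, not part of the API): the same permutation is applied to the data inputs and to the expected outputs, so each pair stays aligned.

import numpy as np

# Toy arrays for illustration; in the pipeline above the step receives real data.
data_inputs = np.arange(10)
expected_outputs = np.arange(10) * 10

# Shuffle both with the same permutation so pairs stay aligned.
permutation = np.random.RandomState(seed=42).permutation(len(data_inputs))
shuffled_data_inputs = data_inputs[permutation]
shuffled_expected_outputs = expected_outputs[permutation]

# Still true for every i: shuffled_expected_outputs[i] == shuffled_data_inputs[i] * 10
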
class neuraxle.steps.data.EpochRepeater(wrapped, epochs, repeat_in_test_mode=False, cache_folder_when_no_handle=None)[source]

Repeat the wrapped step's fit or transform for the number of epochs passed in the constructor.

p = Pipeline([
    TrainOnlyWrapper(DataShuffler(seed=42, increment_seed_after_each_fit=True)),
    EpochRepeater(ForecastingPipeline(), epochs=EPOCHS, repeat_in_test_mode=False)
])
fit(data_inputs, expected_outputs=None) → neuraxle.base.BaseStep[source]

Fit the wrapped step self.epochs times.

Parameters
  • data_inputs – data inputs to fit on

  • expected_outputs – expected_outputs to fit on

Returns

fitted self

fit_transform(data_inputs, expected_outputs=None) -> ('BaseStep', typing.Iterable)[source]

Fit-transform the wrapped step self.epochs times.

Parameters
  • data_inputs – data inputs to fit on

  • expected_outputs – expected_outputs to fit on

Returns

fitted self, transformed data inputs
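
A minimal usage sketch (ForecastingPipeline, EPOCHS, data_inputs and expected_outputs are placeholders carried over from the example above, not defined here):

p = EpochRepeater(ForecastingPipeline(), epochs=EPOCHS, repeat_in_test_mode=False)
p = p.fit(data_inputs, expected_outputs)  # the wrapped step is fitted EPOCHS times in train mode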

class neuraxle.steps.data.InnerConcatenateDataContainer(sub_data_container_names=None)[source]

Concatenate inner features of sub data containers along axis=-1.

Code example:

data_container = DataContainer(data_inputs=data_inputs_3d, expected_outputs=expected_outputs_3d)
data_container.add_sub_data_container(name='1d_data_source', data_container=data_container_1d)
data_container.add_sub_data_container(name='2d_data_source', data_container=data_container_2d)

# data container with sub data containers:
# DataContainer(data_inputs=data_inputs_3d, expected_outputs=expected_outputs_3d, sub_data_containers=[('1d_data_source', data_container_1d), ('2d_data_source', data_container_2d)])

p = Pipeline([
    InnerConcatenateDataContainer()
    # is equivalent to InnerConcatenateDataContainer(sub_data_container_names=['1d_data_source', '2d_data_source'])
])

data_container = p.handle_transform(data_container, ExecutionContext())

# new_shape: (batch_size, time_steps, n_features + batch_features + 1)
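
The shape comment above can be reproduced with plain NumPy (a sketch with assumed toy shapes; this is not the step's internal implementation): lower-dimensional sources are broadcast across time_steps, then everything is concatenated on the last axis.

import numpy as np

# Assumed toy shapes mirroring the example above:
data_inputs_3d = np.zeros((32, 10, 4))   # (batch_size, time_steps, n_features)
data_inputs_2d = np.zeros((32, 3))       # (batch_size, batch_features)
data_inputs_1d = np.zeros((32,))         # (batch_size,)

# Broadcast the lower-dimensional sources across time_steps ...
tiled_2d = np.repeat(data_inputs_2d[:, None, :], 10, axis=1)     # (32, 10, 3)
tiled_1d = np.repeat(data_inputs_1d[:, None, None], 10, axis=1)  # (32, 10, 1)

# ... then concatenate on the feature axis (axis=-1):
merged = np.concatenate([data_inputs_3d, tiled_2d, tiled_1d], axis=-1)
print(merged.shape)  # (32, 10, 8), i.e. (batch_size, time_steps, n_features + batch_features + 1)
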
class neuraxle.steps.data.TrainShuffled(wrapped, seed=None)[source]
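
No docstring is provided for TrainShuffled. Judging from its signature and the warning on DataShuffler above, it presumably shuffles the data in train mode only before running the wrapped step. A usage sketch under that assumption (ForecastingPipeline is a placeholder step):

# Assumption: roughly equivalent to Pipeline([TrainOnlyWrapper(DataShuffler(seed=42)), ForecastingPipeline()]).
p = TrainShuffled(wrapped=ForecastingPipeline(), seed=42)
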
class neuraxle.steps.data.ZipBatchDataContainer(sub_data_container_names=None)[source]

Concatenate outer batch of sub data containers along axis=0.

Code example:

data_container = DataContainer(data_inputs=data_inputs_3d, expected_outputs=expected_outputs_3d)
data_container.add_sub_data_container(name='1d_data_source', data_container=data_container_1d)
data_container.add_sub_data_container(name='2d_data_source', data_container=data_container_2d)

# data container with sub data containers:
# DataContainer(data_inputs=data_inputs_3d, expected_outputs=expected_outputs_3d, sub_data_containers=[('1d_data_source', data_container_1d), ('2d_data_source', data_container_2d)])

p = Pipeline([
    ZipBatchDataContainer()
    # is equivalent to ZipBatchDataContainer(sub_data_container_names=['2d_data_source'])
])

data_container = p.handle_transform(data_container, ExecutionContext())

# new_shape: (batch_size, ((time_steps, n_features_3d), n_features_2d))
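
The per-sample structure in the shape comment above can be pictured with a plain Python zip over assumed toy arrays (a sketch, not the step's actual container handling):

import numpy as np

# Assumed toy shapes mirroring the comment above:
data_inputs_3d = np.zeros((32, 10, 4))  # (batch_size, time_steps, n_features_3d)
data_inputs_2d = np.zeros((32, 6))      # (batch_size, n_features_2d)

# Zipping on the outer batch (axis=0) pairs each main sample with the matching
# sample from the sub data container, item by item:
zipped = [(sample_3d, sample_2d) for sample_3d, sample_2d in zip(data_inputs_3d, data_inputs_2d)]

# len(zipped) == 32; each item pairs a (time_steps, n_features_3d) array with a (n_features_2d,) array.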