neuraxle.base

Neuraxle’s Base Classes

This is the core of Neuraxle. Most pipeline steps derive (inherit) from these classes, so they are worth knowing.

Classes

BaseHasher

Base class to hash hyperparameters and data input ids together.

BaseSaver

Any saver must inherit from this one.

BaseStep(hyperparams, hyperparams_space, …)

Base class for a pipeline step.

ExecutionContext(root, execution_mode, …)

Execution context object containing all of the pipeline hierarchy steps.

ExecutionMode

An enumeration.

ForceAlwaysHandleMixin

A pipeline step that requires the implementation only of handler methods.

HashlibMd5Hasher

Class to hash hyperparameters and data input ids together using the md5 algorithm from hashlib : https://docs.python.org/3/library/hashlib.html

Identity([savers, name])

A pipeline step that has no effect at all but to return the same data without changes.

JoblibStepSaver

Saver that can save, or load a step with joblib.load, and joblib.dump.

MetaStepMixin(wrapped)

A class to represent a step that wraps another step.

NonFittableMixin

A pipeline step that requires no fitting: fitting just returns self when called to do no action.

NonTransformableMixin

A pipeline step that has no effect at all but to return the same data without changes.

ResumableStepMixin

Mixin to add resumable function to a step, or a class that can be resumed, for example a checkpoint on disk.

TruncableJoblibStepSaver()

Step saver for a TruncableSteps.

TruncableSteps(steps_as_tuple, BaseStep], …)

Step that contains multiple steps.

class neuraxle.base.BaseHasher[source]

Base class to hash hyperparameters and data input ids together. The DataContainer class uses the hashed values for its current ids. BaseStep uses many BaseHasher objects to hash hyperparameters and data input ids together after each transform.

See also

DataContainer

hash(current_ids: List[str], hyperparameters: neuraxle.hyperparams.space.HyperparameterSamples, data_inputs: Iterable) → List[str][source]

Hash DataContainer.current_ids, data inputs, and hyperparameters together.

Parameters
  • current_ids (List[str]) – current hashed ids (can be None if this function has not been called yet)

  • hyperparameters (HyperparameterSamples) – step hyperparameters to hash with current ids

  • data_inputs (Iterable) – data inputs to hash current ids for

Returns

the new hashed current ids

Return type

List[str]

single_hash(current_id: str, hyperparameters: neuraxle.hyperparams.space.HyperparameterSamples) → List[str][source]

Hash summary id, and hyperparameters together.

Parameters
  • current_id – current hashed id

  • hyperparameters (HyperparameterSamples) – step hyperparameters to hash with current ids

Returns

the new hashed current id

Return type

str

class neuraxle.base.BaseSaver[source]

Any saver must inherit from this one. Some savers just save parts of objects, some save it all or what remains. Each BaseStep can potentially have multiple savers to make serialization possible.

See also

save(), load()

can_load(step: neuraxle.base.BaseStep, context: neuraxle.base.ExecutionContext)[source]

Returns true if we can load the given step with the given execution context.


load_step(step: neuraxle.base.BaseStep, context: neuraxle.base.ExecutionContext) → neuraxle.base.BaseStep[source]

Load step with execution context.

Parameters
  • step – step to load

  • context – execution context to load from

Returns

loaded base step

save_step(step: neuraxle.base.BaseStep, context: neuraxle.base.ExecutionContext) → neuraxle.base.BaseStep[source]

Save step with execution context.


class neuraxle.base.BaseStep(hyperparams: neuraxle.hyperparams.space.HyperparameterSamples = None, hyperparams_space: neuraxle.hyperparams.space.HyperparameterSpace = None, name: str = None, savers: List[neuraxle.base.BaseSaver] = None, hashers: List[neuraxle.base.BaseHasher] = None)[source]

Base class for a pipeline step.

Every step must implement fit(), fit_transform(), and transform(), which are documented below.

If a step is not fittable, you can inherit from NonFittableMixin. If a step is not transformable, you can inherit from NonTransformableMixin. A step should only change its state inside fit() or fit_transform().

Example usage :

class MultiplyByN(NonFittableMixin, BaseStep):
    def __init__(self, multiply_by):
        NonFittableMixin.__init__(self)
        BaseStep.__init__(
            self,
            hyperparams=HyperparameterSamples({
                'multiply_by': multiply_by
            })
        )

    def transform(self, data_inputs):
        return data_inputs * self.hyperparams['multiply_by']

Every step can be saved using its savers of type BaseSaver. Some savers just save parts of objects, some save it all or what remains. Most steps hash data inputs with hyperparams after every transformation to update the current ids inside the DataContainer.

Every step has handle methods (handle_fit(), handle_fit_transform(), handle_transform(), and handle_inverse_transform()) that can be overridden to add side effects or change the execution flow based on the execution context and the data container.

Every step has hyperparameters and hyperparameter spaces that can be set before the learning process begins. Hyperparameters can be passed in the constructor, and can also be set by the pipeline that contains all of the steps :

pipeline = Pipeline([
    SomeStep()
])

pipeline.set_hyperparams(HyperparameterSamples({
    'learning_rate': 0.1,
    'SomeStep__learning_rate': 0.05
}))
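The 'SomeStep__learning_rate' key above suggests that a flat key can address a contained step by name. The following is a minimal sketch of that routing idea in plain Python. It is not Neuraxle's implementation: the helper name split_flat_hyperparams and the way the '__' separator is handled are assumptions made for illustration.

```python
from typing import Any, Dict, Tuple


def split_flat_hyperparams(
    flat: Dict[str, Any], separator: str = "__"
) -> Tuple[Dict[str, Any], Dict[str, Dict[str, Any]]]:
    """Split a flat dict into (own params, params grouped by child step name).

    Hypothetical helper: keys without the separator belong to the container
    itself; keys such as 'SomeStep__learning_rate' are routed to 'SomeStep'.
    """
    own: Dict[str, Any] = {}
    children: Dict[str, Dict[str, Any]] = {}
    for key, value in flat.items():
        if separator in key:
            # Split only once so deeper nesting stays in the remainder.
            step_name, remainder = key.split(separator, 1)
            children.setdefault(step_name, {})[remainder] = value
        else:
            own[key] = value
    return own, children


own, children = split_flat_hyperparams({
    "learning_rate": 0.1,
    "SomeStep__learning_rate": 0.05,
})
```

Splitting only on the first separator leaves any deeper prefix in the remainder, so a nested container could apply the same function again.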

Note

All heavy initialization logic should be done inside the setup method (e.g.: things inside GPU), and NOT in the constructor.

apply(method_name: str, *kargs, **kwargs) → neuraxle.base.BaseStep[source]

Apply a method to a step and its children.

Parameters
  • method_name – name of the method that needs to be called on all steps

  • kargs – any additional positional arguments to be passed to the method

  • kwargs – any additional keyword arguments to be passed to the method

Returns

self (not a new step)

Return type

BaseStep
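As a rough illustration of applying a method by name to a step and its children, here is a small sketch with a stand-in class. ToyStep, its children attribute, and the recursion order are assumptions for this example and do not reproduce the library's code.

```python
from typing import List


class ToyStep:
    """Stand-in for a pipeline step; not a Neuraxle class."""

    def __init__(self, name: str, children: List["ToyStep"] = None):
        self.name = name
        self.children = children or []
        self.is_train = True

    def set_train(self, is_train: bool = True) -> "ToyStep":
        self.is_train = is_train
        return self

    def apply(self, method_name: str, *kargs, **kwargs) -> "ToyStep":
        # Call the method on self if it exists, then recurse into children.
        if hasattr(self, method_name):
            getattr(self, method_name)(*kargs, **kwargs)
        for child in self.children:
            child.apply(method_name, *kargs, **kwargs)
        return self  # same object, not a new step


leaf_a, leaf_b = ToyStep("a"), ToyStep("b")
root = ToyStep("root", [leaf_a, ToyStep("nested", [leaf_b])])
returned = root.apply("set_train", is_train=False)
```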

apply_method(method: Callable, *kargs, **kwargs) → neuraxle.base.BaseStep[source]

Apply a method to a step and its children.

Parameters
  • method – method to call with self

  • kargs – any additional positional arguments to be passed to the method

  • kwargs – any additional keyword arguments to be passed to the method

Returns

self (not a new step)

Return type

BaseStep

fit(data_inputs, expected_outputs=None) → neuraxle.base.BaseStep[source]

Fit step with the given data inputs, and expected outputs.

Parameters
  • data_inputs – data inputs

  • expected_outputs – expected outputs to fit on

Returns

fitted self

Return type

BaseStep

fit_transform(data_inputs, expected_outputs=None) -> ('BaseStep', typing.Any)[source]

Fit, and transform step with the given data inputs, and expected outputs.

Parameters
  • data_inputs – data inputs

  • expected_outputs – expected outputs to fit on

Returns

(fitted self, transformed data inputs)

Return type

Tuple[BaseStep, Any]
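The return convention of fit(), transform(), and fit_transform() can be sketched with a toy step that does not depend on Neuraxle. MeanCenterer is a hypothetical class; only the shape of the return values follows the signatures documented above.

```python
from typing import Any, List, Tuple


class MeanCenterer:
    """Toy step (not a Neuraxle class) that learns a mean in fit()."""

    def __init__(self):
        self.mean = 0.0

    def fit(self, data_inputs: List[float], expected_outputs=None) -> "MeanCenterer":
        # State changes only here, as recommended for steps.
        self.mean = sum(data_inputs) / len(data_inputs)
        return self

    def transform(self, data_inputs: List[float]) -> List[float]:
        return [x - self.mean for x in data_inputs]

    def fit_transform(self, data_inputs, expected_outputs=None) -> Tuple["MeanCenterer", Any]:
        # Fit first, then transform the same data; return both.
        new_self = self.fit(data_inputs, expected_outputs)
        return new_self, new_self.transform(data_inputs)


step, outputs = MeanCenterer().fit_transform([1.0, 2.0, 3.0])
```

Unpacking the tuple keeps the fitted step and the transformed data available to the caller at the same time.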

get_hyperparams() → neuraxle.hyperparams.space.HyperparameterSamples[source]

Get step hyperparameters as neuraxle.hyperparams.space.HyperparameterSamples.

Returns

step hyperparameters

Return type

HyperparameterSamples

get_hyperparams_space() → neuraxle.hyperparams.space.HyperparameterSpace[source]

Get step hyperparameters space.

Example :

step.get_hyperparams_space()

Returns

step hyperparams space

Return type

HyperparameterSpace

get_name() → str[source]

Get the name of the pipeline step.

Returns

the name, a string.

Return type

str

Note

A step name is the same value as the one in the keys of Pipeline.steps_as_tuple

get_params() → dict[source]

Get step hyperparameters as a flat primitive dict.

Example :

s.set_params(learning_rate=0.1)
hyperparams = s.get_params()
assert hyperparams == {"learning_rate": 0.1}

Returns

hyperparameters

Return type

dict

get_savers() → List[neuraxle.base.BaseSaver][source]

Get the step savers of a pipeline step.

Returns

step savers

Return type

List[BaseSaver]

See also

BaseSaver

handle_fit(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext) → neuraxle.base.BaseStep[source]

Override this to add side effects or change the execution flow before (or after) calling fit(). The default behavior is to rehash current ids with the step hyperparameters.

Parameters
  • data_container – the data container to transform

  • context – execution context

Returns

tuple(fitted pipeline, data_container)

See also

DataContainer, neuraxle.pipeline.Pipeline

handle_fit_transform(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext) -> ('BaseStep', <class 'neuraxle.data_container.DataContainer'>)[source]

Override this to add side effects or change the execution flow before (or after) calling fit_transform(). The default behavior is to rehash current ids with the step hyperparameters.

Parameters
  • data_container – the data container to transform

  • context – execution context

Returns

tuple(fitted pipeline, data_container)

handle_inverse_transform(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext) → neuraxle.data_container.DataContainer[source]

Override this to add side effects or change the execution flow before (or after) calling inverse_transform(). The default behavior is to rehash current ids with the step hyperparameters.

Parameters
  • data_container – the data container to inverse transform

  • context – execution context

Returns

data_container

See also

DataContainer, neuraxle.pipeline.Pipeline

handle_transform(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext) → neuraxle.data_container.DataContainer[source]

Override this to add side effects or change the execution flow before (or after) calling transform(). The default behavior is to rehash current ids with the step hyperparameters.

Parameters
  • data_container – the data container to transform

  • context – execution context

Returns

transformed data container
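To make the idea of a handler override more concrete, the sketch below wraps transform() with a side effect before and after the call. ToyDataContainer and LoggingDoubler are hypothetical stand-ins, and the default rehashing of current ids is deliberately left out.

```python
from dataclasses import dataclass
from typing import Any, List


@dataclass
class ToyDataContainer:
    """Simplified stand-in for a data container (assumed fields)."""
    current_ids: List[str]
    data_inputs: Any


class LoggingDoubler:
    """Toy step whose handler adds a side effect around transform()."""

    def __init__(self):
        self.log: List[str] = []

    def transform(self, data_inputs):
        return [2 * x for x in data_inputs]

    def handle_transform(self, data_container: ToyDataContainer, context=None) -> ToyDataContainer:
        self.log.append("before transform")  # side effect before
        outputs = self.transform(data_container.data_inputs)
        self.log.append("after transform")  # side effect after
        # Ids are passed through untouched in this sketch (no rehashing).
        return ToyDataContainer(data_container.current_ids, outputs)


step = LoggingDoubler()
result = step.handle_transform(ToyDataContainer(["0", "1"], [1, 2]))
```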

hash(data_container: neuraxle.data_container.DataContainer) → List[str][source]

Hash data inputs, current ids, and hyperparameters together using self.hashers. This is used to create unique ids for the data checkpoints.

Parameters

data_container (DataContainer) – data container

Returns

hashed current ids

Return type

List[str]

hash_data_container(data_container)[source]

Hash data container using self.hashers.

  1. Hash current ids with hyperparams.

  2. Hash summary id with hyperparams.

Parameters

data_container (DataContainer) – the data container to transform

Returns

transformed data container

Return type

DataContainer

inverse_transform(processed_outputs)[source]

Inverse Transform the given transformed data inputs.

mutate() or reverse() can be called to change the default transform behavior :

p = Pipeline([MultiplyBy()])
_in = np.array([1, 2])
_out = p.transform(_in)
_regenerated_in = reversed(p).transform(_out)
assert np.array_equal(_regenerated_in, _in)

Parameters

processed_outputs – processed data inputs

Returns

inverse transformed processed outputs

Return type

Any

load(context: neuraxle.base.ExecutionContext) → neuraxle.base.BaseStep[source]

Load step using the execution context to create the directory of the saved step.

Parameters

context – execution context to load step from


Warning

Please do not override this method because on loading it is an identity step that will load whatever step you coded.

meta_fit(X_train, y_train, metastep: neuraxle.base.MetaStepMixin)[source]

Uses a meta optimization technique (AutoML) to find the best hyperparameters in the given hyperparameter space.

Usage: p = p.meta_fit(X_train, y_train, metastep=RandomSearch(n_iter=10, scoring_function=r2_score, higher_score_is_better=True))

Call .mutate(new_method="inverse_transform", method_to_assign_to="transform"), and the current estimator will become

Parameters
  • X_train – data_inputs.

  • y_train – expected_outputs.

  • metastep – a metastep, that is, a step that can sift through the hyperparameter space of another estimator.

Returns

your best self.

mutate(new_method='inverse_transform', method_to_assign_to='transform', warn=True) → neuraxle.base.BaseStep[source]

Replace the “method_to_assign_to” method by the “new_method” method, IF the present object has no pending calls to .will_mutate_to() waiting to be applied. If there is a pending call, the pending call will override the methods specified in the present call. If the change fails (such as if the new_method doesn’t exist), then a warning is printed (optional). By default, there is no pending will_mutate_to call.

This could for example be useful within a pipeline to apply inverse_transform to every pipeline step, to assign predict_probas to predict, or to assign “inverse_transform” to “transform” in a reversed pipeline.

Parameters
  • new_method – the method to replace transform with, if there is no pending will_mutate_to call.

  • method_to_assign_to – the method to which the new method will be assigned to, if there is no pending will_mutate_to call.

  • warn – (verbose) whether or not to warn if the method does not exist.

Returns

self, a copy of self, or even perhaps a new or different BaseStep object.
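The method replacement described above can be sketched in a few lines of plain Python. This is only an illustration under simplifying assumptions: ToyScaler is hypothetical, pending will_mutate_to calls are ignored, and the object is changed in place rather than copied.

```python
import warnings


class ToyScaler:
    """Toy step (not a Neuraxle class) used to illustrate method swapping."""

    def transform(self, data_inputs):
        return [x * 10 for x in data_inputs]

    def inverse_transform(self, processed_outputs):
        return [x / 10 for x in processed_outputs]

    def mutate(self, new_method="inverse_transform", method_to_assign_to="transform", warn=True):
        # Look up the replacement; warn instead of failing if it is missing.
        replacement = getattr(self, new_method, None)
        if replacement is None:
            if warn:
                warnings.warn(f"No method named {new_method!r}; nothing changed.")
            return self
        # Bind the replacement under the target name on this instance only.
        setattr(self, method_to_assign_to, replacement)
        return self


scaler = ToyScaler()
forward = scaler.transform([1, 2])
reversed_scaler = scaler.mutate()
backward = reversed_scaler.transform(forward)
```

Because the replacement is set on the instance, other ToyScaler objects keep their original transform method.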

predict(data_input)[source]

Predict the expected output for the given data input using the transform method. This is simply a shorthand method that does the same thing as transform().

Parameters

data_input – data input to predict

Returns

prediction

Return type

Any

reverse() → neuraxle.base.BaseStep[source]

The object will mutate itself such that the .transform method (and that of all its underlying objects, if applicable) is replaced by the .inverse_transform method.

Note: the reverse may fail if there is a pending mutate that was set earlier with .will_mutate_to.

Returns

a copy of self, reversed. Each contained object will also have been reversed if self is a pipeline.

See also

__reversed__(), inverse_transform()

save(context: neuraxle.base.ExecutionContext) → neuraxle.base.BaseStep[source]

Save step using the execution context to create the directory to save the step into. The saving happens by looping through all of the step savers in the reversed order.

Some savers just save parts of objects, some save it all or what remains. The ExecutionContext.stripped_saver has to be called last because it needs a stripped version of the step.

Parameters

context (ExecutionContext) – context to save from

Returns

self

Return type

BaseStep

set_hyperparams(hyperparams: neuraxle.hyperparams.space.HyperparameterSamples) → neuraxle.base.BaseStep[source]

Set the step hyperparameters.

Example :

step.set_hyperparams(HyperparameterSamples({
    'learning_rate': 0.10
}))

Parameters

hyperparams – hyperparameters

Returns

self

Return type

BaseStep

set_hyperparams_space(hyperparams_space: neuraxle.hyperparams.space.HyperparameterSpace) → neuraxle.base.BaseStep[source]

Set step hyperparameters space.

Example :

step.set_hyperparams_space(HyperparameterSpace({
    'hp': RandInt(0, 10)
}))

Parameters

hyperparams_space (HyperparameterSpace) – hyperparameters space

Returns

self

Return type

BaseStep

set_name(name: str)[source]

Set the name of the pipeline step.

Parameters

name (str) – a string.

Returns

self

Note

A step name is the same value as the one in the keys of Pipeline.steps_as_tuple

set_params(**params) → neuraxle.base.BaseStep[source]

Set step hyperparameters with a dictionary.

Example :

s.set_params(learning_rate=0.1)
hyperparams = s.get_params()
assert hyperparams == {"learning_rate": 0.1}

Parameters

**params

arbitrary number of arguments for hyperparameters

Return type

BaseStep

set_savers(savers: List[neuraxle.base.BaseSaver]) → neuraxle.base.BaseStep[source]

Set the step savers of a pipeline step.

Returns

self

Return type

BaseStep

See also

BaseSaver

set_train(is_train: bool = True)[source]

This method overrides the method of BaseStep to also consider the wrapped step as well as self. Set pipeline step mode to train or test.

Parameters

is_train (bool) – is training mode or not


setup() → neuraxle.base.BaseStep[source]

Initialize the step before it runs. Heavy things (e.g.: things inside GPU) should only be created from here, and NOT in the constructor.

The setup method is called for each step before any fit, or fit_transform.

Returns

self

Return type

BaseStep

should_save() → bool[source]

Returns true if the step should be saved. If the step has been initialized and invalidated, then it must be saved.

A step is invalidated when any of the following things happen :
  • a mutation has been performed on the step: mutate()

  • a hyperparameter has changed: set_hyperparams()

  • a hyperparameter space has changed: set_hyperparams_space()

  • a call to the fit method: handle_fit()

  • a call to the fit_transform method: handle_fit_transform()

  • the step name has changed: set_name()

Returns

if the step should be saved

Return type

bool
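One way to picture the rule above is a pair of flags, one for initialization and one for invalidation. The sketch below is a simplified assumption: TrackedStep, its attribute names, and mark_saved() are hypothetical and only cover the hyperparameter case from the list.

```python
class TrackedStep:
    """Toy step (not a Neuraxle class) that tracks whether it needs saving."""

    def __init__(self, name: str):
        self.name = name
        self.hyperparams = {}
        self.is_initialized = False
        self.is_invalidated = True

    def setup(self) -> "TrackedStep":
        self.is_initialized = True
        return self

    def set_hyperparams(self, hyperparams: dict) -> "TrackedStep":
        self.hyperparams = dict(hyperparams)
        self.is_invalidated = True  # state changed since the last save
        return self

    def mark_saved(self) -> "TrackedStep":
        # Hypothetical hook a saver would call once the step is on disk.
        self.is_invalidated = False
        return self

    def should_save(self) -> bool:
        return self.is_initialized and self.is_invalidated


step = TrackedStep("some_step").setup().mark_saved()
clean = step.should_save()
dirty = step.set_hyperparams({"learning_rate": 0.1}).should_save()
```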

summary_hash(data_container: neuraxle.data_container.DataContainer) → str[source]

Hash data inputs, current ids, and hyperparameters together using self.hashers. This is used to create unique ids for the data checkpoints.

Parameters

data_container (DataContainer) – data container

Returns

hashed summary id

Return type

str

teardown() → neuraxle.base.BaseStep[source]

Teardown step after program execution. Inverse of setup, and it should clear memory. Override this method if you need to clear memory.

Returns

self

Return type

BaseStep

tosklearn()[source]
transform(data_inputs)[source]

Transform given data inputs.

Parameters

data_inputs – data inputs

Returns

transformed data inputs

Return type

Any

update_hyperparams(hyperparams: neuraxle.hyperparams.space.HyperparameterSamples) → neuraxle.base.BaseStep[source]

Update the step hyperparameters without removing the already-set hyperparameters. This can be useful to add more hyperparameters to the existing ones without flushing the ones that were already set.

Example :

step.set_hyperparams(HyperparameterSamples({
    'learning_rate': 0.10,
    'weight_decay': 0.001
}))

step.update_hyperparams(HyperparameterSamples({
    'learning_rate': 0.01
}))

assert step.get_hyperparams()['learning_rate'] == 0.01
assert step.get_hyperparams()['weight_decay'] == 0.001

Parameters

hyperparams (HyperparameterSamples) – hyperparameters

Returns

self

Return type

BaseStep

will_mutate_to(new_base_step: Optional[neuraxle.base.BaseStep] = None, new_method: str = None, method_to_assign_to: str = None) → neuraxle.base.BaseStep[source]

This will change the behavior of self.mutate(<...>) such that when mutating, it will return the presently provided new_base_step BaseStep (can be left to None for self). After changing the object to new_base_step, the .mutate method will also apply the new_method and the method_to_assign_to, if they are not None.

This can be useful if your pipeline requires unsupervised pretraining. For example:

X_pretrain = ...
X_train = ...
y_train = ...

p = Pipeline([
    SomePreprocessing(),
    SomePretrainingStep().will_mutate_to(new_base_step=SomeStepThatWillUseThePretrainingStep),
    Identity().will_mutate_to(new_base_step=ClassifierThatWillBeUsedOnlyAfterThePretraining)
])
# Pre-train the pipeline
p = p.fit(X_pretrain, None)

# This will leave `SomePreprocessing()` untouched and will affect the two other steps.
p = p.mutate(new_method="transform", method_to_assign_to="transform")

# Fit the classifier and other new things
p = p.fit(X_train, y_train)

Parameters
  • new_base_step (BaseStep) – if it is not None, upon calling mutate, the object it will mutate to will be this provided new_base_step.

  • method_to_assign_to (str) – if it is not None, upon calling mutate, the method_to_assign_to will be the one that is used on the provided new_base_step.

  • new_method (str) – if it is not None, upon calling mutate, the new_method will be the one that is used on the provided new_base_step.

Returns

self

Return type

BaseStep
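The deferred replacement described above can be sketched as follows. This is a reduced illustration, not the library's logic: ToyStep is hypothetical, and the new_method and method_to_assign_to arguments are omitted so that only the pending step swap is shown.

```python
class ToyStep:
    """Toy step (not a Neuraxle class) that can schedule its own replacement."""

    def __init__(self, name: str):
        self.name = name
        self.pending_step = None

    def will_mutate_to(self, new_base_step: "ToyStep" = None) -> "ToyStep":
        # Only record the replacement; nothing changes until mutate() is called.
        self.pending_step = new_base_step
        return self

    def mutate(self) -> "ToyStep":
        # A pending replacement wins; otherwise the step stays itself.
        return self.pending_step if self.pending_step is not None else self


steps = [
    ToyStep("preprocessing"),
    ToyStep("pretraining").will_mutate_to(ToyStep("uses_pretraining")),
    ToyStep("identity").will_mutate_to(ToyStep("classifier")),
]
names_before = [s.name for s in steps]
names_after = [s.mutate().name for s in steps]
```

The first step has no pending replacement, so mutating it returns the same object, which mirrors the untouched preprocessing step in the example above.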

class neuraxle.base.ExecutionContext(root: str = '/home/gui/Documents/GIT/www.neuraxle.org-builder/docs/cache', execution_mode: neuraxle.base.ExecutionMode = None, stripped_saver: neuraxle.base.BaseSaver = None, parents=None)[source]

Execution context object containing all of the pipeline hierarchy steps. First item in execution context parents is root, second is nested, and so on. This is like a stack.

The execution context is used for fitted step saving, and caching.

See also

BaseStep, ValueCachingWrapper

copy()[source]
empty()[source]

Return True if the context has no parent steps.

Returns

if parents len is 0

Return type

bool

get_execution_mode() → neuraxle.base.ExecutionMode[source]
get_names()[source]

Returns a list of the parent names.

Returns

list of parents step names

Return type

List[str]

get_path()[source]

Creates the directory path for the current execution context.

Returns

current context path

Return type

str

mkdir()[source]

Creates the directory to save the last parent step.


peek() → neuraxle.base.BaseStep[source]

Get last parent.

Returns

the last parent base step

Return type

BaseStep

pop() → bool[source]

Pop the context. Returns True if it successfully popped an item from the parents list.

Returns

if an item has been popped

Return type

bool

pop_item() → neuraxle.base.BaseStep[source]

Change the execution context to be the same as the latest parent context.


push(step: neuraxle.base.BaseStep) → neuraxle.base.ExecutionContext[source]

Pushes a step in the parents of the execution context.

Parameters

step (BaseStep) – step to add to the execution context

Returns

self

Return type

ExecutionContext

save_all_unsaved()[source]

Save all unsaved steps in the parents of the execution context using save(). This method is called from a step checkpointer inside a Checkpoint.


should_save_last_step() → bool[source]

Returns True if the last step should be saved.

Returns

if the last step should be saved

Return type

bool
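The stack-like behaviour of the execution context can be pictured with a small stand-in class. ToyContext is hypothetical, it stores step names rather than steps, and the directory layout (root followed by one folder per parent) is an assumption made for this sketch.

```python
import os
from typing import List


class ToyContext:
    """Simplified stand-in for an execution context (not the Neuraxle class)."""

    def __init__(self, root: str, parents: List[str] = None):
        self.root = root
        self.parents = list(parents or [])  # step names, root step first

    def push(self, step_name: str) -> "ToyContext":
        self.parents.append(step_name)
        return self

    def pop(self) -> bool:
        # Report whether something was actually removed from the stack.
        if self.parents:
            self.parents.pop()
            return True
        return False

    def empty(self) -> bool:
        return len(self.parents) == 0

    def get_names(self) -> List[str]:
        return list(self.parents)

    def get_path(self) -> str:
        # Assumed layout: root directory followed by one folder per parent.
        return os.path.join(self.root, *self.parents)


context = ToyContext("cache").push("Pipeline").push("SomeStep")
names = context.get_names()
path = context.get_path()
```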

class neuraxle.base.ExecutionMode[source]

An enumeration.

FIT = 'fit'[source]
FIT_OR_FIT_TRANSFORM = 'fit_or_fit_transform'[source]
FIT_OR_FIT_TRANSFORM_OR_TRANSFORM = 'fit_or_fit_transform_or_transform'[source]
FIT_TRANSFORM = 'fit_transform'[source]
TRANSFORM = 'transform'[source]
class neuraxle.base.ForceAlwaysHandleMixin[source]

A pipeline step that requires the implementation only of handler methods :

  • handle_transform

  • handle_fit_transform

  • handle_fit

See also

BaseStep

fit(data_inputs, expected_outputs=None) → neuraxle.base.ForceAlwaysHandleMixin[source]
fit_transform(data_inputs, expected_outputs=None) → neuraxle.base.ForceAlwaysHandleMixin[source]
handle_fit(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext)[source]
handle_fit_transform(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext)[source]
handle_transform(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext)[source]
transform(data_inputs) → neuraxle.base.ForceAlwaysHandleMixin[source]
class neuraxle.base.HashlibMd5Hasher[source]

Class to hash hyperparameters and data input ids together using the md5 algorithm from hashlib : https://docs.python.org/3/library/hashlib.html

The DataContainer class uses the hashed values for its current ids. BaseStep uses many BaseHasher objects to hash hyperparameters, and data inputs ids together after each transform.

See also

BaseHasher, DataContainer

hash(current_ids, hyperparameters, data_inputs: Any = None) → List[str][source]

Hash DataContainer.current_ids, data inputs, and hyperparameters together using hashlib.md5

Parameters
  • current_ids (List[str]) – current hashed ids (can be None if this function has not been called yet)

  • hyperparameters (HyperparameterSamples) – step hyperparameters to hash with current ids

  • data_inputs (Iterable) – data inputs to hash current ids for

Returns

the new hashed current ids

Return type

List[str]

single_hash(current_id: str, hyperparameters: neuraxle.hyperparams.space.HyperparameterSamples) → List[str][source]

Hash summary id, and hyperparameters together.

Parameters
  • current_id – current hashed id

  • hyperparameters (HyperparameterSamples) – step hyperparameters to hash with current ids

Returns

the new hashed current id

Return type

str
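As a rough sketch of how ids and hyperparameters might be combined with hashlib.md5, consider the functions below. They are assumptions for illustration: the function names, the string encoding of the hyperparameters, and the requirement that current ids already exist are not taken from the library.

```python
import hashlib
from typing import Dict, List


def md5_single_hash(current_id: str, hyperparameters: Dict[str, object]) -> str:
    """Combine one id with the hyperparameters into a new md5 hex digest."""
    m = hashlib.md5()
    m.update(str(current_id).encode("utf-8"))
    # Sort the items so that the digest does not depend on dict ordering.
    m.update(str(sorted(hyperparameters.items())).encode("utf-8"))
    return m.hexdigest()


def md5_hash(current_ids: List[str], hyperparameters: Dict[str, object]) -> List[str]:
    """Rehash every current id with the same hyperparameters."""
    return [md5_single_hash(i, hyperparameters) for i in current_ids]


ids = md5_hash(["0", "1"], {"multiply_by": 2})
ids_other = md5_hash(["0", "1"], {"multiply_by": 3})
```

Changing a hyperparameter changes every resulting id, which is what allows hashed ids to distinguish data produced under different settings.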

class neuraxle.base.Identity(savers=None, name=None)[source]

A pipeline step that has no effect at all but to return the same data without changes.

This can be useful to concatenate new features to existing features, such as what AddFeatures does.

Identity inherits from NonTransformableMixin and from NonFittableMixin which makes it a class that has no effect in the pipeline: it doesn’t require fitting, and at transform-time, it returns the same data it received.

class neuraxle.base.JoblibStepSaver[source]

Saver that can save, or load a step with joblib.load, and joblib.dump.

This saver is a good default saver when the object is already stripped out of things that would make it unserializable.

It is the default stripped_saver for the ExecutionContext. The stripped saver is the first to load the step, and the last to save the step. The saver receives a stripped version of the step so that it can be saved by joblib.

can_load(step: neuraxle.base.BaseStep, context: neuraxle.base.ExecutionContext) → bool[source]

Returns true if the given step has been saved with the given execution context.

Returns

if we can load the step with the given context

Return type

bool

load_step(step: neuraxle.base.BaseStep, context: neuraxle.base.ExecutionContext) → neuraxle.base.BaseStep[source]

Load stripped step.


save_step(step: neuraxle.base.BaseStep, context: neuraxle.base.ExecutionContext) → neuraxle.base.BaseStep[source]

Saved step stripped out of things that would make it unserializable.


class neuraxle.base.MetaStepMixin(wrapped: neuraxle.base.BaseStep = None)[source]

A class to represent a step that wraps another step. It can be used for many things.

For example, ForEachDataInputs adds a loop before any calls to the wrapped step :

class ForEachDataInputs(MetaStepMixin, BaseStep):
    def __init__(
        self,
        wrapped: BaseStep
    ):
        BaseStep.__init__(self)
        MetaStepMixin.__init__(self, wrapped)

    def fit(self, data_inputs, expected_outputs=None):
        if expected_outputs is None:
            expected_outputs = [None] * len(data_inputs)

        for di, eo in zip(data_inputs, expected_outputs):
            self.wrapped = self.wrapped.fit(di, eo)

        return self

    def transform(self, data_inputs):
        outputs = []
        for di in data_inputs:
            output = self.wrapped.transform(di)
            outputs.append(output)

        return outputs

    def fit_transform(self, data_inputs, expected_outputs=None):
        if expected_outputs is None:
            expected_outputs = [None] * len(data_inputs)

        outputs = []
        for di, eo in zip(data_inputs, expected_outputs):
            self.wrapped, output = self.wrapped.fit_transform(di, eo)
            outputs.append(output)

        return self, outputs

See also

ForEachDataInputs, MetaSKLearnWrapper, RandomSearch, BaseCrossValidation, ValueCachingWrapper, StepClonerForEachDataInput

apply(method_name: str, *kargs, **kwargs) → neuraxle.base.BaseStep[source]

Apply the method name to the meta step and its wrapped step.

Parameters
  • method_name – name of the method that needs to be called on all steps

  • kargs – any additional positional arguments to be passed to the method

  • kwargs – any additional keyword arguments to be passed to the method

Returns

self (not a new step)

Return type

BaseStep

apply_method(method: Callable, *kargs, **kwargs) → neuraxle.base.BaseStep[source]

Apply method to the meta step and its wrapped step.

Parameters
  • method – method to call with self

  • kargs – any additional positional arguments to be passed to the method

  • kwargs – any additional keyword arguments to be passed to the method

Returns

self (not a new step)

Return type

BaseStep

fit(data_inputs, expected_outputs)[source]
fit_transform(data_inputs, expected_outputs)[source]
get_best_model() → neuraxle.base.BaseStep[source]
get_hyperparams() → neuraxle.hyperparams.space.HyperparameterSamples[source]

Get step hyperparameters as HyperparameterSamples with flattened hyperparams.

Returns

step hyperparameters

Return type

HyperparameterSamples

See also

HyperparameterSamples
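Flattening the hyperparameters of a meta step together with those of its wrapped step could look like the sketch below. The function name and the 'wrapped__' prefix are assumptions, chosen to match the key used in the set_hyperparams example of this class.

```python
from typing import Any, Dict


def flatten_meta_hyperparams(
    own: Dict[str, Any], wrapped: Dict[str, Any], prefix: str = "wrapped__"
) -> Dict[str, Any]:
    """Merge both dicts into one flat dict; wrapped keys get the prefix."""
    flat = dict(own)  # copy so that the caller's dict is left untouched
    for key, value in wrapped.items():
        flat[prefix + key] = value
    return flat


flat = flatten_meta_hyperparams(
    {"learning_rate": 0.10},
    {"learning_rate": 0.05},
)
```

The prefix keeps the two 'learning_rate' entries apart even though both steps use the same hyperparameter name.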

get_hyperparams_space() → neuraxle.hyperparams.space.HyperparameterSpace[source]

Get meta step and wrapped step hyperparams as a flat hyperparameter space

Returns

hyperparameters_space

Return type

HyperparameterSpace

get_step() → neuraxle.base.BaseStep[source]

Get wrapped step

Returns

self.wrapped

Return type

BaseStep

mutate(new_method='inverse_transform', method_to_assign_to='transform', warn=True) → neuraxle.base.BaseStep[source]

Mutate self, and self.wrapped. Please refer to mutate() for more information.

Parameters
  • new_method – the method to replace transform with, if there is no pending will_mutate_to call.

  • method_to_assign_to – the method to which the new method will be assigned to, if there is no pending will_mutate_to call.

  • warn – (verbose) whether or not to warn if the method does not exist.

Returns

self, a copy of self, or even perhaps a new or different BaseStep object.

set_hyperparams(hyperparams: neuraxle.hyperparams.space.HyperparameterSamples) → neuraxle.base.BaseStep[source]

Set step hyperparameters, and wrapped step hyperparams with the given hyperparams.

Example :

step.set_hyperparams(HyperparameterSamples({
    'learning_rate': 0.10,
    'wrapped__learning_rate': 0.10  # this will set the wrapped step 'learning_rate' hyperparam
}))

Parameters

hyperparams (HyperparameterSamples) – hyperparameters

Returns

self

Return type

BaseStep

See also

HyperparameterSamples

set_hyperparams_space(hyperparams_space: neuraxle.hyperparams.space.HyperparameterSpace) → neuraxle.base.BaseStep[source]

Set meta step and wrapped step hyperparams space using the given hyperparams space.

Parameters

hyperparams_space (HyperparameterSpace) – ordered dict containing all hyperparameter spaces

Returns

self

set_step(step: neuraxle.base.BaseStep) → neuraxle.base.BaseStep[source]

Set wrapped step to the given step.

Parameters

step (BaseStep) – new wrapped step

Returns

self

Return type

BaseStep

set_train(is_train: bool = True)[source]

Set pipeline step mode to train or test. Also set wrapped step mode to train or test.

For instance, you can add a simple if statement to direct to the right implementation:

def transform(self, data_inputs):
    if self.is_train:
        return self.transform_train_(data_inputs)
    else:
        return self.transform_test_(data_inputs)

def fit_transform(self, data_inputs, expected_outputs):
    if self.is_train:
        return self.fit_transform_train_(data_inputs, expected_outputs)
    else:
        return self.fit_transform_test_(data_inputs, expected_outputs)

Parameters

is_train – bool


setup() → neuraxle.base.BaseStep[source]

Initialize step before it runs. Also initialize the wrapped step.

Returns

self

Return type

BaseStep

should_resume(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext)[source]
transform(data_inputs)[source]
update_hyperparams(hyperparams: neuraxle.hyperparams.space.HyperparameterSamples) → neuraxle.base.BaseStep[source]

Update the step, and the wrapped step hyperparams without removing the already set hyperparameters. Please refer to update_hyperparams().

Parameters

hyperparams (HyperparameterSamples) – hyperparameters

Returns

self

Return type

BaseStep

See also

update_hyperparams(), HyperparameterSamples

will_mutate_to(new_base_step: Optional[neuraxle.base.BaseStep] = None, new_method: str = None, method_to_assign_to: str = None) → neuraxle.base.BaseStep[source]

Add a pending mutation to self and to self.wrapped. Please refer to BaseStep.will_mutate_to() for more information.

Parameters
  • new_base_step (BaseStep) – if it is not None, upon calling mutate, the object it will mutate to will be this provided new_base_step.

  • method_to_assign_to (str) – if it is not None, upon calling mutate, this is the method to which new_method will be assigned on the provided new_base_step.

  • new_method (str) – if it is not None, upon calling mutate, the new_method will be the one that is used on the provided new_base_step.

Returns

self

Return type

BaseStep

class neuraxle.base.NonFittableMixin[source]

A pipeline step that requires no fitting: calling fit does nothing and just returns self. Note: subclasses do not need to implement the fit methods.

fit(data_inputs, expected_outputs=None) → neuraxle.base.NonFittableMixin[source]

Don’t fit.

Parameters
  • data_inputs – the data that would normally be fitted on.

  • expected_outputs – the expected outputs that would normally be fitted on.

Returns

self
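
The no-op fit behaviour described above can be sketched with a plain mixin. This is an illustration under assumptions, not Neuraxle's implementation; NoFitMixin and MultiplyByTwo are hypothetical names:

```python
class NoFitMixin:
    """Hypothetical stand-in for a mixin whose fit does nothing."""

    def fit(self, data_inputs, expected_outputs=None):
        # Nothing is learned; return self so calls can be chained.
        return self


class MultiplyByTwo(NoFitMixin):
    def transform(self, data_inputs):
        return [2 * x for x in data_inputs]


step = MultiplyByTwo()
fitted = step.fit([1, 2, 3])
outputs = fitted.transform([1, 2, 3])
print(outputs)  # [2, 4, 6]
```

Because fit returns self, a stateless step can still be used anywhere a fitted step is expected.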

class neuraxle.base.NonTransformableMixin[source]

A pipeline step that has no effect at all but to return the same data without changes. Transform method is automatically implemented as changing nothing.

Example :

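from neuraxle.base import BaseStep, NonTransformableMixin

class PrintOnFit(NonTransformableMixin, BaseStep):
    def __init__(self):
        BaseStep.__init__(self)

    def fit(self, data_inputs, expected_outputs=None) -> 'PrintOnFit':
        # Print what is seen at fit time; transform is inherited and returns the data unchanged.
        print((data_inputs, expected_outputs))
        return self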

Note

The fit methods are not implemented by this mixin.

inverse_transform(processed_outputs)[source]

Do nothing - return the same data.

Parameters

processed_outputs – the data to process

Returns

the processed_outputs, unchanged.

transform(data_inputs)[source]

Do nothing - return the same data.

Parameters

data_inputs – the data to process

Returns

the data_inputs, unchanged.

class neuraxle.base.ResumableStepMixin[source]

Mixin to add resumable function to a step, or a class that can be resumed, for example a checkpoint on disk.

should_resume(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext) → bool[source]

Returns True if a step can be resumed with the given data container and execution context. See the Checkpoint class documentation for more details on how a resumable checkpoint works.

Parameters
  • data_container – data container to resume from

  • context – execution context to resume from

Returns

True if the step can be resumed

Return type

bool
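
One way to picture a resumable checkpoint on disk is sketched below. It is a simplified illustration: the real method receives a DataContainer and an ExecutionContext, whereas this hypothetical FileCheckpoint only receives a list of ids and assumes one file per id:

```python
import os
import tempfile


class FileCheckpoint:
    """Hypothetical checkpoint: resumable when a file exists for every id."""

    def __init__(self, folder):
        self.folder = folder

    def _path(self, current_id):
        return os.path.join(self.folder, '{}.ckpt'.format(current_id))

    def save(self, current_ids):
        for current_id in current_ids:
            with open(self._path(current_id), 'w') as f:
                f.write('saved')

    def should_resume(self, current_ids):
        # Resume only if every requested id has already been saved.
        return all(os.path.exists(self._path(i)) for i in current_ids)


with tempfile.TemporaryDirectory() as folder:
    checkpoint = FileCheckpoint(folder)
    before = checkpoint.should_resume(['a', 'b'])
    checkpoint.save(['a', 'b'])
    after = checkpoint.should_resume(['a', 'b'])
    partial = checkpoint.should_resume(['a', 'c'])

print(before, after, partial)  # False True False
```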

class neuraxle.base.TruncableJoblibStepSaver[source]

Step saver for a TruncableSteps. TruncableJoblibStepSaver saves, and loads all of the sub steps using their savers.

load_step(step: neuraxle.base.TruncableSteps, context: neuraxle.base.ExecutionContext) → neuraxle.base.TruncableSteps[source]
  1. Loop through all of the sub steps savers, and only load the sub steps that have been saved.

  2. Refresh steps

Parameters
  • step (TruncableSteps) – truncable steps to load

  • context (ExecutionContext) – execution context

Returns

loaded truncable steps

Return type

TruncableSteps

save_step(step: neuraxle.base.TruncableSteps, context: neuraxle.base.ExecutionContext)[source]
  1. Loop through all the steps, and save the ones that need to be saved.

  2. Add a new property called sub step savers inside truncable steps to be able to load sub steps when loading.

  3. Strip steps from truncable steps at the end.

Parameters
  • step (TruncableSteps) – truncable steps to save

  • context (ExecutionContext) – execution context

Returns

class neuraxle.base.TruncableSteps(steps_as_tuple: List[Union[Tuple[str, BaseStep], BaseStep]], hyperparams: neuraxle.hyperparams.space.HyperparameterSamples = {}, hyperparams_space: neuraxle.hyperparams.space.HyperparameterSpace = {})[source]

Step that contains multiple steps. Pipeline inherits from this class. It is possible to truncate this step with __getitem__().

  • self.steps contains the actual steps

  • self.steps_as_tuple contains a list of (step name, step) tuples

See also

Pipeline, FeatureUnion
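
Truncation through __getitem__() can be sketched with a small container. This is a minimal sketch, not the library's code: StepList is a hypothetical class, and the exact key types accepted by the real __getitem__() are an assumption here:

```python
from collections import OrderedDict


class StepList:
    """Hypothetical stand-in for a container of named steps."""

    def __init__(self, steps_as_tuple):
        self.steps_as_tuple = list(steps_as_tuple)
        self.steps = OrderedDict(self.steps_as_tuple)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Truncate: return a new container holding only the sliced steps.
            return StepList(self.steps_as_tuple[key])
        if isinstance(key, int):
            return self.steps_as_tuple[key][1]
        return self.steps[key]

    def keys(self):
        return self.steps.keys()


p = StepList([('scale', 'ScaleStep'), ('pca', 'PcaStep'), ('model', 'ModelStep')])
head = p[:2]
print(list(head.keys()))  # ['scale', 'pca']
```

The strings stand in for step objects; slicing returns a new container and leaves the original untouched.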

append(item: Tuple[str, BaseStep]) → neuraxle.base.TruncableSteps[source]

Add an item to steps as tuple.

Parameters

item (Tuple[str, 'BaseStep']) – item tuple (step name, step)

Returns

self

Return type

TruncableSteps

apply(method_name: str, *kargs, **kwargs) → neuraxle.base.BaseStep[source]

Apply the method name to the pipeline step and all of its children.

Parameters
  • method_name – name of the method that needs to be called on all steps

  • kargs – any additional positional arguments to be passed to the method

  • kwargs – any additional keyword arguments to be passed to the method

Returns

self (not a new step)

Return type

BaseStep
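
Applying a method by name to a step and all of its children amounts to a recursive walk. The sketch below is an assumption about the general idea, using a hypothetical Node class rather than Neuraxle's steps:

```python
class Node:
    """Hypothetical step that may contain child steps."""

    def __init__(self, name, children=()):
        self.name = name
        self.children = list(children)
        self.is_setup = False

    def mark_setup(self):
        self.is_setup = True

    def apply(self, method_name, *kargs, **kwargs):
        # Call the method on self if it exists, then recurse into the children.
        if hasattr(self, method_name):
            getattr(self, method_name)(*kargs, **kwargs)
        for child in self.children:
            child.apply(method_name, *kargs, **kwargs)
        return self


leaf_a, leaf_b = Node('a'), Node('b')
inner = Node('inner', [leaf_b])
root = Node('root', [leaf_a, inner])
result = root.apply('mark_setup')
```

As in the documented method, the call returns self rather than a new object.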

apply_method(method: Callable, *kargs, **kwargs) → neuraxle.base.BaseStep[source]

Apply a method to the pipeline step and all of its children.

Parameters
  • method – method to call with self

  • kargs – any additional positional arguments to be passed to the method

  • kwargs – any additional keyword arguments to be passed to the method

Returns

self (not a new step)

Return type

BaseStep

are_steps_before_index_the_same(other: neuraxle.base.TruncableSteps, index: int) → bool[source]

Returns true if self.steps before index are the same as other.steps before index.

Parameters
  • other (TruncableSteps) – other truncable steps to compare

  • index (int) – max step index to compare

Returns

True if the steps before index are the same, False otherwise

Return type

bool

ends_with(step_type: type)[source]

Returns true if truncable steps end with a step of the given type.

Parameters

step_type (type) – step type

Returns

if truncable steps ends with the given step type

Return type

bool

get_hyperparams() → neuraxle.hyperparams.space.HyperparameterSamples[source]

Get step hyperparameters as HyperparameterSamples.

Example :

p = Pipeline([SomeStep()])
p.set_hyperparams(HyperparameterSamples({
    'learning_rate': 0.1,
    'some_step__learning_rate': 0.2 # will set SomeStep() hyperparam 'learning_rate' to 0.2
}))

hp = p.get_hyperparams()
# hp ==>  { 'learning_rate': 0.1, 'some_step__learning_rate': 0.2 }
Returns

step hyperparameters

Return type

HyperparameterSamples

See also

HyperparameterSamples
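
The flat result shown in the example above can be reproduced with plain dicts. This is a hedged sketch of the naming scheme only; flatten_hyperparams is a hypothetical helper, and the step-name prefix is taken from the example:

```python
def flatten_hyperparams(own, children):
    """Merge a step's own hyperparams with its children's, prefixing child keys.

    `children` maps a step name to that step's hyperparameter dict.
    """
    flat = dict(own)
    for step_name, child_hyperparams in children.items():
        for key, value in child_hyperparams.items():
            flat['{}__{}'.format(step_name, key)] = value
    return flat


hp = flatten_hyperparams(
    own={'learning_rate': 0.1},
    children={'some_step': {'learning_rate': 0.2}},
)
print(hp)  # {'learning_rate': 0.1, 'some_step__learning_rate': 0.2}
```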

get_hyperparams_space()[source]

Get step hyperparameters space as HyperparameterSpace.

Example :

p = Pipeline([SomeStep()])
p.set_hyperparams_space(HyperparameterSpace({
    'learning_rate': RandInt(0,5),
    'some_step__learning_rate': RandInt(0, 10) # will set SomeStep() 'learning_rate' hyperparam space to RandInt(0, 10)
}))

hp = p.get_hyperparams_space()
# hp ==>  { 'learning_rate': RandInt(0,5), 'some_step__learning_rate': RandInt(0,10) }
Returns

step hyperparameters space

Return type

HyperparameterSpace

See also

HyperparameterSpace

items() → ItemsView[source]

Returns all of the steps as tuples items (step_name, step).

Returns

step items tuple : (step name, step)

Return type

ItemsView

keys() → KeysView[source]

Returns the step names.

Returns

list of step names

Return type

KeysView

mutate(new_method='inverse_transform', method_to_assign_to='transform', warn=True) → neuraxle.base.BaseStep[source]

Call mutate on every step that the present truncable step contains.

Parameters
  • new_method – the method to replace transform with.

  • method_to_assign_to – the method to which the new method will be assigned to.

  • warn – (verbose) whether or not to warn if the method does not exist.

Returns

self, a copy of self, or even perhaps a new or different BaseStep object.
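
With the default arguments, mutating replaces transform with inverse_transform. The sketch below shows that idea on a hypothetical Scaler class; it rebinds the method in place, whereas the documented method may also return a copy or a different object:

```python
class Scaler:
    """Hypothetical step with a transform and its inverse."""

    def transform(self, x):
        return x * 10

    def inverse_transform(self, x):
        return x / 10


def mutate(step, new_method='inverse_transform', method_to_assign_to='transform'):
    # Rebind the target name on this instance to the replacement method.
    setattr(step, method_to_assign_to, getattr(step, new_method))
    return step


steps = [Scaler(), Scaler()]
before = steps[0].transform(5)
for s in steps:
    mutate(s)
after = steps[0].transform(5)
print(before, after)  # 50 0.5
```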

pop() → neuraxle.base.BaseStep[source]

Pop the last step.

Returns

last step

Return type

BaseStep

popfront() → neuraxle.base.BaseStep[source]

Pop the first step.

Returns

first step

Return type

BaseStep

popfrontitem() → Tuple[str, neuraxle.base.BaseStep][source]

Pop the first step item.

Returns

first step item

Return type

Tuple[str, BaseStep]

popitem(key=None) → Tuple[str, neuraxle.base.BaseStep][source]

Pop the last step, or the step with the given key

Parameters

key (str) – step name to pop, or None

Returns

last step item

Return type

Tuple[str, BaseStep]
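
The pop family above behaves much like the methods of an ordered mapping. As a rough analogy only, assuming the steps are kept in insertion order, the same operations on a standard OrderedDict look like this (the strings stand in for step objects):

```python
from collections import OrderedDict

steps = OrderedDict([('a', 'StepA'), ('b', 'StepB'), ('c', 'StepC'), ('d', 'StepD')])

last_item = steps.popitem()             # like popitem(): last (name, step)
first_item = steps.popitem(last=False)  # like popfrontitem(): first (name, step)
named_item = ('b', steps.pop('b'))      # like popitem(key='b')
remaining = list(steps.keys())

print(last_item, first_item, named_item, remaining)
```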

set_hyperparams(hyperparams: dict) → neuraxle.base.BaseStep[source]

Set step hyperparameters to the given HyperparameterSamples.

Example :

p = Pipeline([SomeStep()])
p.set_hyperparams(HyperparameterSamples({
    'learning_rate': 0.1,
    'some_step__learning_rate': 0.2 # will set SomeStep() hyperparam 'learning_rate' to 0.2
}))
Returns

self

Return type

BaseStep

See also

HyperparameterSamples

set_hyperparams_space(hyperparams_space: dict) → neuraxle.base.BaseStep[source]

Set step hyperparameters space as HyperparameterSpace.

Example :

p = Pipeline([SomeStep()])
p.set_hyperparams_space(HyperparameterSpace({
    'learning_rate': RandInt(0,5),
    'some_step__learning_rate': RandInt(0, 10) # will set SomeStep() 'learning_rate' hyperparam space to RandInt(0, 10)
}))
Parameters

hyperparams_space (Union[HyperparameterSpace, OrderedDict, dict]) – hyperparameters space

Returns

self

Return type

BaseStep

See also

HyperparameterSpace

set_steps(steps_as_tuple: List[Union[Tuple[str, BaseStep], BaseStep]])[source]

Set steps as tuple.

Parameters

steps_as_tuple (NamedTupleList) – list of tuples containing step name and step

Returns

set_train(is_train: bool = True) → neuraxle.base.BaseStep[source]

Set pipeline step mode to train or test.

In the pipeline steps functions, you can add a simple if statement to direct to the right implementation:

def transform(self, data_inputs):
    # Dispatch on the mode set by set_train(), and return the result.
    if self.is_train:
        return self.transform_train_(data_inputs)
    else:
        return self.transform_test_(data_inputs)

def fit_transform(self, data_inputs, expected_outputs):
    if self.is_train:
        return self.fit_transform_train_(data_inputs, expected_outputs)
    else:
        return self.fit_transform_test_(data_inputs, expected_outputs)
Parameters

is_train (bool) – if the step is in train mode (True) or test mode (False)

Returns

self

setup() → neuraxle.base.BaseStep[source]

Initialize step before it runs.

Returns

self

Return type

BaseStep

should_save()[source]

Returns whether the step needs to be saved. Returns True if self or any of its sub steps should be saved.

Returns

split(step_type: type) → List[neuraxle.base.TruncableSteps][source]

Split truncable steps by a step class type.

Parameters

step_type (type) – step class type to split from.

Returns

list of truncable steps containing the split steps
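
Splitting by step type can be sketched on a plain list. Whether the matching step closes a sub-list, opens the next one, or is dropped is not stated here, so this sketch assumes each sub-list ends with the matching step; split_steps, Work and Barrier are hypothetical names:

```python
def split_steps(steps, step_type):
    """Split a list of steps after each instance of step_type (assumed behaviour)."""
    parts, current = [], []
    for step in steps:
        current.append(step)
        if isinstance(step, step_type):
            parts.append(current)
            current = []
    if current:
        # Steps after the last match form a final sub-list.
        parts.append(current)
    return parts


class Work:
    pass


class Barrier:
    pass


steps = [Work(), Barrier(), Work(), Work(), Barrier(), Work()]
parts = split_steps(steps, Barrier)
sizes = [len(part) for part in parts]
print(sizes)  # [2, 3, 1]
```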

teardown() → neuraxle.base.BaseStep[source]

Teardown step after program execution. Tears down all of the sub steps as well.

Returns

self

Return type

BaseStep

update_hyperparams(hyperparams: dict) → neuraxle.base.BaseStep[source]

Update the steps' hyperparameters without removing the already-set hyperparameters. Please refer to BaseStep.update_hyperparams().

Parameters

hyperparams (HyperparameterSamples) – hyperparams to update

Returns

step

Return type

BaseStep

See also

update_hyperparams(), HyperparameterSamples
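
The difference between updating and setting can be shown with plain dicts. This is a sketch under the assumption that setting replaces all previous values while updating keeps them; both functions are hypothetical stand-ins for the step methods:

```python
def set_hyperparams(current, new):
    # Setting replaces everything that was there before (assumed behaviour).
    return dict(new)


def update_hyperparams(current, new):
    # Updating keeps existing keys and overrides only the ones given.
    merged = dict(current)
    merged.update(new)
    return merged


current = {'learning_rate': 0.1, 'hidden_size': 32}
replaced = set_hyperparams(current, {'learning_rate': 0.5})
updated = update_hyperparams(current, {'learning_rate': 0.5})
print(replaced)  # {'learning_rate': 0.5}
print(updated)   # {'learning_rate': 0.5, 'hidden_size': 32}
```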

values() → ValuesView[source]

Get step values.

Returns

all of the steps

Return type

ValuesView