neuraxle.union

Module-level documentation for neuraxle.union.

[Inheritance diagram, including dependencies to other base modules of Neuraxle, not shown.]


Union of Features

This module contains steps to perform various feature unions and model stacking, using parallelism where possible.

Classes

AddFeatures(steps_as_tuple, …)

Parallelize the union of many pipeline steps AND concatenate the new features to the received inputs using Identity.

FeatureUnion(steps_as_tuple, …)

Parallelize the union of many pipeline steps.

ModelStacking(steps_as_tuple, …)

Performs a FeatureUnion of steps, and then sends the joined result to the judge step above.

ZipFeatures([concatenate_inner_features])

This class receives an iterable of DataContainer and zips their features together.



class neuraxle.union.FeatureUnion(steps_as_tuple: List[Union[Tuple[str, BaseTransformerT], BaseTransformerT]], joiner: neuraxle.base.BaseTransformer = None, n_jobs: int = None, backend: str = 'threading', cache_folder_when_no_handle: str = None)[source]

Bases: neuraxle.base.ForceHandleOnlyMixin, neuraxle.base.TruncableSteps

Parallelize the union of many pipeline steps.

import numpy as np

from neuraxle.pipeline import Pipeline
from neuraxle.steps.numpy import NumpyConcatenateInnerFeatures
from neuraxle.union import FeatureUnion

p = Pipeline([
    FeatureUnion([
        Mean(),  # Mean and Std are placeholder user-defined steps
        Std(),   # computing per-sample statistics.
    ], joiner=NumpyConcatenateInnerFeatures())
])

data_inputs = np.random.randint(0, 100, size=(2, 20))

__init__(steps_as_tuple: List[Union[Tuple[str, BaseTransformerT], BaseTransformerT]], joiner: neuraxle.base.BaseTransformer = None, n_jobs: int = None, backend: str = 'threading', cache_folder_when_no_handle: str = None)[source]

Create a feature union.

Parameters
  • steps_as_tuple – the NamedStepsList of steps to process in parallel and to join

  • joiner – what will be used to join the features; NumpyConcatenateInnerFeatures() is used by default

  • n_jobs – the number of jobs for the parallelized joblib.Parallel loop in fit and in transform

  • backend – the type of parallelization to do with joblib.Parallel. Possible values: “loky”, “multiprocessing”, “threading”, “dask” if you use dask, and more.
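
For instance, a minimal construction sketch (a hedged example: Identity is only a stand-in for real transformer steps, and the parallelization settings are illustrative):

from neuraxle.base import Identity
from neuraxle.steps.numpy import NumpyConcatenateInnerFeatures
from neuraxle.union import FeatureUnion

# Two no-op branches run across 2 threads; their outputs are joined
# on the inner feature axis by the joiner step.
union = FeatureUnion(
    [('branch_a', Identity()), ('branch_b', Identity())],
    joiner=NumpyConcatenateInnerFeatures(),
    n_jobs=2,
    backend='threading',
)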

_fit_data_container(data_container, context)[source]

Fit the parallel steps on the data. It will make use of some parallel processing.

Parameters
  • data_container – the input data to fit onto

  • context – execution context

Returns

self
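
The fit loop follows the usual joblib.Parallel idiom; a hedged, simplified sketch of that idiom (fit_branch and the branch list are illustrative placeholders, not the method's actual body):

from joblib import Parallel, delayed


def fit_branch(step, data_inputs):
    # Placeholder for fitting one sub-step; a real step would learn here.
    return step


branches = ['mean_step', 'std_step']  # placeholders for real steps
data_inputs = list(range(20))

# Each branch is fitted on the same data, across parallel workers,
# and the fitted versions are collected back in order.
fitted_branches = Parallel(backend='threading', n_jobs=2)(
    delayed(fit_branch)(step, data_inputs) for step in branches
)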

_transform_data_container(data_container, context)[source]

Transform the data with the unions. It will make use of some parallel processing.

Parameters
  • data_container – data container

  • context – execution context

Returns

the transformed data_inputs

_did_transform(data_container, context)[source]

Apply side effects after transform.

Parameters
  • data_container – data container

  • context – execution context

Returns

data container

_fit_transform_data_container(data_container, context)[source]

Fit, then transform the data with the unions. It will make use of some parallel processing.

Parameters
  • data_container – data container

  • context – execution context

Returns

(fitted self, the transformed data container)

_save_fitted_steps(fitted_steps)[source]

_did_fit_transform(data_container, context)[source]

Apply side effects after fit transform.

Parameters
  • data_container – data container

  • context – execution context

Returns

(fitted self, data container)

class neuraxle.union.ZipFeatures(concatenate_inner_features=False)[source]

Bases: neuraxle.base.NonFittableMixin, neuraxle.base.BaseStep

This class receives an iterable of DataContainer and zips their features together. If concatenate_inner_features is True, the features are concatenated after being zipped.
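
To illustrate the intended semantics (a hedged numpy sketch of the behaviour described above, not the class's actual code path):

import numpy as np

# Two branches each produced a (3, 2) feature matrix for 3 samples.
features_a = np.arange(6).reshape(3, 2)
features_b = np.arange(6, 12).reshape(3, 2)

# Zipping pairs the branches' features sample by sample.
zipped = list(zip(features_a, features_b))  # 3 pairs of 2-vectors

# With concatenate_inner_features=True, each pair is then flattened
# into a single feature vector per sample, giving shape (3, 4).
concatenated = np.concatenate([features_a, features_b], axis=-1)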

__init__(concatenate_inner_features=False)[source]

Initialize self. See help(type(self)) for accurate signature.

transform(data_inputs)[source]

Transform given data inputs.

Parameters

data_inputs – data inputs

Returns

transformed data inputs

_transform_data_container(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext) → neuraxle.data_container.DataContainer[source]

Transform data container.

Parameters
  • data_container – data container

  • context – execution context

Returns

data container

Return type

DataContainer

class neuraxle.union.AddFeatures(steps_as_tuple: List[Union[Tuple[str, BaseTransformerT], BaseTransformerT]], **kwargs)[source]

Bases: neuraxle.union.FeatureUnion

Parallelize the union of many pipeline steps AND concatenate the new features to the received inputs using Identity.

from sklearn.decomposition import PCA, FastICA

from neuraxle.pipeline import Pipeline
from neuraxle.union import AddFeatures

pipeline = Pipeline([
    AddFeatures([
        PCA(n_components=2),
        FastICA(n_components=2),
    ])
])

__init__(steps_as_tuple: List[Union[Tuple[str, BaseTransformerT], BaseTransformerT]], **kwargs)[source]

Create a FeatureUnion where Identity is the first step, so as to also keep the inputs and concatenate them to the outputs.

Parameters
  • steps_as_tuple – the steps to be sent to the FeatureUnion; Identity() is prepended

  • kwargs – other arguments to send to FeatureUnion
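
Continuing the example above, a hedged usage sketch (the sample count and input width are hypothetical):

import numpy as np

data_inputs = np.random.rand(100, 10)  # hypothetical 100 samples, 10 features

# In Neuraxle, fit_transform returns (fitted step, outputs).
pipeline, outputs = pipeline.fit_transform(data_inputs)

# The prepended Identity keeps the 10 original columns, and the
# 2 PCA + 2 FastICA components are concatenated after them, so
# outputs should have 10 + 2 + 2 = 14 columns.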

class neuraxle.union.ModelStacking(steps_as_tuple: List[Union[Tuple[str, BaseTransformerT], BaseTransformerT]], judge: neuraxle.base.BaseStep, **kwargs)[source]

Bases: neuraxle.union.FeatureUnion

Performs a FeatureUnion of steps, and then sends the joined result to the judge step above.

Usage example:

from sklearn.cluster import KMeans
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Ridge

from neuraxle.hyperparams.distributions import Boolean, LogUniform, RandInt
from neuraxle.hyperparams.space import HyperparameterSpace
from neuraxle.pipeline import Pipeline
from neuraxle.steps.numpy import NumpyTranspose
from neuraxle.steps.sklearn import SKLearnWrapper
from neuraxle.union import ModelStacking

model_stacking = Pipeline([
    ModelStacking(
        [
            SKLearnWrapper(
                GradientBoostingRegressor(),
                HyperparameterSpace({
                    "n_estimators": RandInt(50, 600),
                    "max_depth": RandInt(1, 10),
                    "learning_rate": LogUniform(0.07, 0.7),
                }),
            ),
            SKLearnWrapper(
                KMeans(),
                HyperparameterSpace({"n_clusters": RandInt(5, 10)}),
            ),
        ],
        joiner=NumpyTranspose(),
        judge=SKLearnWrapper(
            Ridge(),
            HyperparameterSpace({
                "alpha": LogUniform(0.7, 1.4),
                "fit_intercept": Boolean(),
            }),
        ),
    )
])
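
A hedged sketch of how the stacked pipeline above might be used, with hypothetical random regression data:

import numpy as np

X, y = np.random.rand(100, 5), np.random.rand(100)  # hypothetical data

# Fit the parallel models and the judge, then transform: the judge
# combines the branch models' predictions into the final output.
model_stacking, outputs = model_stacking.fit_transform(X, y)
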
__init__(steps_as_tuple: List[Union[Tuple[str, BaseTransformerT], BaseTransformerT]], judge: neuraxle.base.BaseStep, **kwargs)[source]

Perform model stacking. The steps will be merged with a FeatureUnion, and the judge will recombine the predictions.

Parameters
  • steps_as_tuple – the NamedStepsList of steps to process in parallel and to join

  • judge – a BaseStep that will learn to judge the best answer and which of the parallel steps to trust

  • kwargs – other arguments to send to FeatureUnion

_did_fit_transform(data_container, context) → Tuple[neuraxle.base.BaseStep, neuraxle.data_container.DataContainer][source]

Apply side effects after fit transform.

Parameters
  • data_container – data container

  • context – execution context

Returns

(fitted self, data container)

_did_fit(data_container: neuraxle.data_container.DataContainer, context: neuraxle.base.ExecutionContext) → neuraxle.data_container.DataContainer[source]

Fit the parallel steps on the data. It will make use of some parallel processing. Also, fit the judge on the result of the parallel steps.

Parameters
  • data_container – data container to fit on

  • context – execution context

Returns

data container

Return type

DataContainer
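
Conceptually (a hedged numpy sketch of the data flow, not the method's internal code), a joiner such as NumpyTranspose hands the judge one column per branch model:

import numpy as np

# Hypothetical predictions from two branch models on 4 samples.
preds_model_a = np.array([1.0, 2.0, 3.0, 4.0])
preds_model_b = np.array([0.9, 2.1, 2.9, 4.2])

# After joining and transposing, each row holds one prediction per
# model: shape (4, 2). The judge (e.g. Ridge) fits on these rows
# against the expected outputs to learn how to weigh each model.
judge_inputs = np.stack([preds_model_a, preds_model_b], axis=1)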

_did_transform(data_container, context) → neuraxle.data_container.DataContainer[source]

Transform the data with the unions. It will make use of some parallel processing. Then, use the judge to refine the transformations.

Parameters
  • data_container – data container to transform

  • context – execution context

Returns

data container

Return type

DataContainer
