neuraxle.steps.column_transformer

Module-level documentation for neuraxle.steps.column_transformer.

[Inheritance diagram, including dependencies on other base modules of Neuraxle.]


Neuraxle’s Column Transformer Steps

Pipeline steps to apply N-Dimensional column transformations to different columns.

Classes

ColumnSelector2D(columns_selection, …)

A ColumnSelector2D selects columns in a sequence.

ColumnTransformer(…[, n_jobs])

A ColumnTransformer can apply custom transformations to different columns.

ColumnsSelectorND(columns_selection[, …])

ColumnsSelectorND wraps a ColumnSelector2D in as many ForEach steps as needed to select the last dimension.

NumpyColumnSelector2D(columns_selection, …)

A numpy version of the ColumnSelector2D.


class neuraxle.steps.column_transformer.ColumnSelector2D(columns_selection: Union[int, Iterable[int], str, Iterable[str], slice])[source]

Bases: neuraxle.base.BaseTransformer

A ColumnSelector2D selects columns in a sequence.

It can be used to select:

  • a single column,

  • a range of columns,

  • a slice of columns,

  • a list of columns.

Column indices are expected to be integers. As a special case, a string selection is treated as a pandas DataFrame column name.
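The selection kinds listed above can be illustrated with a small pure-Python sketch. It is not the library's implementation: `select_columns_2d` is a hypothetical helper, the string (pandas) case is left out, and keeping a single column as a one-element row is an assumption about the output shape.

```python
from typing import Iterable, List, Sequence, Union

Selection = Union[int, Iterable[int], slice]


def select_columns_2d(rows: Sequence[Sequence], selection: Selection) -> List[list]:
    """Hypothetical stand-in for column selection on a 2D sequence of rows."""
    if isinstance(selection, range):
        # Treat a range like the equivalent slice.
        selection = slice(selection.start, selection.stop, selection.step)
    if isinstance(selection, int):
        # Assumption: a single column is kept as a one-element row.
        return [[row[selection]] for row in rows]
    if isinstance(selection, slice):
        return [list(row[selection]) for row in rows]
    # Otherwise, an iterable of integer column indices.
    indices = list(selection)
    return [[row[i] for i in indices] for row in rows]


rows = [
    [0, 1, 2, 3],
    [10, 11, 12, 13],
]

single = select_columns_2d(rows, 1)               # a single column
ranged = select_columns_2d(rows, range(0, 2))     # a range of columns
sliced = select_columns_2d(rows, slice(1, None))  # a slice of columns
listed = select_columns_2d(rows, [0, 3])          # a list of columns
```

With the real class, the equivalent selection would presumably be passed as `columns_selection` and applied through `transform(data_inputs)`.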

__init__(columns_selection: Union[int, Iterable[int], str, Iterable[str], slice])[source]

Initialize self. See help(type(self)) for accurate signature.

transform(data_inputs)[source]

Transform given data inputs.

Parameters

data_inputs – data inputs

Returns

transformed data inputs

class neuraxle.steps.column_transformer.NumpyColumnSelector2D(columns_selection: Union[int, Iterable[int], str, Iterable[str], slice])[source]

Bases: neuraxle.base.BaseTransformer

A numpy version of the ColumnSelector2D.

__init__(columns_selection: Union[int, Iterable[int], str, Iterable[str], slice])[source]

Initialize self. See help(type(self)) for accurate signature.

transform(data_inputs)[source]

Transform given data inputs.

Parameters

data_inputs – data inputs

Returns

transformed data inputs

class neuraxle.steps.column_transformer.ColumnsSelectorND(columns_selection, n_dimension=2)[source]

Bases: neuraxle.base.MetaStep

ColumnsSelectorND wraps a ColumnSelector2D in as many ForEach steps as needed to select the last dimension. n_dimension must therefore be greater than or equal to 2.
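The wrapping idea can be sketched in pure Python as a recursion: a 2D selection is applied once every outer dimension has been iterated over. This is only an illustration under simplifying assumptions (slice selections only, nested lists as data); `select_columns_nd` is a hypothetical helper, not the library's code.

```python
def select_columns_nd(data, selection: slice, n_dimension: int = 2):
    """Apply a 2D column selection under n_dimension - 2 levels of iteration."""
    if n_dimension < 2:
        raise ValueError("n_dimension must be greater than or equal to 2")
    if n_dimension == 2:
        # Base case: plain 2D selection on the last dimension.
        return [list(row[selection]) for row in data]
    # One "for each" level: recurse into every item of the outer dimension.
    return [select_columns_nd(item, selection, n_dimension - 1) for item in data]


# 3D data: 2 sequences, 2 time steps each, 3 features per time step.
data = [
    [[1, 2, 3], [4, 5, 6]],
    [[7, 8, 9], [10, 11, 12]],
]

selected = select_columns_nd(data, slice(0, 2), n_dimension=3)
```

With `n_dimension=3`, one level of iteration wraps the 2D selection, so only the last (feature) dimension is reduced.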

__init__(columns_selection, n_dimension=2)[source]

Initialize self. See help(type(self)) for accurate signature.

class neuraxle.steps.column_transformer.ColumnTransformer(column_chooser_steps_as_tuple: List[Tuple[Union[int, Iterable[int], str, Iterable[str], slice], neuraxle.base.BaseTransformer]], n_dimension: int = 3, n_jobs=None, joiner: neuraxle.base.BaseTransformer = None)[source]

Bases: neuraxle.union.FeatureUnion

A ColumnTransformer can apply custom transformations to different columns. It accepts a list of (indexer, step) tuples for the transformations and names the steps accordingly (because of the TruncableSteps’ constructor) by converting each indexer object to a string. Indexer objects can be ranges, an int, or a list of ints. The input data can be N-dimensional (ND), in which case the number of dimensions must be specified (see n_dimension). The column data passed to the sub-steps is still ND.

Usage example:

# CyclicTimes and CategoricalEnum stand for user-provided transformer steps;
# they are not defined in this module.
ColumnTransformer([
    (range(0, 2), CyclicTimes()),
    (3, CategoricalEnum(categories_count=5, starts_at_zero=True)),
    (4, CategoricalEnum(categories_count=5, starts_at_zero=True)),
    ([10, 13, 15], CategoricalEnum(categories_count=5, starts_at_zero=True)),
])
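To make the select, transform and join flow concrete, here is a minimal pure-Python sketch for 2D data. It is not the library's implementation: `pick_columns`, `column_transform`, `times_ten` and `negate` are hypothetical names, and the join is a simple row-wise concatenation standing in for the default joiner.

```python
from itertools import chain


def pick_columns(rows, indexer):
    """Select columns from 2D rows; an int is treated as a one-item list."""
    indices = [indexer] if isinstance(indexer, int) else list(indexer)
    return [[row[i] for i in indices] for row in rows]


def column_transform(rows, indexer_step_pairs):
    """Hypothetical helper: select, transform, then join on the last axis."""
    # Each step is named after the string form of its indexer.
    names = [str(indexer) for indexer, _ in indexer_step_pairs]
    # Every step only sees the columns picked by its own indexer.
    outputs = [step(pick_columns(rows, indexer)) for indexer, step in indexer_step_pairs]
    # Concatenate the per-step features row by row.
    joined = [list(chain.from_iterable(parts)) for parts in zip(*outputs)]
    return names, joined


def times_ten(rows):
    return [[value * 10 for value in row] for row in rows]


def negate(rows):
    return [[-value for value in row] for row in rows]


rows = [[1, 2, 30], [3, 4, 50]]
names, joined = column_transform(rows, [(range(0, 2), times_ten), (2, negate)])
```

Here the first two columns go through `times_ten`, the third through `negate`, and the results are joined back into one row per sample.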

See also

FeatureUnion

__init__(column_chooser_steps_as_tuple: List[Tuple[Union[int, Iterable[int], str, Iterable[str], slice], neuraxle.base.BaseTransformer]], n_dimension: int = 3, n_jobs=None, joiner: neuraxle.base.BaseTransformer = None)[source]

Create a feature union.

Parameters

  • column_chooser_steps_as_tuple – the list of (column selection, step) tuples describing the steps to process in parallel and to join.

  • n_dimension (int) – the number of dimensions of the input data.

  • joiner (BaseTransformer) – what will be used to join the features. NumpyConcatenateInnerFeatures() is used by default.

  • n_jobs – the number of jobs for the parallelized joblib.Parallel loop in fit and in transform.

  • backend – the type of parallelization to do with joblib.Parallel. Possible values: “loky”, “multiprocessing”, “threading”, “dask” if you use dask, and more.
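The idea of processing steps in parallel and then joining their outputs can be sketched with the standard library. Note that this uses `concurrent.futures.ThreadPoolExecutor` as a stand-in and is not joblib.Parallel; `run_branches` is a hypothetical helper, and treating `n_jobs=None` as sequential execution is an assumption of this sketch.

```python
from concurrent.futures import ThreadPoolExecutor


def run_branches(rows, branches, n_jobs=None):
    """Run independent branches on the same rows, then join their features."""
    if n_jobs is None:
        # Sketch assumption: no n_jobs means plain sequential execution.
        outputs = [branch(rows) for branch in branches]
    else:
        with ThreadPoolExecutor(max_workers=n_jobs) as pool:
            # pool.map keeps the outputs in the same order as the branches.
            outputs = list(pool.map(lambda branch: branch(rows), branches))
    # Join: concatenate the inner features of every branch, row by row.
    return [[value for part in parts for value in part] for parts in zip(*outputs)]


def first_two(rows):
    return [row[:2] for row in rows]


def last_doubled(rows):
    return [[row[-1] * 2] for row in rows]


rows = [[1, 2, 3], [4, 5, 6]]
sequential = run_branches(rows, [first_two, last_doubled])
parallel = run_branches(rows, [first_two, last_doubled], n_jobs=2)
```

Because the branches do not depend on each other, the parallel run yields the same joined rows as the sequential one.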
