Class diagrams and inheritance charts of Neuraxle objects

Here is a description of the most used classes in Neuraxle. Reading this page will help you understand how they relate to each other, because the inherited classes and their order, combined with mixins, are often important in Neuraxle.

The Mixin design pattern in machine learning

Understanding the “Mixin” design pattern is important. For this, we refer you to the Wikipedia page on mixins, which is a good place to start. Mixins are a way to deal with the Diamond Problem, which happens when a class inherits from the same base class more than once through different parent classes. It is at this point that mixins are required.

The Mixin design pattern is important in Neuraxle for respecting the SOLID principles as established by Robert C. Martin, the author of Clean Code. In particular, mixins help to respect the Interface Segregation Principle (ISP), which is the “I” in SOLID. You may enjoy reading Umaneo’s article on SOLID Machine Learning, which does a good job of covering the importance of the ISP, as well as the other principles, in ML projects.

The BaseStep of Neuraxle is itself a composition of many mixins. It gets part of them by way of BaseTransformer, which doesn’t have fit methods.

Inheritance diagram of neuraxle.base.BaseStep
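
The idea of assembling a step from mixins can be sketched in plain Python. This is a minimal sketch, not Neuraxle's code: every class name below (`_SketchTransformMixin`, `_SketchFitMixin`, `SketchTransformer`, `SketchStep`, `TimesTwo`) is hypothetical and only illustrates how a transformer without fit methods can be extended into a full step, and how Python's method resolution order decides which parent comes first.

```python
class _SketchTransformMixin:
    """Hypothetical mixin: contributes only the transform behaviour."""

    def transform(self, data):
        return [self.transform_one(x) for x in data]


class _SketchFitMixin:
    """Hypothetical mixin: contributes only the fit behaviour."""

    def fit(self, data):
        self.fit_count = getattr(self, "fit_count", 0) + 1
        return self


class SketchTransformer(_SketchTransformMixin):
    """A transformer without any fit method (an assumption mirroring the text)."""

    def transform_one(self, x):
        return x


class SketchStep(_SketchFitMixin, SketchTransformer):
    """A full step: the fit mixin combined with the transformer."""


class TimesTwo(SketchStep):
    def transform_one(self, x):
        return 2 * x


step = TimesTwo().fit([1, 2, 3])
outputs = step.transform([1, 2, 3])

# The order of the parents is visible in the method resolution order.
mro_names = [cls.__name__ for cls in TimesTwo.__mro__]
```

Because each mixin owns a single responsibility, a class that only needs `transform` never has to carry a `fit` method, which is the spirit of the Interface Segregation Principle discussed above.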

Steps containing other steps as the composite design pattern in machine learning

Steps can contain other steps, in a nested fashion. They can be traversed like a tree using the apply() method when inheriting from the mixin _HasChildrenMixin. Two classes that do this, both visible in the inheritance diagram below, are TruncableSteps and MetaStep. This is in fact the composite design pattern, which is the same way components are coded in other frameworks like React or Vue.js. Here is how TruncableSteps and MetaStep work. They also combine some mixins, as they are each a BaseStep themselves and compose other base steps as children:

Inheritance diagram of neuraxle.base.TruncableSteps, neuraxle.base.MetaStep, neuraxle.base._RecursiveArguments, neuraxle.hyperparams.space.HyperparameterSamples, neuraxle.hyperparams.space.HyperparameterSpace

These steps, through the apply method, use the _RecursiveArguments class for nested arguments, and recursively combine some RecursiveDict objects as the return values of the traversed machine learning pipeline’s tree.

To summarize, a MetaStep is a step containing another one. We could say that a MetaStep is a decorator of another step, or a wrapper around another step that creates the tree. The same goes for the TruncableSteps object, which is a wrapper around multiple other objects. See it as a list that can be truncated.
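
The composite structure described in this section can be sketched as follows. This is a minimal sketch under stated assumptions: `SketchStep`, `SketchMetaStep`, and `SketchTruncableSteps` are hypothetical stand-ins, and their `apply` and slicing behaviour only imitates the idea of the real classes, not their actual signatures.

```python
class SketchStep:
    """Hypothetical leaf step: it has a name and no children."""

    def __init__(self, name):
        self.name = name

    def get_children(self):
        return []

    def apply(self, method_name):
        """Call `method_name` on this step and on every nested child.

        Returns a nested dict shaped like the tree of steps.
        """
        result = {"value": getattr(self, method_name)()}
        children = {
            child.name: child.apply(method_name)
            for child in self.get_children()
        }
        if children:
            result["children"] = children
        return result

    def describe(self):
        return type(self).__name__


class SketchMetaStep(SketchStep):
    """Hypothetical wrapper around exactly one other step."""

    def __init__(self, name, wrapped):
        super().__init__(name)
        self.wrapped = wrapped

    def get_children(self):
        return [self.wrapped]


class SketchTruncableSteps(SketchStep):
    """Hypothetical wrapper around an ordered list of steps that can be sliced."""

    def __init__(self, name, steps):
        super().__init__(name)
        self.steps = list(steps)

    def get_children(self):
        return self.steps

    def __getitem__(self, key):
        # Slicing returns a shorter container; an integer returns one step.
        if isinstance(key, slice):
            return SketchTruncableSteps(self.name, self.steps[key])
        return self.steps[key]

    def __len__(self):
        return len(self.steps)


tree = SketchTruncableSteps("pipeline", [
    SketchStep("scale"),
    SketchMetaStep("wrapper", SketchStep("model")),
])
report = tree.apply("describe")  # nested dict mirroring the tree
truncated = tree[:1]             # keeps only the first step
```

The nested dict returned by `apply` plays the role that the text attributes to recursive return values: one entry per node, with children nested under their parent.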

Scikit-learn’s pipeline.Pipeline class and how to shift to parallel deep learning

The Neuraxle Pipeline class is a wrapper around a list of steps. It acts just like the sklearn.pipeline.Pipeline object, except that scikit-learn pipelines have limitations for which we’ve found some solutions.

The result is that Neuraxle Pipelines can properly do Deep Learning, whereas scikit-learn Pipelines can only do Machine Learning, and with fewer features. Neuraxle pipelines are compatible with scikit-learn pipelines, and thus help scikit-learn to evolve by reusing its proven power.

It is a good example of how to build proper machine learning pipelines. However, in Neuraxle, we add more context to the pipeline, using an ExecutionContext object.
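
The idea of a list of steps that share a context can be sketched in a few lines. This is a minimal sketch and not Neuraxle's API: `SketchContext`, `SketchPipeline`, `AddOne`, and `Center` are hypothetical names, and the `fit_transform(data, context)` signature is an assumption made for the illustration.

```python
from dataclasses import dataclass, field


@dataclass
class SketchContext:
    """Hypothetical stand-in for an execution context: it records what ran."""
    mode: str = "fit_transform"
    log: list = field(default_factory=list)


class AddOne:
    def fit_transform(self, data, context):
        context.log.append("AddOne")
        return [x + 1 for x in data]


class Center:
    def fit_transform(self, data, context):
        context.log.append("Center")
        # Fitting here means remembering the mean seen during training.
        self.mean = sum(data) / len(data)
        return [x - self.mean for x in data]


class SketchPipeline:
    """Hypothetical sequential pipeline: each step's output feeds the next."""

    def __init__(self, steps):
        self.steps = list(steps)

    def fit_transform(self, data, context=None):
        context = context if context is not None else SketchContext()
        for step in self.steps:
            data = step.fit_transform(data, context)
        return data, context


outputs, context = SketchPipeline([AddOne(), Center()]).fit_transform([1, 2, 3])
```

Passing one context object down the chain gives every step access to shared information, such as the current mode or a log, without changing the data that flows between steps.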

The following inheritance diagram shows all the Pipeline classes, which inherit from the TruncableSteps class. As you can see, we also have built-in parallelism and minibatching ready to use:

Inheritance diagram of neuraxle.pipeline.Pipeline, neuraxle.pipeline.MiniBatchSequentialPipeline, neuraxle.distributed.streaming.SequentialQueuedPipeline, neuraxle.base.ExecutionContext, neuraxle.base.ExecutionPhase, neuraxle.base.ExecutionMode

It seems like there are a lot of classes to the right, but this is only because the SequentialQueuedPipeline, which allows you to run a pipeline in a distributed environment, has a lot under the hood for it to work.

Here are some examples of how to use the Pipeline class of Neuraxle. Pipeline is the class that you will use the most often, even within parallelized pipelines. Some examples use sklearn estimators in the Neuraxle pipelines:

Examples using neuraxle.pipeline.Pipeline


Examples using neuraxle.distributed.streaming.SequentialQueuedPipeline


FeatureUnion to compute steps in parallel and join their results

The FeatureUnion class and its parallel counterpart ParallelQueuedFeatureUnion both use joiners to join the results of the steps. One such joiner is the NumpyConcatenateInnerFeatures class, which concatenates the results of the steps on the innermost dimension, which is often the features dimension.
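
Joining on the innermost dimension can be sketched with plain lists. This is a minimal sketch, not the library's implementation: `SketchFeatureUnion`, `concatenate_inner_features`, `Square`, and `Negate` are hypothetical names, and nested Python lists stand in for NumPy arrays.

```python
class Square:
    def transform(self, data):
        return [[x * x for x in row] for row in data]


class Negate:
    def transform(self, data):
        return [[-x for x in row] for row in data]


def concatenate_inner_features(results):
    """Join per-step outputs sample by sample, along the feature axis."""
    return [sum(rows, []) for rows in zip(*results)]


class SketchFeatureUnion:
    """Hypothetical union: every step sees the same input, then a joiner merges."""

    def __init__(self, steps, joiner=concatenate_inner_features):
        self.steps = list(steps)
        self.joiner = joiner

    def transform(self, data):
        results = [step.transform(data) for step in self.steps]
        return self.joiner(results)


data = [[1, 2], [3, 4]]  # 2 samples, 2 features each
joined = SketchFeatureUnion([Square(), Negate()]).transform(data)
# joined == [[1, 4, -1, -2], [9, 16, -3, -4]]
```

The number of samples stays the same while the number of features per sample grows, here from 2 to 4, which is what concatenating on the innermost dimension means.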

See the inheritance diagram below to understand the inheritance of the FeatureUnion class, its parallel counterpart and the parallel joiner:

Inheritance diagram of neuraxle.union.FeatureUnion, neuraxle.distributed.streaming.ParallelQueuedFeatureUnion, neuraxle.steps.numpy.NumpyConcatenateInnerFeatures

Here are some practical examples on how to use the FeatureUnion class of Neuraxle:

Examples using neuraxle.union.FeatureUnion


The FeatureUnion we have here looks much like the one in sklearn, sklearn.pipeline.FeatureUnion, but it is a class of Neuraxle. It is a wrapper around a list of steps, and it comes with all the benefits that Neuraxle Pipelines have, such as parallelism, minibatching, distributed execution, savers, built-in hyperparameter tuning, and neat hyperparameter spaces.

AutoML module to automatically tune hyperparameters of your pipelines

Your machine learning pipelines may contain various data preprocessing steps and models, as well as model selection steps. To tune their hyperparameters automatically, you can use the AutoML class.

Inheritance diagram of neuraxle.metaopt.auto_ml
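
The general idea of automatic hyperparameter tuning can be sketched with a plain random search. This is a swapped-in illustration and not the AutoML class itself: `random_search`, `score`, and the dict-based `space` are hypothetical, and the scoring function is a toy chosen only so that the sketch runs without any data.

```python
import random


def score(hyperparams):
    """Toy validation score to maximise; it peaks at scale=3 and offset=-1."""
    return -((hyperparams["scale"] - 3) ** 2) - (hyperparams["offset"] + 1) ** 2


# Hypothetical search space: each hyperparameter maps to its candidate values.
space = {"scale": [1, 2, 3, 4, 5], "offset": [-2, -1, 0, 1]}


def random_search(space, score_fn, n_trials, seed=0):
    """Sample `n_trials` random configurations and keep the best-scoring one."""
    rng = random.Random(seed)  # seeded so that repeated runs agree
    best_params, best_score = None, float("-inf")
    for _ in range(n_trials):
        trial = {name: rng.choice(values) for name, values in space.items()}
        trial_score = score_fn(trial)
        if trial_score > best_score:
            best_params, best_score = trial, trial_score
    return best_params, best_score


best_params, best_score = random_search(space, score, n_trials=50)
```

In a real setting the score would come from fitting the pipeline with the sampled hyperparameters and evaluating it on validation data, and a smarter sampling strategy could replace the uniform random choice.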

Here are some practical examples of how to use the AutoML class of Neuraxle:

Examples using neuraxle.metaopt.auto_ml.AutoML


You may also be interested in many other modules, such as the flow module, which defines lots of control-flow wrappers. Those steps can often make conditional decisions on how your data will traverse the pipeline:

Inheritance diagram of neuraxle.steps.flow

All the base classes of Neuraxle together

See how everything is combined in the base.py module. It defines some other important base classes that you can inherit from or use to build some funky data pipelines.

Inheritance diagram of neuraxle.base

You may like to see all other inheritance diagrams defined in each module of the library. Refer to the complete API documentation of Neuraxle.