Class diagrams and inheritance charts of Neuraxle objects
Here is a description of the most used classes in Neuraxle. Reading this page will help you understand how they relate to each other, as the inherited classes and their order, combined with mixins, are often important in Neuraxle.
The Mixin design pattern in machine learning
Understanding the “Mixin” design pattern is important. For this, we refer you to the Wikipedia page on mixins, which is a good place to start. Mixins are a way to deal with the Diamond Problem, which happens when a class inherits from two classes that both inherit from the same base class, so that the base class is reached through more than one path. It is at this point that mixins are required.
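As a minimal sketch of the idea in plain Python (not Neuraxle code; every class name below is hypothetical), the following shows a diamond-shaped hierarchy and how Python's method resolution order visits the shared base class only once. It also shows a mixin that adds a single narrow capability to a class.

```python
class Base:
    def setup(self):
        # Each class adds its own name, so the call order becomes visible.
        return ["Base"]


class Left(Base):
    def setup(self):
        return ["Left"] + super().setup()


class Right(Base):
    def setup(self):
        return ["Right"] + super().setup()


class Diamond(Left, Right):
    # Base is reachable through both Left and Right: the diamond shape.
    def setup(self):
        return ["Diamond"] + super().setup()


class HasNameMixin:
    """Hypothetical mixin: adds one narrow capability and nothing else."""

    def set_name(self, name):
        self.name = name
        return self


class NamedStep(HasNameMixin, Base):
    """Combines the mixin with the base class; mixins are listed first."""


calls = Diamond().setup()
mro_names = [cls.__name__ for cls in Diamond.__mro__]
named = NamedStep().set_name("my_step")
```

Here `calls` is `["Diamond", "Left", "Right", "Base"]`: thanks to cooperative `super()` calls, `Base.setup` runs once even though it is inherited through two paths. The order in which parent classes are listed decides the order of the calls, which is why the order of inherited classes matters.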
In Neuraxle, the Mixin design pattern is important for respecting the SOLID principles as established by Robert C. Martin, the author of Clean Code. In particular, mixins help to respect the Interface Segregation Principle (ISP), which is the “I” in SOLID. You may enjoy reading Umaneo’s article on SOLID Machine Learning, which does a good job of covering the importance of the ISP, as well as the other principles, in ML projects.
The BaseStep of Neuraxle is itself composed of many mixins, and it inherits from the BaseTransformer, which doesn’t have fit methods.
Steps containing other steps as the composite design pattern in machine learning
Steps can contain other steps, in a nested fashion. They can be traversed like a tree
using the apply() method when inheriting from the mixin
_HasChildrenMixin. To this effect, two classes visible in the
inheritance diagram below are TruncableSteps and MetaStep.
This is in fact using the composite design pattern, which is the same way components
are coded in other frameworks like React or Vue.js. Here is how the TruncableSteps
and MetaStep work. They also combine some mixins, as they
are BaseSteps themselves and compose other base steps as children:
These steps, using the apply method, use a dedicated
class for nested arguments, and recursively combine
the return values of the traversed machine learning pipeline’s tree.
To summarize, a MetaStep is a step containing another one. We could say of a MetaStep that it is a decorator of another step, or a wrapper of another step to create the tree. The same goes with the TruncableSteps object. This one is a wrapper of multiple other objects. See it as a list that can be truncated.
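The composite idea described above can be sketched in plain Python. This is not Neuraxle's implementation: the classes `Step`, `Wrapper` and `StepList`, the `describe` method, and the `__` separator used in the result keys are all hypothetical, chosen only to show how an `apply`-style call can walk a tree of steps and merge the results.

```python
class Step:
    """Hypothetical leaf step; stands in for a base step."""

    def __init__(self, name):
        self.name = name

    def get_children(self):
        return []

    def describe(self):
        return type(self).__name__

    def apply(self, method_name):
        # Call the method on this step, then recurse into the children
        # and merge their results under keys prefixed with this step's name.
        results = {self.name: getattr(self, method_name)()}
        for child in self.get_children():
            for key, value in child.apply(method_name).items():
                results[f"{self.name}__{key}"] = value
        return results


class Wrapper(Step):
    """Wraps exactly one other step, in the spirit of a MetaStep."""

    def __init__(self, name, wrapped):
        super().__init__(name)
        self.wrapped = wrapped

    def get_children(self):
        return [self.wrapped]


class StepList(Step):
    """Wraps several steps and can be sliced, in the spirit of TruncableSteps."""

    def __init__(self, name, steps):
        super().__init__(name)
        self.steps = list(steps)

    def get_children(self):
        return self.steps

    def __len__(self):
        return len(self.steps)

    def __getitem__(self, key):
        if isinstance(key, slice):
            # Truncating returns a new, shorter list of steps.
            return StepList(self.name, self.steps[key])
        return self.steps[key]


tree = StepList("pipeline", [Step("a"), Wrapper("meta", Step("b"))])
results = tree.apply("describe")
truncated = tree[:1]
```

Calling `apply` on the root visits every nested step, so `results` holds one entry per node of the tree, including the step `b` hidden inside the wrapper.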
Scikit-learn’s pipeline.Pipeline class and how to shift to parallel deep learning
The Pipeline class is a wrapper of a list of steps, and it acts
just like the sklearn.pipeline.Pipeline object,
except that scikit-learn pipelines have limitations, for which we’ve found some solutions.
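The notion of a pipeline as a wrapper around a list of steps can be sketched in a few lines of plain Python. This is only an illustration of the concept, not the Neuraxle or scikit-learn API: `MeanCenter`, `MultiplyBy` and `SimplePipeline` are hypothetical names, and the real classes do much more.

```python
class MeanCenter:
    """Hypothetical step that learns the mean at fit time and subtracts it."""

    def fit(self, data):
        self.mean_ = sum(data) / len(data)
        return self

    def transform(self, data):
        return [x - self.mean_ for x in data]


class MultiplyBy:
    """Hypothetical step with nothing to learn: fit does no work."""

    def __init__(self, factor):
        self.factor = factor

    def fit(self, data):
        return self

    def transform(self, data):
        return [x * self.factor for x in data]


class SimplePipeline:
    """Wraps a list of steps and runs them one after the other."""

    def __init__(self, steps):
        self.steps = list(steps)

    def fit_transform(self, data):
        # Each step is fitted on the output of the previous step.
        for step in self.steps:
            data = step.fit(data).transform(data)
        return data

    def transform(self, data):
        # Reuses what was learned during fit_transform.
        for step in self.steps:
            data = step.transform(data)
        return data


pipeline = SimplePipeline([MeanCenter(), MultiplyBy(2)])
train_out = pipeline.fit_transform([1.0, 2.0, 3.0])
new_out = pipeline.transform([4.0])
```

The mean learned on the training data (2.0) is reused on new data, so `train_out` is `[-2.0, 0.0, 2.0]` and `new_out` is `[4.0]`.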
The result is that Neuraxle Pipelines can properly do Deep Learning, whereas scikit-learn Pipelines can only do Machine Learning, and with fewer features. Neuraxle pipelines are compatible with scikit-learn pipelines, and thus help scikit-learn to evolve, reusing its proven power.
The scikit-learn pipeline is a good example of how to build proper machine learning pipelines. However, in Neuraxle, we add more context to the pipeline,
using an ExecutionContext.
Together, the following inheritance diagram shows the inheritance of all Pipeline classes, inheriting from the TruncableSteps class. As you can see, we also have built-in parallelism and minibatching ready to use:
It seems like there are a lot of classes to the right, but this is only because the
SequentialQueuedPipeline, which allows you
to run a pipeline in a distributed environment, has a lot under the hood for it to work.
Here are some examples of how to use the Pipeline class of Neuraxle.
Pipeline is the class that you will use the most often,
even within parallelized pipelines. Some examples use sklearn estimators in Neuraxle pipelines:
FeatureUnion to compute steps in parallel and join their results
The FeatureUnion class and its parallel counterpart
both use joiners to join the results of the steps, such as the
NumpyConcatenateInnerFeatures class, which concatenates
the results of the steps on the innermost dimension, which is often the features dimension.
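To make the role of a joiner concrete, here is a minimal sketch in plain Python with nested lists instead of NumPy arrays. It is not the Neuraxle implementation: `Square`, `RowSum`, `SimpleFeatureUnion` and `concatenate_inner_features` are hypothetical names, and the steps run one after the other here rather than in parallel.

```python
def concatenate_inner_features(outputs):
    """Hypothetical joiner: merges the outputs of several steps row by row,
    along the innermost (feature) dimension."""
    joined = []
    for rows in zip(*outputs):
        merged = []
        for row in rows:
            merged.extend(row)
        joined.append(merged)
    return joined


class Square:
    """Hypothetical step: squares every feature."""

    def transform(self, rows):
        return [[x ** 2 for x in row] for row in rows]


class RowSum:
    """Hypothetical step: produces one feature per row, the sum of the row."""

    def transform(self, rows):
        return [[sum(row)] for row in rows]


class SimpleFeatureUnion:
    """Feeds the same input to every step, then joins the outputs."""

    def __init__(self, steps, joiner):
        self.steps = list(steps)
        self.joiner = joiner

    def transform(self, rows):
        outputs = [step.transform(rows) for step in self.steps]
        return self.joiner(outputs)


union = SimpleFeatureUnion([Square(), RowSum()], concatenate_inner_features)
features = union.transform([[1, 2], [3, 4]])
```

Each input row keeps its position, and the features computed by every step end up side by side: `features` is `[[1, 4, 3], [9, 16, 7]]`.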
See the inheritance diagram below to understand the inheritance of the FeatureUnion class, its parallel counterpart and the parallel joiner:
Here are some practical examples on how to use the FeatureUnion class of Neuraxle:
The FeatureUnion we have here looks much like the one in sklearn,
but it is a class of Neuraxle. It is a wrapper of a list of steps, and it comes
with all the benefits that Neuraxle Pipelines have, such as parallelism, minibatching,
distributed execution, savers, built-in hyperparameter tuning, and neat hyperparameter spaces.
AutoML module to automatically tune hyperparameters of your pipelines
Your machine learning pipelines may contain various data preprocessing steps
and models, as well as model selection steps. In order to automatically tune
the hyperparameters, you can use the AutoML class.
Here are some practical examples on how to use the AutoML class of Neuraxle:
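Independently of Neuraxle's own AutoML API, the general idea of automatic hyperparameter tuning can be sketched as a random search in plain Python. Everything below is an assumption made for illustration: the hyperparameter names, their ranges, and the `validation_score` function, which stands in for training a pipeline and measuring its validation error.

```python
import random


def validation_score(params):
    # Hypothetical stand-in for "train the pipeline, then score it".
    # Lower is better; the optimum is learning_rate=0.1 and n_layers=3.
    return (params["learning_rate"] - 0.1) ** 2 + (params["n_layers"] - 3) ** 2


def sample_params(rng):
    # Hypothetical hyperparameter space: one continuous, one discrete value.
    return {
        "learning_rate": rng.uniform(0.001, 1.0),
        "n_layers": rng.randint(1, 6),
    }


def random_search(n_trials, seed=0):
    """Samples n_trials configurations and keeps the best-scoring one."""
    rng = random.Random(seed)
    best_params, best_score = None, float("inf")
    for _ in range(n_trials):
        params = sample_params(rng)
        score = validation_score(params)
        if score < best_score:
            best_params, best_score = params, score
    return best_params, best_score


best_params, best_score = random_search(n_trials=50)
```

With a fixed seed the search is reproducible, and running more trials can only keep or improve the best score found so far.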
You may also be interested in many other modules, such as the flow module, in which lots of control-flow wrappers are defined. Those steps can often make conditional decisions on how your data will traverse the pipeline:
All the base classes of Neuraxle together
See how everything is combined together in the base.py module. It also defines some other important base classes that you can inherit from or use to build some funky data pipelines.
You may like to see all other inheritance diagrams defined in each module of the library. Refer to the complete API documentation of Neuraxle.