Comparison to Other Machine Learning Pipeline Frameworks, and Compatibility

scikit-learn

Everything that works in sklearn is also useable in Neuraxle. Neuraxle is built in a way that does not replace what already exists. Therefore, Neuraxle adds more power to scikit-lean by providing neat abstractions, and neuraxle is even retrocompatible with sklean if it ever needed to be included in an already-existing sklearn pipeline. We believe that Neuraxle helps scikit-learn, and also scikit-learn will help Neuraxle. Neuraxle is best used with scikit-learn, and TensorFlow.

Also, the top core developers of scikit-learn, Andreas C. Müller, gave a talk in which he lists the elements that are yet to be done in scikit-learn. He refers to building bigger pipelines with automatic machine learning, meta learning, improving the abstractions of the search spaces, and he also points out that it would be possible do achieve that in another library which could reuse scikit-learn. Neuraxle is here to solve those problems that are actually shared by the open-source community in general. Let’s move forward with Neuraxle: join Neuraxle’s community and contribute.

Apache Beam

Apache Beam is a big, multi-language project and hence is complicated. Neuraxle is pythonic and user-friendly: it’s easy to get started.

Also, it seems that Apache Beam has GPL and MPL dependencies, which means Apache Beam might itself be copyleft (?). Neuraxle doesn’t have such copyleft dependencies.

spaCy

spaCy has copyleft dependencies or may download copyleft content, and it is built only for Natural Language Processing (NLP) projects. Neuraxle is open to any kind of machine learning projects and isn’t an NLP-first project.

Kubeflow

Kubeflow is cloud-first, using Kubernetes and is more oriented towards MLOps. Neuraxle isn’t built as a cloud-first solution and isn’t tied to Kubernetes. Neuraxle instead offers many parallel processing features, such as the ability to be scaled on many cores of a computer, and even on a computer cluster (e.g.: in the cloud using any cloud provider) with joblib. A Neuraxle project is best deployed the way you like it to be: as a microservice within your regular software environment, and you can fully control and customize how you deploy your project (e.g.: coding yourself a pipeline step that does json conversion to accept http requests).

TensorFlow

You can use TensorFlow within Neuraxle, using Neuraxle-TensorFlow. For instance, check out the seq2seq-attention example Neuraxle code, and the Multi-Layer-Perceptron (MLP) example Neuraxle code.

Hyperopt

Hyperopt is poorly maintained since a few years. Also, the Oriented Object Programming (OOP) design is poor. Neuraxle offers refreshing flexibility. The Parzen Tree Estimator (TPE) of Hyperopt has been coded anew from scratch in Neuraxle. Thanks to Éric Hamel who digged into Hyperopt to understand its inner workings and who translated it into Neuraxle code here. If you take a look at Hyperopt’s most commented, biggest issues, you will realize that Neuraxle is bringing a fresh breeze of working functionnalities that are put into clean code that can be easily manipulated to all your means and refactored into what you want, with proper respect to the Dependency Inversion Principle (that is, the “DIP” principle of SOLID principles).