What is Kubeflow?
The Machine Learning Toolkit for Kubernetes
Kubeflow is an open-source machine learning platform that aims to help teams develop machine learning pipelines, deploy models, and manage training infrastructure. It’s among the most widely used of the current open-source ML platforms and has wide community support, with plenty of tutorials and extensions.
Kubeflow was inspired by the way Google runs TensorFlow internally. Its original goal was to help manage TensorFlow experiments on Kubernetes, but the project has since expanded its focus and now supports clouds other than Google Cloud Platform (GCP) as well as frameworks other than TensorFlow.
With Kubeflow, a data science team can move most of their ML work to run on a Kubernetes cluster. The main components of Kubeflow are hosted notebooks, training job management, and pipelines.
What are the alternatives?
Kubeflow often draws comparisons to other open-source platforms, such as MLflow, Metaflow, and the less well-known Flyte. Of these, MLflow comes closest to feature parity, although its origins lie more in experiment tracking than in operationalizing models. Metaflow, on the other hand, is solely focused on machine learning pipelines.
Outside of open source, Kubeflow has many alternatives, including Valohai and AWS SageMaker. Both of these platforms resemble Kubeflow more than the other open-source alternatives in feature completeness. Comparing MLOps platforms is quite tricky as every use case is different, and teams will have different competencies. We’ve written more about comparing popular platforms and whether building or buying is the right approach for you.
Before you look into alternatives, though, it may be helpful to look at a few of the pros and cons of Kubeflow.
What are Kubeflow's pros and cons?
As with most things, Kubeflow has its pros and cons. So let’s look at a few reasons why you’d choose Kubeflow and a few reasons why not.
Why Kubeflow might be right for me
A machine learning platform is a must
Unless you are a single data scientist working exclusively on early-stage experiments, you’ll come to realize that you need a platform for developing and operationalizing machine learning. Kubeflow is an established machine learning platform that will indeed support you with a broad set of features.
Extensibility through open source
If the out-of-the-box feature set is not broad enough, Kubeflow, as an open-source platform, can be extended with existing open-source components or with components you build yourself. The Kubeflow community is active, and there’s plenty of content available on how to integrate different technologies with Kubeflow.
Perfect for the Kubernetes enthusiast
For data science teams that include or are supported by DevOps talent, a Kubernetes-based platform can be ideal. Kubernetes allows for very granular control over how infrastructure is utilized, and it enables Kubeflow to run on any of the popular clouds.
Is it possible that Kubeflow pipeline is one of the best CI/CD tools for Kubernetes? I spent some time playing with Kubernetes & Kubeflow pipelines, and they have one feature which is just great: You can define the pipeline with real code! — Daniele Polencic
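To make the "pipeline as real code" idea concrete, here is a minimal sketch in plain Python. It is not the Kubeflow Pipelines SDK and it runs nothing on Kubernetes. The step functions, the PIPELINE mapping, and run_pipeline are all hypothetical names made up for illustration. What it does show is the general pattern: steps are ordinary functions, dependencies between them are declared in code, and a runner executes them in dependency order.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


# Hypothetical steps. Each one receives the results of the steps that
# already ran and returns its own output.
def load_data(results):
    return [1.0, 2.0, 3.0, 4.0]


def train_model(results):
    data = results["load_data"]
    # A stand-in "model": just the mean of the training data.
    return {"mean": sum(data) / len(data)}


def evaluate_model(results):
    data = results["load_data"]
    model = results["train_model"]
    # Largest absolute deviation from the model's prediction.
    return max(abs(x - model["mean"]) for x in data)


# The pipeline definition: step name -> (function, upstream step names).
PIPELINE = {
    "load_data": (load_data, []),
    "train_model": (train_model, ["load_data"]),
    "evaluate_model": (evaluate_model, ["load_data", "train_model"]),
}


def run_pipeline(pipeline):
    """Run every step once, after all of its upstream steps have finished."""
    graph = {name: set(deps) for name, (_, deps) in pipeline.items()}
    results = {}
    for name in TopologicalSorter(graph).static_order():
        func, _ = pipeline[name]
        results[name] = func(results)
    return results


if __name__ == "__main__":
    print(run_pipeline(PIPELINE))
```

Because the pipeline is just code, it can be versioned, reviewed, and tested like any other code. In a real Kubeflow setup, each step would typically run as its own container on the cluster rather than as a local function call.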
Why Kubeflow might NOT be right for me
Implementation is not straightforward
Kubeflow’s strength lies in its breadth. Its core components and extensions can cover most use cases, but unfortunately, getting your own Kubeflow setup to that point may be a struggle. Kubeflow’s documentation is quite incomplete, and the most common complaint about Kubeflow is how difficult it is to set up.
Data scientists will require Ops support
Data scientists come in many forms; some will have an engineering background while others will not. As with most self-managed systems, debugging and solving issues come with the territory, and Kubeflow is no exception. Will the people using Kubeflow in your team be able to troubleshoot a namespace error, or will it be a blocker?
Perfect for the Kubernetes enthusiast
Kubernetes is Kubeflow’s double-edged sword. Teams that have used K8s to run services before will have an easier time, while for less savvy teams, the learning curve of Kubernetes and Kubeflow may be insurmountable.
That said, from an ease-of-use perspective, Kubeflow doesn’t feel mature enough, particularly for such a complex system. Moreover, it assumes a lot of competency with Kubernetes and/or containers, which frankly is great if you have that and disappointing if you don’t — not every data science team will. — Byron Allen
The bottom line
Apples to oranges: Managed or self-managed?
Kubeflow is ultimately the heftiest open-source platform for MLOps, and that is its greatest strength and greatest weakness. Metaflow is focused on pipelines, MLflow is focused on experimentation, but Kubeflow covers it all. However, the ideal setting for Kubeflow is when you have the organization to support it.
If your company doesn’t have a platform team dedicated to supporting internal tools, it may be better to look for managed solutions instead of self-managed ones. Valohai is a managed MLOps platform that rivals Kubeflow in its scope but doesn’t require DevOps proficiency on your part.