A Comprehensive Comparison Between Metaflow and Airflow

Henrik Skogström

The technology world is always evolving. Some innovations improve existing products, while a few transform an entire industry. One highly disruptive innovation of the last few years is workflow orchestration tooling, which allows companies to manage their workflows more efficiently than ever before. Managing workflows becomes an ever more pressing issue as organizations deal with growing volumes of data.

Given the variety of workflow orchestration tools in existence, data science and ML teams sometimes face the challenge of selecting the most suitable one. Thus, we have taken it upon ourselves to compare the popular orchestration platforms to help such teams make an informed decision.

In this article, you will learn about the key differences and similarities between Metaflow and Airflow.

Components of Metaflow

Metaflow is a Python library that helps teams build production machine learning projects. It was initially developed at Netflix to improve the productivity of data scientists who build and maintain different types of machine learning models. The major components of the Metaflow architecture are as follows:

  • Flow: A flow is the smallest unit of computation that can be scheduled for execution. It defines a workflow that pulls data from an external source as input, processes it, and produces output data. To implement a flow, users subclass FlowSpec and implement steps as methods, along with any parameters or data triggers. The flow code and its external dependencies are encapsulated in the execution environment.

  • Graph: Metaflow deduces a directed acyclic graph (DAG) based on the transitions between step functions. The transitions must be declared explicitly so that the graph can be parsed statically from the source code of the flow.

  • Step: A step can be defined as a checkpoint that provides fault tolerance for the system. Metaflow typically takes a snapshot of the data produced by a step and uses it as input to the subsequent steps. Therefore, if a step fails, it can be resumed without having to rerun the preceding steps. Decorators can be used to modify the behavior of a step. The body of a step is known as step code.

  • Runtime (Scheduler): The runtime or scheduler executes a flow; that is, it executes and orchestrates the tasks defined by steps in topological order. You can use metaflow.client, a Python API, to access the results of runs.

  • Datastore: This is an object store where both data artifacts and code snapshots can be persisted. It should be accessible from all environments where the Metaflow code is executed.
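To make the relationship between these components more concrete, here is a minimal plain-Python sketch of the ideas. It is not Metaflow's API or implementation: the `FLOW` dictionary, `build_order`, `run`, and the in-memory `datastore` are hypothetical names invented for illustration. It shows transitions being checked before anything runs, steps executing in topological order, and a snapshot after each step that lets a rerun skip finished work. For simplicity it assumes a linear flow with a single shared state.

```python
import pickle
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def start(state):
    state["numbers"] = [1, 2, 3]


def square(state):
    state["squares"] = [n * n for n in state["numbers"]]


def end(state):
    state["total"] = sum(state["squares"])


# Hypothetical flow description: step name -> (function, names of next steps).
FLOW = {
    "start": (start, ["square"]),
    "square": (square, ["end"]),
    "end": (end, []),
}


def build_order(flow):
    """Check every transition, then return the steps in topological order."""
    predecessors = {name: set() for name in flow}
    for name, (_, next_steps) in flow.items():
        for target in next_steps:
            if target not in flow:
                raise ValueError(f"step {name!r} points to unknown step {target!r}")
            # TopologicalSorter expects node -> predecessors.
            predecessors[target].add(name)
    return list(TopologicalSorter(predecessors).static_order())


def run(flow, datastore):
    """Run steps in order, snapshotting the state after each one.

    Steps that already have a snapshot are skipped and their state is
    restored, so a rerun only executes what is missing.
    """
    state, executed = {}, []
    for name in build_order(flow):
        if name in datastore:
            state = pickle.loads(datastore[name])
            continue
        flow[name][0](state)
        datastore[name] = pickle.dumps(state)
        executed.append(name)
    return state, executed


datastore = {}  # stands in for an object store
final_state, executed = run(FLOW, datastore)
print(executed, final_state["total"])

del datastore["end"]  # pretend the last step failed and left no snapshot
_, rerun = run(FLOW, datastore)
print(rerun)  # only the missing step is executed again
```

In this sketch the second call to `run` restores the snapshots for `start` and `square` and only executes `end`, which mirrors the resume behaviour described for steps above.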

Components of Airflow

Airflow is an open-source workflow management platform created by Airbnb in 2014 to programmatically author, monitor and schedule the firm's growing workflows. 

Airflow UI

Some of the components of Airflow include the following: 

  • Scheduler: Monitors tasks and DAGs, triggers scheduled workflows, and submits tasks to the executor to run. It is built to run as a persistent service in the Airflow production environment.

  • Executors: These are the mechanisms that run task instances; in the default installation, they run everything inside the scheduler. Executors share a common API, so you can swap them based on your installation requirements. You can have only one executor configured at a time.

  • Webserver: A user interface that displays the status of your jobs and allows you to view, trigger, and debug DAGs and tasks. It also lets you interact with the database and read logs from the remote file store.

  • Metadata database: The metadata database is used by the executor, webserver, and scheduler to store state.
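The division of labour between these components can be illustrated with a small plain-Python sketch. None of the names below come from Airflow: `InlineExecutor`, `ThreadedExecutor`, `schedule`, and the `metadata_db` dictionary are hypothetical stand-ins. The point is only that executors share one interface, so the scheduling code stays the same when you swap one for another, and that task state is recorded in a shared store.

```python
from concurrent.futures import ThreadPoolExecutor


class InlineExecutor:
    """Runs every task directly in the calling (scheduler) process."""

    def run_all(self, tasks):
        return {name: fn() for name, fn in tasks.items()}


class ThreadedExecutor:
    """Hands tasks to a pool of worker threads instead."""

    def __init__(self, workers=2):
        self.workers = workers

    def run_all(self, tasks):
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            futures = {name: pool.submit(fn) for name, fn in tasks.items()}
            return {name: future.result() for name, future in futures.items()}


def schedule(tasks, executor, metadata_db):
    """Submit tasks to whichever executor is configured and record state."""
    for name in tasks:
        metadata_db[name] = "queued"
    results = executor.run_all(tasks)
    for name in results:
        metadata_db[name] = "success"
    return results


# Two independent toy tasks.
tasks = {"extract": lambda: [1, 2, 3], "count": lambda: 3}

for executor in (InlineExecutor(), ThreadedExecutor(workers=2)):
    metadata_db = {}  # stands in for the metadata database
    results = schedule(tasks, executor, metadata_db)
    print(type(executor).__name__, results, metadata_db)
```

Both executors produce the same results here; only where the tasks run changes, which is the property that makes executors swappable.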

Similarities between Metaflow and Airflow

As 'peers' in the workflow orchestration world, Metaflow and Airflow share a few things in common. The main similarities between these two are:

  1. Both Metaflow and Airflow are open-source tools, so anyone can access them from anywhere. Both also have active communities, which makes it easy for users to network and share ideas.

  2. Both platforms have a Scheduler. In Metaflow, the Scheduler is also known as Runtime. It executes and orchestrates tasks defined by steps in a topological order. In Airflow, the Scheduler monitors tasks and DAGs, triggers scheduled workflows and submits tasks to the executor to run.

  3. The two platforms have a user interface. In Airflow, the user interface provides a full overview of the status and logs of all tasks, both completed and ongoing. Metaflow, for its part, recently added a user interface as a separate add-on service.

  4. Both of them utilize Python. In fact, Metaflow is completely built as a Python library. In Airflow, you use Python to define the operators and topology of the workflows.

Differences between Metaflow and Airflow

Metaflow and Airflow have key differences ranging from the core purpose of the platform to the companies that created each platform. We explore some of the differences in this section as follows:

  1. Metaflow was created by Netflix to improve the productivity of data scientists who work extensively on different projects, from classical statistics to deep learning. On the other hand, Airflow was developed by Airbnb to programmatically author, monitor and schedule workflows.

  2. The two platforms have different goals. Metaflow is built primarily for Python-based ML workflows, while Airflow has a strong focus on ETL pipelines.

  3. Metaflow's approach guarantees that all steps fit together before a workflow is executed. In Airflow, great power comes with great responsibility: the building blocks are completely decoupled and not guaranteed to fit together.

  4. Metaflow enables you to pass variables across steps and takes care of serialization and data transfer, making it easy for you to focus on data science. The workflow still works even if the infrastructure underneath changes. State sharing is different in Airflow, where steps are decoupled and responsible for serialization and data transfer themselves, forcing the workflow to be opinionated about infrastructure.
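The difference in state sharing is easier to see side by side. The sketch below is plain Python and uses neither library: `Runtime`, `load`, `sort_rows`, `load_task`, and `sort_task` are hypothetical names, and the file paths are made up. Style A imitates a runtime that persists whatever a step stores, so steps never mention storage. Style B imitates decoupled tasks that must agree on a file format and location themselves.

```python
import json
import pickle
import tempfile
from pathlib import Path


# Style A: the runtime owns serialization; steps only assign values.
class Runtime:
    def __init__(self):
        self.store = {}  # stands in for a datastore

    def run(self, steps):
        state = {}
        for step in steps:
            step(state)
            # The runtime, not the step, decides how state is persisted.
            self.store[step.__name__] = pickle.dumps(state)
        return state


def load(state):
    state["rows"] = [3, 1, 2]


def sort_rows(state):
    state["sorted_rows"] = sorted(state["rows"])


# Style B: each task reads and writes its own files, so the storage format
# and location leak into the workflow code.
def load_task(out_path):
    out_path.write_text(json.dumps([3, 1, 2]))


def sort_task(in_path, out_path):
    rows = json.loads(in_path.read_text())
    out_path.write_text(json.dumps(sorted(rows)))


managed = Runtime().run([load, sort_rows])

with tempfile.TemporaryDirectory() as tmp:
    raw_path = Path(tmp) / "raw.json"
    sorted_path = Path(tmp) / "sorted.json"
    load_task(raw_path)
    sort_task(raw_path, sorted_path)
    explicit = json.loads(sorted_path.read_text())

print(managed["sorted_rows"], explicit)
```

Both styles compute the same result. In Style A, changing where state is stored means changing `Runtime` only; in Style B, every task that touches the files would need to change, which is what it means for a workflow to be opinionated about infrastructure.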

Summary

In a nutshell, if you want a scalable platform that allows you to manage ML workflows, Metaflow is a better choice as it has a wide variety of ML use cases. On the other hand, if you want to orchestrate a variety of tasks, Airflow is a viable option as it is a generic task orchestration platform.

Valohai as an Alternative for Metaflow and Airflow

[CAUTION: Opinions ahead] We are not going to argue against using tools like Airflow to build a single workflow. That makes total sense. We are also big fans of Metaflow. But we believe that there are better ways to solve tasks that these tools solve. And that would require a managed platform.

Warning

With Metaflow, you'll likely be looking at building production pipelines with it and supplementing other areas with tools such as BentoML (model deployment). This may be a good approach as you can adopt these parts as you need them, but ultimately adopting more open-source tools comes with more overhead.

Airflow is probably the more viable option if you need a platform to run a variety of tasks (and maybe you are just automating a single model workflow), as it is a generic task orchestration platform.

However, if your objective is to build a full-fledged MLOps stack, both options laid out above are hefty investments in terms of time and effort. The third option you might not be considering is a managed MLOps platform, namely Valohai. For many (dare we say most 😅) organizations, a managed alternative is a shortcut to embracing MLOps with all of its perks.

Screenshot of Valohai executions

Valohai is a managed platform with a two-week free trial period. We'll set up the platform on your infrastructure during the trial and go through an onboarding session to get you started. Our pricing model is based on the number of users, so the cost doesn't scale with your utilization of the platform, saving you plenty in the long run. In addition, the computation you use is billed by whichever cloud vendor you use, and we never try to take a cut of that.

This article continues our series on common tools teams are comparing for various machine learning tasks.
