September 03, 2020 | 4min read
Why Apache Airflow? Interview with Pinterest
Pinterest is an American image sharing and social media service designed for discovering and saving information online using images and, on a smaller scale, GIFs and videos, in the form of pinboards. Pinterest’s goal is to bring everyone inspiration to create the life they love.
We used to have an in-house workflow system called Pinball.
Why have you decided to implement Apache Airflow? What was the problem you were facing and tried to solve?
We decided to use Airflow in 2019. We had performance and scalability issues, were dealing with high maintenance costs, and needed more features than Pinball offered. We also had a deep learning overhead.
Airflow is now the industry standard. It has a good feature alignment, is able to scale, and has a great reputation as the most popular workflow system in the open-source world.
We needed Airflow to be able to schedule workflows for running jobs like Hive, SparkSql, MapReduce, HadoopStreaming and so on.
Yes, we were thinking about Azkaban.
We are currently migrating all workflows from the old system to the Airflow-based tool called Spinner.
What type of deployment of Airflow have you used, and why? Was it determined by your current architecture?
We decoupled user DAGs from system code to make 24/7 deployment possible for the users.
Our team deals with workflows. We also provide internal support for Airflow users: we have an on-call obligation to answer user questions every day. Additionally, we run internal onboarding training sessions for new teams and users.
We faced a few challenges. First, at our current load, it’s not easy to use Airflow out of the box. This is why we introduced a multi-scheduler solution: different schedulers process different DAGs, while users still see a single UI.
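To make the multi-scheduler idea concrete, here is a minimal sketch of one way DAGs could be divided among schedulers. The interview does not say how Pinterest assigns DAGs, so the hash-based rule and the names `assign_scheduler` and `partition_dags` are assumptions for illustration only, not Spinner’s actual design.

```python
import hashlib


def assign_scheduler(dag_id: str, num_schedulers: int) -> int:
    """Map a DAG id to a scheduler index in [0, num_schedulers).

    Hypothetical rule: a stable hash of the DAG id, so every process
    computes the same assignment without coordination.
    """
    if num_schedulers < 1:
        raise ValueError("num_schedulers must be at least 1")
    digest = hashlib.sha256(dag_id.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big") % num_schedulers


def partition_dags(dag_ids, num_schedulers):
    """Group DAG ids by the scheduler that would own them."""
    shards = {i: [] for i in range(num_schedulers)}
    for dag_id in sorted(dag_ids):
        shards[assign_scheduler(dag_id, num_schedulers)].append(dag_id)
    return shards


if __name__ == "__main__":
    # Each scheduler would load only its own shard; the UI reads all of them.
    for index, owned in partition_dags(["ads_daily", "search_hourly", "ml_train"], 2).items():
        print(index, owned)
```

Each DAG lands on exactly one scheduler, and the assignment stays stable across restarts because it depends only on the DAG id and the scheduler count.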
Since our users were familiar with the old DSL, it wasn’t easy for them to migrate old workflows to Airflow. This is why we introduced a migration tool that automatically migrates their workflows to Airflow in just a few clicks. We also wrote a guide and some Pinterest-specific operators so that users can get onboarded to Airflow quickly.
Airflow has an open-source k8s executor. At Pinterest, however, our k8s team uses entirely different APIs for k8s operations, so we had to rewrite the k8s executor to make it compatible with Pinterest k8s clusters.
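One common way to isolate this kind of API difference is to put the cluster calls behind a small interface, so the scheduling logic does not care which API sits underneath. The sketch below is purely illustrative: `ClusterClient`, `InMemoryClusterClient`, and `AdapterExecutor` are hypothetical names, the fake client stands in for an in-house cluster API, and none of this is Airflow’s executor interface or Pinterest’s implementation.

```python
from typing import Dict, List, Protocol


class ClusterClient(Protocol):
    """Hypothetical minimal surface an executor needs from a cluster API."""

    def submit(self, name: str, command: List[str]) -> str: ...

    def status(self, job_id: str) -> str: ...


class InMemoryClusterClient:
    """Fake client for illustration; a real one would call the cluster API."""

    def __init__(self) -> None:
        self._jobs: Dict[str, str] = {}

    def submit(self, name: str, command: List[str]) -> str:
        job_id = f"{name}-{len(self._jobs)}"
        self._jobs[job_id] = "succeeded"  # pretend the job finishes at once
        return job_id

    def status(self, job_id: str) -> str:
        return self._jobs[job_id]


class AdapterExecutor:
    """Tracks submitted tasks and polls the client for terminal states."""

    def __init__(self, client: ClusterClient) -> None:
        self._client = client
        self._running: Dict[str, str] = {}

    def execute_async(self, task_key: str, command: List[str]) -> None:
        self._running[task_key] = self._client.submit(task_key, command)

    def sync(self) -> Dict[str, str]:
        """Return tasks that reached a terminal state and stop tracking them."""
        finished = {}
        for task_key, job_id in list(self._running.items()):
            state = self._client.status(job_id)
            if state in ("succeeded", "failed"):
                finished[task_key] = state
                del self._running[task_key]
        return finished
```

With this shape, supporting a different cluster API means writing another client class rather than changing the executor logic.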
Lastly, Airflow doesn’t support dynamic DAGs by default. Dynamic DAGs are DAGs whose number and types of tasks can differ at runtime. We had to use a database and S3, together with some serialization techniques, and rewrite a fair amount of Airflow code to support dynamic DAGs the way our old system did.
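As a rough illustration of the serialization idea, the sketch below builds a DAG description whose task count depends on runtime input, serializes it to JSON, and restores it. Everything here is an assumption made for the example: `TaskSpec`, `DagSpec`, and `build_spec` are hypothetical names, a plain string stands in for whatever would be written to the database or S3, and the interview does not describe the actual format Pinterest uses.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import List


@dataclass
class TaskSpec:
    task_id: str
    task_type: str  # e.g. "hive" or "spark_sql"; labels are illustrative
    upstream: List[str] = field(default_factory=list)


@dataclass
class DagSpec:
    dag_id: str
    tasks: List[TaskSpec]


def build_spec(dag_id: str, partitions: List[str]) -> DagSpec:
    """Create one processing task per partition plus a final merge task.

    The number of tasks is only known once `partitions` is known,
    which is what makes the DAG dynamic.
    """
    tasks = [TaskSpec(f"process_{p}", "spark_sql") for p in partitions]
    tasks.append(TaskSpec("merge", "hive", upstream=[t.task_id for t in tasks]))
    return DagSpec(dag_id, tasks)


def serialize(spec: DagSpec) -> str:
    """Turn the spec into a JSON string suitable for a DB row or S3 object."""
    return json.dumps(asdict(spec), sort_keys=True)


def deserialize(payload: str) -> DagSpec:
    """Rebuild the spec so every component sees the same task layout."""
    raw = json.loads(payload)
    return DagSpec(raw["dag_id"], [TaskSpec(**t) for t in raw["tasks"]])


if __name__ == "__main__":
    stored = serialize(build_spec("daily_report", ["us", "eu", "apac"]))
    print(len(deserialize(stored).tasks))
```

Persisting the generated layout means the task list is computed once per run and then read back, instead of being recomputed (possibly differently) by each component that parses the DAG.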
The k8s executor is one of the most important reasons for us to use Airflow. It reduces host maintenance work for us and lets us schedule and run more jobs simultaneously. Airflow also has some great features, such as ACLs, auditing, execution stats visualization, and code isolation, some of which were not well integrated in our old system. It has a powerful UI that makes it easy for users to check their DAGs. Users can also perform many actions on DAGs and tasks directly from the UI, which reduces our user support burden.
On the negative side, the learning curve is not as steep as with other workflow systems, but it’s still there. Our team needs to get familiar with Airflow and its DSL for writing DAGs. Users need to learn how to interact with Airflow and write native Airflow DAGs. We also have to rewrite many operators and create new ones because of API differences at Pinterest and our users’ needs. However, with good training and guidance, it’s possible to overcome this obstacle. Additionally, there are some gaps between our old system and Airflow. To make migrating old workflows to Airflow transparent for users, we have to do a lot of work to fill these gaps. Users have also gotten used to some features that are only available in the old system, so we need to bring those to Airflow as well.
I’d recommend it to any company that is using a system to schedule workflows. In general, it’s the best open-source workflow solution, really easy to use, and with great features. It’s also constantly evolving as there are many contributors working on it.
Software Engineer, Pinterest