Vedant Koshatwar

Set up Apache Airflow on your machine in 5 minutes or less!

Apache Airflow is an open-source platform for developing, scheduling, and monitoring batch-oriented workflows. Airflow's extensible Python framework lets you build workflows that connect with virtually any technology. Its web UI lets you visualize a pipeline's dependencies, progress, logs, code, and success status, and trigger tasks by hand. With Airflow, users author workflows as Directed Acyclic Graphs (DAGs) of tasks: each task runs only after the tasks it depends on have finished, and no chain of dependencies can loop back on itself. Let's set up Apache Airflow on your machine in 5 minutes or less. This installation guide is for Mac users.
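To make the DAG idea concrete before installing anything, here is a small sketch in plain Python with no Airflow involved. The task names (extract, transform, load, notify) and the execution_order helper are hypothetical and only illustrate how a set of dependencies determines a run order. Airflow's own scheduler is far more involved than this.

```python
from collections import deque

# Hypothetical pipeline: each task maps to the set of tasks it waits for.
pipeline = {
    "extract": set(),
    "transform": {"extract"},
    "load": {"transform"},
    "notify": {"load"},
}


def execution_order(upstream):
    """Return one valid run order (Kahn's algorithm); raise ValueError on a cycle."""
    remaining = {task: set(deps) for task, deps in upstream.items()}
    # Tasks with no unmet dependencies can run right away.
    ready = deque(sorted(task for task, deps in remaining.items() if not deps))
    order = []
    while ready:
        task = ready.popleft()
        order.append(task)
        # Finishing a task may unblock the tasks that were waiting on it.
        for other, deps in remaining.items():
            if task in deps:
                deps.remove(task)
                if not deps:
                    ready.append(other)
    if len(order) != len(remaining):
        raise ValueError("dependency cycle detected, so this is not a DAG")
    return order


print(execution_order(pipeline))
```

If two tasks depended on each other, no task would ever become ready and the function would raise, which is exactly why the graph has to be acyclic.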


Pre-requisites

Before you get started, make sure you have Docker and Python 3.7+ installed on your machine. Workflows in Airflow are defined as Python code, so some familiarity with Python is assumed.
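If you would rather check the prerequisites from a script, something like the following sketch could work. It uses only the Python standard library. The helper names python_ok and docker_on_path are made up for this example, and finding docker on your PATH does not prove the Docker daemon is actually running.

```python
import shutil
import sys

MIN_PYTHON = (3, 7)  # minimum version stated in this guide


def python_ok(version_info=sys.version_info, minimum=MIN_PYTHON):
    """True when the interpreter is at least the minimum major.minor version."""
    return tuple(version_info[:2]) >= minimum


def docker_on_path():
    """True when a `docker` executable can be found on PATH."""
    return shutil.which("docker") is not None


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:3])
    print("Python", current, "ok" if python_ok() else "too old")
    print("Docker", "found" if docker_on_path() else "not found on PATH")
```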


Step #1: Install Apache Airflow

Create and navigate to the directory where you want to set up Airflow.

mkdir airflow && cd airflow

Note - Airflow installation can be tricky because Airflow is both a library and an application. As a result, a plain pip install like the one below will, from time to time, fail or produce an unusable Airflow installation.

pip install apache-airflow

To get a reproducible installation, pin the Airflow version and pass the matching constraints file instead. The 3.7 at the end of the URL is the Python version, so change it to match the major.minor version of your interpreter (check with python --version).

pip install "apache-airflow[celery]==2.5.1" --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.5.1/constraints-3.7.txt"
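The constraint URL in the command above encodes two versions: the Airflow release and the Python major.minor version. A mismatch between the URL and your interpreter is an easy mistake to make, so here is a small sketch that assembles the command for whichever Python runs it. The function names constraint_url and pip_command are hypothetical. The URL pattern is simply the one used in the command above.

```python
import sys

# Same URL pattern as the install command in this guide.
CONSTRAINT_TEMPLATE = (
    "https://raw.githubusercontent.com/apache/airflow/"
    "constraints-{airflow}/constraints-{python}.txt"
)


def constraint_url(airflow_version, python_version=None):
    """Build the constraints URL; defaults to the running interpreter's version."""
    if python_version is None:
        python_version = "{}.{}".format(*sys.version_info[:2])
    return CONSTRAINT_TEMPLATE.format(airflow=airflow_version, python=python_version)


def pip_command(airflow_version, extras="celery", python_version=None):
    """Return the pip install command as a string (it is not executed here)."""
    url = constraint_url(airflow_version, python_version)
    return 'pip install "apache-airflow[{}]=={}" --constraint "{}"'.format(
        extras, airflow_version, url
    )


print(pip_command("2.5.1"))
```

Running it prints a command you can paste into the terminal. Note that a constraints file only exists for the Python versions a given Airflow release supports, so a very new interpreter may yield a URL that does not resolve.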


Step #2: Verify if the installation has been completed


All you do is type this command:

airflow version

The output should be the installed version number, which is 2.5.1 if you used the pinned install command above.


Step #3: Let's get started

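Create the Airflow home directory using this command. Unless the AIRFLOW_HOME environment variable says otherwise, Airflow keeps its configuration, logs, and metadata database under ~/airflow.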

mkdir -p ~/airflow

Initialize the metadata database, then set up an Airflow user.

airflow db init

airflow users create \
    --username airflow \
    --password airflow \
    --firstname yourFirstName \
    --lastname yourLastName \
    --role Admin \
    --email airflow@example.com

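To run Apache Airflow with Docker, use this command from a directory that contains a docker-compose.yaml file for Airflow (the Airflow documentation publishes one for each release).

# This starts every service defined in docker-compose.yaml
docker-compose up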

Step #4: Access the Airflow UI and start managing your DAGs


Open any browser and go to http://localhost:8080/. Port 8080 is the default port for the Airflow webserver.

After logging in with the Airflow username and password created in Step #3, you should see the Airflow webserver UI.
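If the page does not load, it can help to ask the webserver directly whether it is up. The Airflow webserver exposes a /health endpoint that returns JSON, and the sketch below polls it using only the standard library. The helper names are hypothetical, and the assumption that the payload has metadatabase and scheduler entries, each with a status field, is based on Airflow 2.x. Other versions may report more components.

```python
import json
import urllib.error
import urllib.request


def component_statuses(health):
    """Map each component in a /health payload to its reported status."""
    return {name: info.get("status", "unknown") for name, info in health.items()}


def all_healthy(health, required=("metadatabase", "scheduler")):
    """True when every required component reports the status 'healthy'."""
    statuses = component_statuses(health)
    return all(statuses.get(name) == "healthy" for name in required)


def fetch_health(base_url="http://localhost:8080", timeout=5):
    """Return the parsed /health payload, or None if the webserver is unreachable."""
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return json.load(response)
    except (urllib.error.URLError, OSError, ValueError):
        return None


if __name__ == "__main__":
    health = fetch_health()
    if health is None:
        print("Webserver not reachable yet")
    else:
        print(component_statuses(health))
```

Keeping the parsing in component_statuses separate from the network call makes the logic easy to try out on a hand-written payload, without a running Airflow instance.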


That's it - you're already up and running with Apache Airflow.


Conclusion

Hope you enjoyed this article. It is part of a series of 5-minute articles for anyone looking to quickly get set up with the tools of the modern data stack.




