Meltano is an open-source DataOps tool that is easy to set up and use. It lets you orchestrate the various parts of your data pipeline, from extraction and transformation through to visualization tools. The most interesting aspect, for me, is that it enables data extraction from a large number of different data sources. This article is a quick setup guide for anyone looking to try out Meltano for data extraction in 5 minutes or less (on a Mac).
Pre-requisites
Before you get started, make sure you have Python and pip.
Python: Download and run the latest installer from here
pip: Go to the terminal, and then run this command:
curl https://bootstrap.pypa.io/get-pip.py | python3
Verify that Python is installed by typing the following:
python3 --version
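If you want to check both prerequisites in one go, here is a minimal sketch. The check_tool helper is a hypothetical name made up for this example; it relies only on the shell's built-in command -v lookup.

```shell
# check_tool is a hypothetical helper for this article, not part of Python or Meltano.
# It prints whether a command is available on your PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found"
  else
    echo "$1: missing"
    return 1
  fi
}

# Check the two prerequisites; "|| true" keeps going even if one is missing.
check_tool python3 || true
check_tool pip3 || true
```

If either line reports "missing", go back and install that tool before moving on.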
Step #1: Install Meltano
Create a directory on your machine to hold your Meltano projects, and then navigate to it
mkdir meltano-projects
cd meltano-projects
Install Meltano
pip3 install meltano
Step #2: Verify if the installation has been completed
All you do is type this command:
meltano --version
The output should show the version of Meltano that was installed.
Step #3: Let's get started
Meltano is now installed on your machine. Let's start extracting some data. First, initialize a project; Meltano creates the project folder for you
meltano init <project_name>
This command initializes a new project in the <project_name> directory and creates the files needed to set up your data extraction job.
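Most Meltano commands need to be run from inside the project folder. Here is a minimal sketch of a guard for that, assuming only that meltano init writes a meltano.yml file at the project root; is_meltano_project is a hypothetical helper name, not a Meltano command.

```shell
# is_meltano_project is a hypothetical helper for this article.
# It succeeds when the given directory (default: the current one) contains meltano.yml.
is_meltano_project() {
  [ -f "${1:-.}/meltano.yml" ]
}

if is_meltano_project; then
  echo "Inside a Meltano project"
else
  echo "No meltano.yml here; cd into your project folder first"
fi
```

Run it after cd <project_name> and you should see the first message.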
Step #4: Setting up your extractors, destinations and utilities
Meltano calls its data sources "extractors" and its destinations "loaders". The underlying Singer connectors are known as "taps" and "targets", which is why the plugin names start with tap- and target-. Setting up these extractors and loaders is quite simple. To discover which extractors, loaders and other utilities your version of Meltano ships with, type the following command (make sure you navigate to your project folder first!)
meltano discover
You should go ahead and add at least one extractor and one loader to get started. For this example, we will add a Google Analytics extractor and a Snowflake loader, which means we extract data from Google Analytics and load it into Snowflake. Here are the commands to get this done
meltano add extractor tap-google-analytics
meltano add loader target-snowflake
The last step before running a Meltano job is to configure the extractor and the loader. We do this by typing the following commands and following the prompts shown by the system.
meltano config tap-google-analytics set --interactive
meltano config target-snowflake set --interactive
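The interactive prompt is handy on your laptop, but settings can also be supplied through environment variables, which Meltano looks up as <PLUGIN_NAME>_<SETTING_NAME>. Here is a small sketch of that naming rule. meltano_env_var is a hypothetical helper, and start_date is used only as an illustrative setting name (run meltano config tap-google-analytics list to see the real ones for your plugin).

```shell
# meltano_env_var is a hypothetical helper for this article.
# It builds the variable name from a plugin name and a setting name:
# upper-case everything and turn dashes into underscores.
meltano_env_var() {
  printf '%s_%s\n' "$1" "$2" | tr 'a-z-' 'A-Z_'
}

# start_date is an illustrative setting name, assumed for this example.
meltano_env_var tap-google-analytics start_date
# prints TAP_GOOGLE_ANALYTICS_START_DATE
```

This only covers simple setting names; check the Meltano documentation before relying on it for nested settings.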
Note - Extracting from Google Analytics requires that you already have a service account set up. Check this page for more.
Now, let's run your first meltano job! Type the command and watch the magic unfold
meltano run tap-google-analytics target-snowflake
This command uses the Google Analytics extractor and sends the extracted data to your configured target in Snowflake. There are a few steps here that we have not covered (for example, configuring your entities in Google Analytics); those are for a later article.
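If you plan to run the job from a script or a scheduler, it helps to check whether it succeeded. Here is a minimal sketch, assuming meltano run exits with a non-zero status when the job fails. run_pipeline and the MELTANO_BIN override are hypothetical names introduced for this example, not Meltano features.

```shell
# MELTANO_BIN is a hypothetical override so the wrapper can be tried with a stand-in command.
MELTANO_BIN="${MELTANO_BIN:-meltano}"

# run_pipeline is a hypothetical helper: $1 is the extractor, $2 is the loader.
run_pipeline() {
  if "$MELTANO_BIN" run "$1" "$2"; then
    echo "pipeline $1 -> $2 succeeded"
  else
    status=$?
    echo "pipeline $1 -> $2 failed with exit code $status" >&2
    return "$status"
  fi
}
```

From your project folder, call it as run_pipeline tap-google-analytics target-snowflake.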
That's it - you're already up and running with meltano. Here's what you can do next
Meltano is a very rich DataOps orchestrator; discover more in the official documentation pages. We hope this article helped you get a quick start with Meltano!
This article is part of a series of 5-minute articles for anyone looking to quickly get set up on the tools of the modern data stack.