
OtmaneDaoudi/azure-migration-pipeline


Project Overview

The goal of this project is to migrate an on-premises MySQL database to a cloud data warehouse that serves the business's analytical needs, by building an ELT pipeline that implements the medallion architecture (bronze, silver, and gold layers).

Architecture


Data Ingestion & Loading

  • The on-premises host is connected to the cloud through a self-hosted Microsoft Integration Runtime, which lets Azure Data Factory reach the local MySQL instance securely.


  • An Azure Data Factory (ADF) pipeline then connects to the host, copies the data from all tables, and loads it into the bronze container of Azure Data Lake Storage in Parquet format.
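ADF pipelines are authored as JSON, so the copy step above can be sketched roughly as follows. This is a hedged illustration, not the project's actual pipeline definition: the activity name, dataset names (MySqlTable, BronzeParquet), and property layout are assumptions.

```json
{
  "name": "CopyMySqlToBronze",
  "type": "Copy",
  "inputs": [ { "referenceName": "MySqlTable", "type": "DatasetReference" } ],
  "outputs": [ { "referenceName": "BronzeParquet", "type": "DatasetReference" } ],
  "typeProperties": {
    "source": { "type": "MySqlSource" },
    "sink": { "type": "ParquetSink" }
  }
}
```

In practice one such copy activity (often inside a ForEach over a table list) is run per source table, with the sink dataset pointing at the bronze container.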


Data Transformation

  • A second step in the ADF pipeline connects to a Databricks cluster and invokes the first transformation phase (bronze to silver) by executing a notebook that filters and formats the data; the result is stored in Delta format in the silver container of Azure Data Lake Storage.
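The bronze-to-silver notebook would typically be written in PySpark on the Databricks cluster; as a minimal, language-agnostic sketch of the kind of row-level filtering and formatting involved, here is a plain-Python version operating on dict rows. The column names (order_id, order_date) and the date-format fix are hypothetical examples, not the project's actual schema.

```python
from datetime import datetime

def bronze_to_silver(rows):
    """Filter out incomplete rows and normalize the date format.

    `rows` stand in for records read from the bronze Parquet files;
    the column names here are hypothetical.
    """
    silver = []
    for row in rows:
        if row.get("order_id") is None:  # drop rows missing the key
            continue
        # normalize 'DD/MM/YYYY' strings to ISO 'YYYY-MM-DD'
        parsed = datetime.strptime(row["order_date"], "%d/%m/%Y")
        silver.append({**row, "order_date": parsed.date().isoformat()})
    return silver

rows = [
    {"order_id": 1, "order_date": "05/03/2024"},
    {"order_id": None, "order_date": "06/03/2024"},  # filtered out
]
print(bronze_to_silver(rows))  # [{'order_id': 1, 'order_date': '2024-03-05'}]
```

In the real notebook the same steps map onto DataFrame operations (a filter on null keys and a date-parsing column expression), with the output written in Delta format to the silver container.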


  • The last transformation phase (silver to gold) is performed in the same manner, placing the cleaned, analytics-ready data in the gold container.
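The silver-to-gold phase typically reshapes cleaned data into analytics-ready aggregates. As before, this is a hedged plain-Python sketch (the actual notebook would use PySpark), and the columns (customer_id, amount) are hypothetical.

```python
from collections import defaultdict

def silver_to_gold(rows):
    """Aggregate silver rows into per-customer totals (a gold-layer shape).

    Hypothetical columns: customer_id, amount.
    """
    totals = defaultdict(float)
    for row in rows:
        totals[row["customer_id"]] += row["amount"]
    # one gold row per customer, sorted for a stable output
    return [{"customer_id": c, "total_amount": t} for c, t in sorted(totals.items())]

rows = [
    {"customer_id": "A", "amount": 10.0},
    {"customer_id": "B", "amount": 5.0},
    {"customer_id": "A", "amount": 2.5},
]
print(silver_to_gold(rows))
```

The PySpark equivalent is a groupBy/sum over the silver Delta table, written out in Delta format to the gold container.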

Data Modelling

Using an Azure Synapse serverless SQL pool as the analytics engine to query the data in the gold container directly, I created views reflecting the different fact and dimension tables.
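In a Synapse serverless SQL pool, such views are typically defined over the lake with OPENROWSET. The sketch below is illustrative only: the view name, storage path placeholder, and table layout are assumptions, not the project's actual definitions.

```sql
-- Hypothetical view over a Delta table in the gold container;
-- <storage-account> is a placeholder, not a real account name.
CREATE VIEW dbo.FactSales AS
SELECT *
FROM OPENROWSET(
    BULK 'https://<storage-account>.dfs.core.windows.net/gold/sales/',
    FORMAT = 'DELTA'
) AS sales;
```

Because the pool is serverless, each query against the view scans the files in place; no data is copied into Synapse.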

Synapse pipeline


Analytics & Reporting

Reporting is done with Power BI, which connects to the Azure Synapse serverless SQL endpoint and loads the views.

Dashboard


Final thoughts

  • The sole goal of this project is to learn and gain exposure to data engineering on Azure, so some of the technologies and services used are overkill for such a small amount of data.
  • Since the transformations performed are light, doing them directly in Azure Data Factory would be a more efficient and cost-effective approach.