Add dbt for data modeling #248

Open
wants to merge 8 commits into
base: main
Changes from 2 commits
2 changes: 2 additions & 0 deletions Dockerfile
@@ -37,6 +37,8 @@ COPY quotaclimat ./quotaclimat
COPY postgres ./postgres
COPY alembic/ ./alembic
COPY transform_program.py ./transform_program.py
COPY _dbt/ ./_dbt
COPY profiles.yml ./profiles.yml

# Docker compose overwrite this config to have only one Dockerfile
CMD ["ls"]
4 changes: 4 additions & 0 deletions _dbt/.gitignore
@@ -0,0 +1,4 @@

target/
dbt_packages/
logs/
15 changes: 15 additions & 0 deletions _dbt/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,15 @@
Welcome to your new dbt project!

### Using the starter project

Try running the following commands:
- dbt run
- dbt test

Review comment (Collaborator):

For production, I imagine we will need to add `dbt run` to the Docker image's startup bash script:

https://github.com/dataforgoodfr/quotaclimat/blob/main/docker-entrypoint.sh#L5

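A minimal sketch of what that startup step could look like. The contents of `docker-entrypoint.sh` are not shown in this PR, so everything here is an assumption: the function name `run_dbt_models`, the `DBT_CMD` override (made up so the step can be dry-run without dbt installed), and `/app` as the directory holding `profiles.yml` (inferred from the volume mounts in `docker-compose.yml`). The `--project-dir` and `--profiles-dir` flags are standard dbt CLI options.

```shell
#!/bin/sh
# Hypothetical excerpt for a container startup script; not the project's actual entrypoint.
set -eu

# DBT_CMD is an illustrative override, e.g. DBT_CMD=echo for a dry run.
DBT_CMD="${DBT_CMD:-poetry run dbt}"
# Paths assumed from the docker-compose volume mounts (/app/_dbt, /app/profiles.yml).
DBT_PROJECT_DIR="${DBT_PROJECT_DIR:-/app/_dbt}"
DBT_PROFILES_DIR="${DBT_PROFILES_DIR:-/app}"

run_dbt_models() {
  # DBT_CMD is left unquoted on purpose so "poetry run dbt" splits into three words.
  $DBT_CMD run --project-dir "$DBT_PROJECT_DIR" --profiles-dir "$DBT_PROFILES_DIR"
}
```

The startup script would call `run_dbt_models` once the database migrations have finished. Passing `--profiles-dir` explicitly avoids depending on the container's working directory.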
### Resources:
- Learn more about dbt [in the docs](https://docs.getdbt.com/docs/introduction)
- Check out [Discourse](https://discourse.getdbt.com/) for commonly asked questions and answers
- Join the [chat](https://community.getdbt.com/) on Slack for live discussions and support
- Find [dbt events](https://events.getdbt.com) near you
- Check out [the blog](https://blog.getdbt.com/) for the latest news on dbt's development and best practices
Empty file added _dbt/analyses/.gitkeep
Empty file.
36 changes: 36 additions & 0 deletions _dbt/dbt_project.yml
@@ -0,0 +1,36 @@

# Name your project! Project names should contain only lowercase characters
# and underscores. A good package name should reflect your organization's
# name or the intended use of these models
name: '_dbt'
version: '1.0.0'

# This setting configures which "profile" dbt uses for this project.
profile: '_dbt'

# These configurations specify where dbt should look for different types of files.
# The `model-paths` config, for example, states that models in this project can be
# found in the "models/" directory. You probably won't need to change these!
model-paths: ["models"]
analysis-paths: ["analyses"]
test-paths: ["tests"]
seed-paths: ["seeds"]
macro-paths: ["macros"]
snapshot-paths: ["snapshots"]

clean-targets: # directories to be removed by `dbt clean`
- "target"
- "dbt_packages"


# Configuring models
# Full documentation: https://docs.getdbt.com/docs/configuring-models

# In this example config, we tell dbt to build all models in the example/
# directory as views. These settings can be overridden in the individual model
# files using the `{{ config(...) }}` macro.
models:
  _dbt:
    # Config indicated by + and applies to all files under models/example/
    example:
      +materialized: view

Review comment (Collaborator):

I imagine this config is what lets _dbt/models/intermediate/int_keywords_aggregated_by_days_and_channel.sql become a materialized view?

Empty file added _dbt/macros/.gitkeep
Empty file.
27 changes: 27 additions & 0 deletions _dbt/models/example/my_first_dbt_model.sql
@@ -0,0 +1,27 @@

/*
Welcome to your first dbt model!
Did you know that you can also configure models directly within SQL files?
This will override configurations stated in dbt_project.yml
Try changing "table" to "view" below
*/

{{ config(materialized='table') }}

with source_data as (

-- Review comment (Collaborator):
-- This is where we would put the SQL query we want to turn into a view, is that right?

select 1 as id
union all
select null as id

)

select *
from source_data

/*
Uncomment the line below to remove records with null `id` values
*/

-- where id is not null
6 changes: 6 additions & 0 deletions _dbt/models/example/my_second_dbt_model.sql
@@ -0,0 +1,6 @@

-- Use the `ref` function to select from other models

select *
from {{ ref('my_first_dbt_model') }}
where id = 1
21 changes: 21 additions & 0 deletions _dbt/models/example/schema.yml
@@ -0,0 +1,21 @@

version: 2

models:
  - name: my_first_dbt_model

# Review comment (Collaborator):
# I imagine this is where we would describe the Keywords model.

    description: "A starter dbt model"
    columns:
      - name: id
        description: "The primary key for this table"
        data_tests:
          - unique
          - not_null

  - name: my_second_dbt_model
    description: "A starter dbt model"
    columns:
      - name: id
        description: "The primary key for this table"
        data_tests:
          - unique
          - not_null
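
The `data_tests` declared above run with `dbt test`. A minimal sketch of a helper that runs only the tests attached to the example models, assuming the same container paths as the `dbt_runner` service; the function name `test_example_models` and the `DBT_CMD` override are hypothetical. The `path:` selector method is part of dbt's node selection syntax and is resolved relative to the project directory.

```shell
#!/bin/sh
# Hypothetical helper; not part of this PR.
set -eu

# DBT_CMD is an illustrative override, e.g. DBT_CMD=echo for a dry run.
DBT_CMD="${DBT_CMD:-poetry run dbt}"
# Project path assumed from the dbt_runner entrypoint in docker-compose.yml.
DBT_PROJECT_DIR="${DBT_PROJECT_DIR:-/app/_dbt}"

test_example_models() {
  # Restrict the run to tests on models stored under models/example/.
  # DBT_CMD is left unquoted on purpose so "poetry run dbt" splits into three words.
  $DBT_CMD test --project-dir "$DBT_PROJECT_DIR" --select "path:models/example"
}
```

With the starter models unchanged, the `not_null` test on `my_first_dbt_model` is expected to fail, because that model selects a `null` id until its `where id is not null` line is uncommented.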
Empty file added _dbt/seeds/.gitkeep
Empty file.
Empty file added _dbt/snapshots/.gitkeep
Empty file.
Empty file added _dbt/tests/.gitkeep
Empty file.
25 changes: 25 additions & 0 deletions docker-compose.yml
@@ -26,6 +26,8 @@ services:
      - ./pyproject.toml:/app/pyproject.toml
      - ./alembic:/app/alembic
      - ./alembic.ini:/app/alembic.ini
      - ./_dbt:/app/_dbt
      - ./profiles.yml:/app/profiles.yml
    depends_on:
      nginxtest:
        condition: service_healthy
@@ -107,6 +109,29 @@ services:
      postgres_db:
        condition: service_healthy

  dbt_runner:
    build:
      context: ./
      dockerfile: Dockerfile
    entrypoint: [ "poetry", "run", "dbt", "run", "--project-dir", "/app/_dbt" ]
    environment:
      ENV: docker
      PYTHONPATH: /app
      POSTGRES_USER: user
      POSTGRES_DB: barometre
      POSTGRES_PASSWORD: password
      POSTGRES_HOST: postgres_db
      POSTGRES_PORT: 5432
    tty: true
    volumes:
      - ./quotaclimat/:/app/quotaclimat/
      - ./postgres/:/app/postgres/
      - ./_dbt:/app/_dbt
      - ./profiles.yml:/app/profiles.yml
    depends_on:
      postgres_db:
        condition: service_healthy

  postgres_db:
    image: postgres:15
    ports:
12 changes: 12 additions & 0 deletions profiles.yml
@@ -0,0 +1,12 @@
_dbt:
  target: dev
  outputs:
    dev:
      type: postgres
      host: postgres_db
      user: user
      password: password
      port: 5432
      dbname: barometre
      schema: public
      threads: 1
2 changes: 2 additions & 0 deletions pyproject.toml
@@ -31,6 +31,8 @@ sentry-sdk = "^2.13.0"
modin = {extras = ["ray"], version = "^0.32.0"}
numpy = "1.26.4"
openpyxl = "^3.1.5"
dbt-core = "^1.8.7"
dbt-postgres = "^1.8.2"
[build-system]
requires = ["poetry-core>=1.1"]
build-backend = "poetry.core.masonry.api"