Make clean_stale_db_objects configurable #142

Merged: 1 commit, Sep 13, 2024
37 changes: 12 additions & 25 deletions README.md
@@ -12,38 +12,25 @@ An Airflow-based dashboard for the LA Metro ETL pipeline!

 Perform the following steps from your terminal.

-1. Clone this repository and its submodule, then `cd` into the superproject.
+1. Clone [the LA Metro Councilmatic repository](https://github.com/Metro-Records/la-metro-councilmatic) and follow the instructions in its
+README to build and run the application.

-```bash
-git clone --recursive https://github.com/Metro-Records/la-metro-dashboard.git
-cd la-metro-dashboard
-```
-2. Build `la-metro-dashboard` application, and create a local `.env` file. Fill
-in the absolute location of your GPG keyring, usually the absolute path for `~/.gnupg`.
+2. Clone this repository and create a local `.env` file.

-```bash
-docker-compose build
-cp .env.example .env
-# Fill in the correct value for GPG_KEYRING_PATH
-```

-3. Once the command exits, follow the instructions to build the [LA Metro Councilmatic application](https://github.com/Metro-Records/la-metro-councilmatic#setup)
+```bash
+cp .env.example .env
+```

-4. In order to run the `la-metro-dashboard` application, the `la-metro-councilmatic`
-app must already be running. Open a new shell, move into the `la-metro-councilmatic`
-application, and run it.
+Fill in the absolute location of your GPG keyring, usually the absolute path for `~/.gnupg`.

-```bash
-cd la-metro-councilmatic && docker-compose up app
-```

-Once la-metro-councilmatic is running, in your first shell, run the la-metro-dashboard application.
+3. Build and run the dashboard:

-```bash
-docker-compose up
-```
+```bash
+docker-compose up
+```

-5. Finally, to visit the dashboard app, go to http://localhost:8080/admin/. The
+4. Finally, to visit the dashboard app, go to http://localhost:8080/admin/. The
 Councilmatic app runs on http://localhost:8001/.

 See the Airflow documentation for more on [navigating the UI](https://airflow.apache.org/docs/stable/ui.html)
31 changes: 23 additions & 8 deletions dags/clean_stale_db_objects.py
@@ -1,14 +1,14 @@
 from datetime import timedelta

-from airflow import DAG
+from airflow.decorators import dag, task

 from constants import (
     LA_METRO_DATABASE_URL,
     LA_METRO_SEARCH_URL,
     LA_METRO_DOCKER_IMAGE_TAG,
     LA_METRO_STAGING_DATABASE_URL,
     START_DATE,
-    LA_SCRAPERS_IMAGE_URL
+    LA_SCRAPERS_IMAGE_URL,
 )
 from operators.blackbox_docker_operator import BlackboxDockerOperator

@@ -41,16 +41,31 @@
     },
 }

-with DAG(
-    "clean_stale_db_objects",
-    default_args=default_args,
+
+@dag(
     schedule_interval="0 0 * * 0",
     description="Deletes objects from the database that have not "
     "been seen in a recent scrape",
-) as dag:
+    default_args=default_args,
+    params={"window": 7, "max": 25, "report": False},
+)
+def clean_stale_db_objects(window=7, max=25, report=False):
+    @task
+    def get_flags(**kwargs):
+        if kwargs["params"]["report"]:
+            return "--report"
+        else:
+            return f"--window={kwargs['params']['window']} --max={kwargs['params']['max']} --yes"

-    BlackboxDockerOperator(
+    flags = get_flags()
+
+    pupa_clean = BlackboxDockerOperator(
         task_id="clean_stale_db_objects",
         environment=docker_base_environment,
-        command="pupa clean --noinput",
+        command=f"pupa clean {flags}",
     )
+
+    flags >> pupa_clean
+
+
+clean_stale_db_objects()
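
The flag selection in `get_flags` can be checked outside Airflow. A minimal sketch, assuming a hypothetical helper `build_clean_flags` (not part of this PR) that receives the same `params` dictionary the task reads from its context; the default values are the ones passed as `params` to `@dag` in this diff:

```python
def build_clean_flags(params):
    """Hypothetical stand-in for get_flags from the diff; not part of the PR."""
    if params["report"]:
        # Report mode passes only --report, as in the diff.
        return "--report"
    # Otherwise pass the window and max values plus --yes, as in the diff.
    return f"--window={params['window']} --max={params['max']} --yes"


# Defaults copied from the params argument of @dag in the diff.
default_params = {"window": 7, "max": 25, "report": False}

# Command string the operator would receive with the default params.
default_command = "pupa clean " + build_clean_flags(default_params)

# Command string when a run sets report to True.
report_command = "pupa clean " + build_clean_flags({**default_params, "report": True})

print(default_command)
print(report_command)
```

Keeping the string construction in a plain function like this would make the two branches easy to unit test without a running scheduler; the DAG in the diff builds the same strings inside the `@task`.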
1 change: 1 addition & 0 deletions docker-compose.yml
@@ -82,3 +82,4 @@ volumes:
 networks:
   app_net:
     name: la-metro-councilmatic_default
+    external: true