Merge branch 'aux/8.0.0' into docs/glossary
droserasprout authored Sep 8, 2024
2 parents 06ae814 + 85467b0 commit af9f8da
Showing 298 changed files with 1,901 additions and 2,882 deletions.
1,194 changes: 30 additions & 1,164 deletions CHANGELOG.md

Large diffs are not rendered by default.

11 changes: 5 additions & 6 deletions Makefile
@@ -88,13 +88,12 @@ demos: ## Recreate demo projects from templates
python scripts/demos.py init ${DEMO}
make format lint

demos_refresh:
for demo in `ls src | grep demo | grep -v etherlink`; do cd src/$$demo && dipdup init -b -f && cd ../..; done
make format lint

before_release: ## Prepare for a new release after updating version in pyproject.toml
make format
make lint
make update
make demos
make test
make docs
make format lint update demos test docs

jsonschemas: ## Dump config JSON schemas
python scripts/docs.py dump-jsonschema
12 changes: 6 additions & 6 deletions benchmarks/Makefile
@@ -1,20 +1,20 @@
SHELL=/bin/bash
SHELL=/usr/bin/zsh
DEMO=demo_evm_events

run_in_memory:
time dipdup -c ../src/${DEMO} -c ./oneshot_${DEMO}.yaml run

run_in_postgres:
touch ../src/${DEMO}/deploy/test.env && \
echo "HASURA_SECRET=test" > ../src/${DEMO}/deploy/test.env && \
echo "POSTGRES_PASSWORD=test" >> ../src/${DEMO}/deploy/test.env && \
cd ../src/${DEMO}/deploy && docker-compose --env-file test.env up -d db
touch ../src/${DEMO}/deploy/.env && \
echo "HASURA_SECRET=test" > ../src/${DEMO}/deploy/.env && \
echo "POSTGRES_PASSWORD=test" >> ../src/${DEMO}/deploy/.env && \
cd ../src/${DEMO}/deploy && docker-compose --env-file .env up -d db

export POSTGRES_PORT=`docker port ${DEMO}-db-1 5432 | cut -d: -f2` && \
time dipdup -c ../src/${DEMO} -c ./oneshot_${DEMO}.yaml -c ./local_postgres.yaml run

down:
cd ../src/${DEMO}/deploy && docker-compose down && rm test.env
cd ../src/${DEMO}/deploy && docker-compose down && rm .env
docker volume rm -f ${DEMO}_db

cpu_up:
36 changes: 18 additions & 18 deletions benchmarks/README.md
@@ -44,31 +44,31 @@ See the Makefile for details.
- interval: 10,000,000 to 10,100,000 (100,000 levels, 93,745 non-empty)
- database: in-memory sqlite

| run | time | bps | vs. asyncio | vs. 7.5 |
| ---------------- | ---------------------------------------------------- | --- | ----------- | ------- |
| 7.5.9, asyncio | 1044,56s user 258,07s system 102% cpu 21:06,02 total | 79 | | |
| 7.5.10, uvloop | 924,94s user 182,33s system 102% cpu 18:04,67 total | 92 | 1.15 | |
| 8.0.0b4, asyncio | 832,32s user 163,20s system 101% cpu 16:19,93 total | 102 | | 1.29 |
| 8.0.0b5, uvloop | 730,58s user 88,67s system 98% cpu 13:48,46 total | 121 | 1.18 | 1.31 |
| run | time | bps | vs. asyncio | vs. 7.5 |
| ---------------- | ---------------------------------------------------- | --------- | ----------- | ------- |
| 7.5.9, asyncio | 1044,56s user 258,07s system 102% cpu 21:06,02 total | 79 | | |
| 7.5.10, uvloop | 924,94s user 182,33s system 102% cpu 18:04,67 total | 92 | 1.15 | |
| 8.0.0b4, asyncio | 832,32s user 163,20s system 101% cpu 16:19,93 total | 102 | | 1.29 |
| 8.0.0, uvloop | 721,13s user 84,17s system 98% cpu 13:33,88 total | 123 (116) | 1.18 | 1.31 |
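The `bps` figures are consistent with indexed levels divided by total wall-clock time; a quick sanity check (the formula is an inference from the numbers, not stated in the benchmark notes):

```python
# Levels per second from the zsh `time` output (MM:SS,ss total),
# assuming bps = 100,000 levels / total wall-clock seconds.
def bps(levels: int, minutes: int, seconds: float) -> int:
    return round(levels / (minutes * 60 + seconds))

print(bps(100_000, 21, 6.02))   # 7.5.9, asyncio -> 79
print(bps(100_000, 13, 33.88))  # 8.0.0, uvloop  -> 123
```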

#### Without CPU boost

The same tests were run with frequency boost disabled (boost raises the clock from the 2.9 GHz base up to 4.2 GHz). This gives some sense of the impact of raw CPU performance.

Run `echo 0 | sudo tee /sys/devices/system/cpu/cpufreq/boost`.

| run | time | bps | vs. boost |
| ------------------------- | ---------------------------------------------------- | --- | --------- |
| 7.5.10, uvloop, no boost | 1329,36s user 231,93s system 101% cpu 25:31,69 total | 65 | 0.82 |
| 8.0.0b5, uvloop, no boost | 1048,85s user 115,34s system 99% cpu 19:35,61 total | 85 | 0.70 |
| run | time | bps | vs. boost |
| ------------------------ | ---------------------------------------------------- | --- | --------- |
| 7.5.10, uvloop, no boost | 1329,36s user 231,93s system 101% cpu 25:31,69 total | 65 | 0.82 |
| 8.0.0, uvloop, no boost | 1048,85s user 115,34s system 99% cpu 19:35,61 total | 85 | 0.70 |

In the subsequent runs, we will skip the 7.5 branch; speedup vs 8.0 is pretty stable.

#### With PostgreSQL

| run | time | bps | vs. in-memory |
| --------------- | --------------------------------------------- | --- | ------------- |
| 8.0.0b5, uvloop | real 36m30,878s user 17m23,406s sys 3m38,196s | 46 | 0.38 |
| run | time | bps | vs. in-memory |
| ------------- | --------------------------------------------------- | ------- | ------------- |
| 8.0.0, uvloop | 1083,66s user 214,23s system 57% cpu 37:33,04 total | 46 (42) | 0.36 |

### starknet.events

@@ -79,13 +79,13 @@ In the subsequent runs, we will skip the 7.5 branch; speedup vs 8.0 is pretty stable.
| run | time | bps | speedup |
| ---------------- | ------------------------------------------------- | --- | ------- |
| 8.0.0b4, asyncio | 246,94s user 61,67s system 100% cpu 5:07,54 total | 326 | 1 |
| 8.0.0b5, uvloop | 213,01s user 33,22s system 96% cpu 4:14,32 total | 394 | 1.20 |
| 8.0.0, uvloop | 213,01s user 33,22s system 96% cpu 4:14,32 total | 394 | 1.20 |

#### With PostgreSQL

| run | time | bps | vs. in-memory |
| --------------- | ------------------------------------------- | --- | ------------- |
| 8.0.0b5, uvloop | real 12m6,394s user 5m24,683s sys 1m14,761s | 138 | 0.35 |
| run | time | bps | vs. in-memory |
| ------------- | ------------------------------------------- | --- | ------------- |
| 8.0.0, uvloop | real 12m6,394s user 5m24,683s sys 1m14,761s | 138 | 0.35 |

### tezos.big_maps

@@ -98,4 +98,4 @@ Only our code. And only 7% of blocks are non-empty.
| run | time | bps | speedup |
| ---------------- | ------------------------------------------------ | ---------- | ------- |
| 8.0.0b4, asyncio | 136,63s user 17,91s system 98% cpu 2:37,40 total | 3185 (221) | 1 |
| 8.0.0b5, uvloop | 124,44s user 9,75s system 98% cpu 2:16,80 total | 3650 (254) | 1.15 |
| 8.0.0, uvloop | 124,44s user 9,75s system 98% cpu 2:16,80 total | 3650 (254) | 1.15 |
4 changes: 2 additions & 2 deletions docs/0.quickstart-evm.md
@@ -15,7 +15,7 @@ Let's create an indexer for the [USDt token contract](https://etherscan.io/addre

A modern Linux/macOS distribution with Python 3.12 installed is required to run DipDup.

The easiest way to install DipDup as a CLI application [pipx](https://pipx.pypa.io/stable/). We have a convenient wrapper script that installs DipDup for the current user. Run the following command in your terminal:
The recommended way to install DipDup CLI is [pipx](https://pipx.pypa.io/stable/). We also provide a convenient helper script that installs all necessary tools. Run the following command in your terminal:

{{ #include _curl-spell.md }}

@@ -157,6 +157,6 @@ If you use SQLite, run this query to check the data:
sqlite3 demo_evm_events.sqlite 'SELECT * FROM holder LIMIT 10'
```

If you run a Compose stack, check open `http://127.0.0.1:8080` in your browser to see the Hasura console (an exposed port may differ). You can use it to explore the database and build GraphQL queries.
If you run a Compose stack, open `http://127.0.0.1:8080` in your browser to see the Hasura console (an exposed port may differ). You can use it to explore the database and build GraphQL queries.

Congratulations! You've just created your first DipDup indexer. Proceed to the Getting Started section to learn more about DipDup configuration and features.
4 changes: 2 additions & 2 deletions docs/0.quickstart-starknet.md
@@ -15,7 +15,7 @@ Let's create an indexer for the [USDt token contract](https://starkscan.co/contr

A modern Linux/macOS distribution with Python 3.12 installed is required to run DipDup.

The easiest way to install DipDup as a CLI application [pipx](https://pipx.pypa.io/stable/). We have a convenient wrapper script that installs DipDup for the current user. Run the following command in your terminal:
The recommended way to install DipDup CLI is [pipx](https://pipx.pypa.io/stable/). We also provide a convenient helper script that installs all necessary tools. Run the following command in your terminal:

{{ #include _curl-spell.md }}

Expand Down Expand Up @@ -157,6 +157,6 @@ If you use SQLite, run this query to check the data:
sqlite3 demo_starknet_events.sqlite 'SELECT * FROM holder LIMIT 10'
```

If you run a Compose stack, check open `http://127.0.0.1:8080` in your browser to see the Hasura console (an exposed port may differ). You can use it to explore the database and build GraphQL queries.
If you run a Compose stack, open `http://127.0.0.1:8080` in your browser to see the Hasura console (an exposed port may differ). You can use it to explore the database and build GraphQL queries.

Congratulations! You've just created your first DipDup indexer. Proceed to the Getting Started section to learn more about DipDup configuration and features.
4 changes: 2 additions & 2 deletions docs/0.quickstart-tezos.md
@@ -15,7 +15,7 @@ Let's create an indexer for the [tzBTC FA1.2 token contract](https://tzkt.io/KT1

A modern Linux/macOS distribution with Python 3.12 installed is required to run DipDup.

The easiest way to install DipDup as a CLI application [pipx](https://pipx.pypa.io/stable/). We have a convenient wrapper script that installs DipDup for the current user. Run the following command in your terminal:
The recommended way to install DipDup CLI is [pipx](https://pipx.pypa.io/stable/). We also provide a convenient helper script that installs all necessary tools. Run the following command in your terminal:

{{ #include _curl-spell.md }}

Expand Down Expand Up @@ -170,6 +170,6 @@ If you use SQLite, run this query to check the data:
sqlite3 demo_tezos_token.sqlite 'SELECT * FROM holder LIMIT 10'
```

If you run a Compose stack, check open `http://127.0.0.1:8080` in your browser to see the Hasura console (an exposed port may differ). You can use it to explore the database and build GraphQL queries.
If you run a Compose stack, open `http://127.0.0.1:8080` in your browser to see the Hasura console (an exposed port may differ). You can use it to explore the database and build GraphQL queries.

Congratulations! You've just created your first DipDup indexer. Proceed to the Getting Started section to learn more about DipDup configuration and features.
2 changes: 1 addition & 1 deletion docs/1.getting-started/1.installation.md
@@ -52,4 +52,4 @@ pip install -r requirements.txt -e .

## Docker

For Docker installation, please refer to the [Docker](../6.deployment/2.docker.md) page.
For Docker installation, please refer to the [Docker](../6.deployment/1.docker.md) page.
@@ -5,9 +5,40 @@ description: "Hooks are user-defined callbacks called either from the `ctx.fire_

# Hooks

Hooks are user-defined callbacks called either from the `ctx.fire_hook` method or by the job scheduler.
Hooks are user-defined callbacks not linked to any index. There are two types of hooks:

## Definition
- System hooks are called on system-wide events like process restart.
- User hooks are called either with the `ctx.fire_hook` method or by the job scheduler.

## System hooks

Every DipDup project has multiple system hooks; they fire on system-wide events and, like regular hooks, are not linked to any index. Names of those hooks are reserved; you can't use them in config. System hooks are not atomic and can't be fired manually or with a job scheduler.

You can also put SQL scripts in corresponding `sql/on_*` directories to execute them like with regular hooks.
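A minimal illustrative layout for such scripts (file names here are hypothetical, not from this repository):

```
sql/
├── on_restart/
│   └── 00_create_views.sql
└── on_reindex/
    └── 00_composite_pk.sql
```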

### on_restart

This hook executes right before indexing starts. It allows configuring DipDup at runtime based on data from external sources. Datasources are already initialized at execution time and available at `ctx.datasources`. You can, for example, configure logging here or add contracts and indexes at runtime instead of from the static config.

SQL scripts in `sql/on_restart` directory may contain `CREATE OR REPLACE VIEW` or similar non-destructive operations.

### on_reindex

This hook fires after the database is re-initialized after reindexing (wipe), before indexing starts.

Helpful for modifying the schema with arbitrary SQL scripts before indexing. For example, you can change the database schema in ways not supported by the DipDup ORM, e.g., create a composite primary key.
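As a sketch of the kind of DDL an `sql/on_reindex` script could apply — a composite primary key — here executed against an in-memory SQLite purely for illustration (table and column names are hypothetical; real scripts live in `sql/on_reindex/*.sql` and run against the project database):

```python
import sqlite3

# Hypothetical schema tweak: a composite primary key on (pair, level),
# which the ORM layer may not express directly.
DDL = '''
CREATE TABLE trade (
    pair   TEXT    NOT NULL,
    level  INTEGER NOT NULL,
    amount INTEGER NOT NULL,
    PRIMARY KEY (pair, level)
);
'''

conn = sqlite3.connect(':memory:')
conn.executescript(DDL)
conn.execute('INSERT INTO trade VALUES (?, ?, ?)', ('tzBTC/USDt', 1, 100))
```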

### on_synchronized

This hook fires when every active index reaches a realtime state. Here you can clear internal caches or do other cleanup.

### on_index_rollback

Fires when one of the index datasources receives a chain reorg message.

Since version 6.0, this hook performs a database-level rollback by default. If you want to process rollbacks manually, remove the `ctx.rollback` call and implement custom logic in this callback.

## User hooks

Let's assume we want to calculate some statistics on-demand to avoid blocking an indexer with heavy computations. Add the following lines to the DipDup config:

@@ -23,7 +54,7 @@ hooks:
Values of `args` mapping are used as type hints in a signature of a generated callback. The following callback stub will be created on init:

```python [hooks/calculate_stats.py]
from dipdup.context import HookContext


async def calculate_stats(
    ctx: HookContext,
    major: bool,
    depth: int,
) -> None:
    await ctx.execute_sql('calculate_stats')
```

By default, hooks execute SQL scripts from the corresponding subdirectory of `sql/`. Remove or comment out the `ctx.execute_sql` call to prevent it.

## Usage

To trigger the hook, call the `ctx.fire_hook` method from any callback:

```python
await ctx.fire_hook('calculate_stats', major=True, depth=10)
```

## Atomicity
### Atomicity and blocking

The `atomic` option defines whether the hook callback will be wrapped in a single SQL transaction. If set to true, the main indexing loop is blocked until hook execution completes. Some statements, like `REFRESH MATERIALIZED VIEW`, do not need to be wrapped in a transaction, so choosing the `atomic` value wisely can decrease the time needed for initial indexing.
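For instance, a hook that only refreshes a materialized view could be declared non-atomic (hook and callback names below are illustrative):

```yaml
hooks:
  refresh_stats:
    callback: refresh_stats
    atomic: false  # REFRESH MATERIALIZED VIEW needs no wrapping transaction
```

With `atomic: true`, the same hook would block the indexing loop for the duration of its transaction.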

@@ -57,35 +86,7 @@ async def handler(ctx: HandlerContext, ...) -> None:

This hook will be executed when the current transaction is committed.

## System hooks

Every DipDup project has multiple system hooks; they fire on system-wide events and, like regular hooks, are not linked to any index. Names of those hooks are reserved; you can't use them in config. System hooks are not atomic and can't be fired manually or with a job scheduler.

You can also put SQL scripts in corresponding `sql/on_*` directories to execute them like with regular hooks.

### on_restart

This hook executes right before indexing starts. It allows configuring DipDup at runtime based on data from external sources. Datasources are already initialized at execution time and available at `ctx.datasources`. You can, for example, configure logging here or add contracts and indexes at runtime instead of from the static config.

SQL scripts in `sql/on_restart` directory may contain `CREATE OR REPLACE VIEW` or similar non-destructive operations.

### on_reindex

This hook fires after the database is re-initialized after reindexing (wipe), before indexing starts.

Helpful for modifying the schema with arbitrary SQL scripts before indexing. For example, you can change the database schema in ways not supported by the DipDup ORM, e.g., create a composite primary key.

### on_synchronized

This hook fires when every active index reaches a realtime state. Here you can clear internal caches or do other cleanup.

### on_index_rollback

Fires when one of the index datasources receives a chain reorg message.

Since version 6.0, this hook performs a database-level rollback by default. If you want to process rollbacks manually, remove the `ctx.rollback` call and implement custom logic in this callback.

## Arguments typechecking
### Arguments typechecking

DipDup will ensure that arguments passed to the hooks have the correct types when possible. `CallbackTypeError` exception will be raised otherwise. Values of an `args` mapping in a hook config should be either built-in types or `__qualname__` of external type like `decimal.Decimal`. Generic types are not supported: hints like `Optional[int] = None` will be correctly parsed during codegen but ignored on type checking.
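A rough sketch of how such hints can be resolved and checked — this is illustrative, not DipDup's actual implementation — resolving a built-in name or a dotted `__qualname__` string to a type and verifying passed values:

```python
import builtins
import importlib


def resolve_hint(qualname: str) -> type:
    """Resolve 'int' or 'decimal.Decimal' to the corresponding type."""
    module_name, _, attr = qualname.rpartition('.')
    module = importlib.import_module(module_name) if module_name else builtins
    return getattr(module, attr)


def typecheck_args(hints: dict[str, str], kwargs: dict[str, object]) -> None:
    """Raise TypeError if any kwarg doesn't match its declared hint."""
    for name, hint in hints.items():
        if name in kwargs and not isinstance(kwargs[name], resolve_hint(hint)):
            raise TypeError(f'`{name}` must be {hint}')


typecheck_args({'major': 'bool', 'depth': 'int'}, {'major': True, 'depth': 10})
```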

5 changes: 2 additions & 3 deletions docs/1.getting-started/2.core-concepts.md
@@ -19,9 +19,8 @@ The Python package contains ORM models, callbacks, typeclasses, scripts and queries

As a result, you get a service responsible for filling the database with indexed data. Then you can use it to build a custom API backend or integrate with existing ones. DipDup provides _Hasura GraphQL Engine_ integration to expose indexed data via REST and GraphQL with zero configuration, but you can use other API engines like PostgREST or develop one in-house.

<!-- TODO: SVG include doesn't work -->

![Generic DipDup setup and data flow](../assets/dipdup.svg)
<!-- FIXME: Tezos-specific stuff -->
<!-- <center><img src="../../public/dipdup.svg" alt="DipDup data flow diagram" width="600"/></center> -->

## Storage layer
