Add celo-migrate script

This works by loading the database of a celo node. It then removes all
existing blocks and generates a new genesis block including the existing
state tree.

Migrate to urfave/cli/v2

Update op-chain-ops/cmd/op-migrate/main.go

Co-authored-by: Karl Bartel <[email protected]>

Combine Cel2 migration scripts (#148)

* Initial script to play with celo DB history migration
* Can Read All the headers

Co-authored-by: Alec Schaefer <[email protected]>

* Adds new command to migrate ancients db
* Adds comment
* Adds extension methods for transformation
* Implements Transform CeloBody
* Adds impl that runs steps in a concurrent pipeline
* Adds transformHead, verify hashing works cleanup
* add migration for non-frozen blocks
* copy over entire db and modify in place, works with op-geth at piersy/minimal-data-migration
* remove unecessary copying, cleanup code
* close and reopen DBs
* migrate newdb in place
* saving progress

Co-authored-by: Mariano Cortesi <[email protected]>

* Refactor code to improve database migration process
* better logging
* refactor: inline parMigrateAncientRange
* Remove frozen blocks from nonAncient DB
* check hash matches on nonAncients migration
* clean up branch
  Removes unused code, move code for better separation of concerns.
* decode into new types
* fix transformHeader
* make old freezer not readonly so that .meta files are created
* add configurable memory limit
* add comment about memory
* Added celo-dbmigrate Makefile target
* Added dockerfile for celo-dbmigrate and celo-migrate tools
* Workflow for running cel2-migration-tool
* Update cel2-migration-tool image registry
* update op-geth to point to https://github.com/celo-org/op-geth/commits/piersy/for-use-with-migrated-celo-datadir-use-gas-limit-differentiation-rebased-celo6/
* add celo6 logging
* rename scripts to celo-migrate-state and celo-migrate-blocks
* first pass at combining scripts
* saving progress on testing
* fix lint error, use %w to fmt errors
* add updated state migration input files to testdata
* add ability to run block and state migration seperately or together
* add option for migrating only frozen blocks
* remove old scripts
* minor logging improvements in block migrations
* invert clearNonAncients flag logic --> keepNonAncients, make dry-run flag only apply to state migration
* adds README, improves logging
* fix lint err
* Fix Makefile and Dockerfile
* move createNewDbIfNotExists
* rename keep-non-ancients
* update TODO to add more context and state changes
* Remove channel buffers from ancients migration

Co-authored-by: Valentin Rodygin <[email protected]>

* bump default batch size to 100000
* add back extended usage string
* add info on state migration to README
* remove --state-dry-run flag
* update default batch size to 50k
* Adding building for op images
* Setting our values for image registry and repository
* update README
* fix logging when newAncients > oldAncients
* fix return value when skipping ancients
* skip transforming block bodies that have already been transformed
* misc. fixes to get re-runs with --keep-non-ancients working
* adds TODO
* addresses cosmetic feedback
* add flag for specifying a buffer
* Show progress on rsync
* Update to latest op-geth
* state-migration: Refactor subtask
* state-migration: Use EIP1559 settings from deploy config
  Fixes #135
* state-migration: Enable Fjord hardfork during migration
  Fixes #160
* state-migration: Deterministicly set migration block timestamp
  Fixes #157
  Sets the timestamp to be 5s after the last block.
* state-migration: Set WithdrawalsHash in Cel2 migration block
* fixup! Fix Makefile and Dockerfile
* add note to README about using snapshots for pre-migration
* Set blob gas header fields for transition block
  These are now required to be set since cancun was activated.
* Use InitialBaseFee for pre-gingerbread transitionb
* Fix warnings about capitalized error strings
* Output chain config as marshalled JSON
* state-migration: Handle accounts with existing balance
  Fixes #158
* remove allocs file, add instructions for how to generate allocs file to README, update TODOs

---------

Co-authored-by: Mariano Cortesi <[email protected]>
Co-authored-by: Alec Schaefer <[email protected]>
Co-authored-by: Mariano Cortesi <[email protected]>
Co-authored-by: Javier Cortejoso <[email protected]>
Co-authored-by: Paul Lange <[email protected]>
Co-authored-by: Valentin Rodygin <[email protected]>
Co-authored-by: Piers Powlesland <[email protected]>

Set balance of `CeloDistributionSchedule` contract (#162)

* state-migration: Initialize CeloDistributionSchedule
  Fixes #155
* state-migration: Don't fail when distribution schedule update errors
* Review comments

state-migration: Set ParentBeaconRoot (#176)

This allows header validation to pass during snap sync

state-migration: Set address of distribution schedule (#177)

state-migration: Read total supply directly from state (#182)

* state-migration: Read totalSupply directly from storage
* Added trigger for updated dependencies
* Removen token bindings

---------

Co-authored-by: Javier Cortejoso <[email protected]>

Fix l2 block older than l1 origin error (#184) (#187)

* Revert to using time.Now() for migration block
  Instead of simply adding 5 to the parent block time. We really do need a
  deterministic time for the migration block so that all parties that run
  the migration arrive at the same migration block but the problem is that
  op-geth requires that the L2 migration block (aka l2 origin) occurs after
  the l1 origin (I guess the point where you deploy the bridge contracts to
  the l1). When we migrate a partially synced datadir the block before the
  transition block will be very old, up to 4 years old! So of course it
  occurs before the l1 origin. So a fix just to get things working is to use
  time.Now(), but probably we should make this a configurable parameter.
* add flag to specify timestamp
* Update op-chain-ops/cmd/celo-migrate/main.go

---------

Co-authored-by: piersy <[email protected]>

Migration script fixes (#179)

* Fixed migration for datadirs without ancients
  The script was assuming that ancients would have been migrated and was
  considering the numAncients-1 to be the next block to migrate but when
  numAncients is zero that's a problem. Also remved logic for picking up
  where db migration left of for the level db since it was complicating the
  logic and that process takes a few seconds, which is nothing compared with
  the minutes taken to migrate the ancients.
* Ensure that we set gas limit if migrating at pre-gingerbread point

Fix migration script gap in migrated blocks (#189)

* Fix migration script gap in migrated blocks
  The range of ancient blocks to remove from the non ancients database was
  off by one and resulted in a gap between ancients and non ancients. Also
  corrected some log statements that were off by one.

Add pre-migration command to migration script (#192)

* add pre-migration command, rsync and ancients run in parallel, remove onlyAncients flag
* remove block and state migration sub-commands
* make non ancient migration its own step, add flag to measure time
* add more granular timers
* open db without freezer in state migration, remove clearAll
* fix error
* remove update flag from rsync command, add rsync comments
* delete commented out versions of checkForPrevFullMigration
* remove aliases
* remove clearNonAncients flag
* remove measureTime flag, always log time measurements
* remove logging from help text
* remove db reset
* move scan for extra ancients into pre-migration
* update README
* rename extraAncientNumHashes to strayAncientBlocks

state-migration: Fail if account would be overwritten (#202)

* state-migration: Fail if account would be overwritten
* Review changes
* Review changes 2
* Fail in unclear state
* more changes
* Use whitelist to decide if nonce and state are overwritten

Cosmetic changes to the migration script

- Use more lists for added readability
- Capitalize Alfajores and Celo
- Reorder scripting instructions to fit the actual order or operations
- Use GitHub callouts

migration: Add tests (#217)

* migration: Add tests for state migration
* migration: Fix issues shown by tests
* migration: pass allowlist into state migration
  Allows for easier testing
* migration: Add test with allowlist
* Correct overwrite counter
* Use in memory DB

migration: Add working allowlist for Alfajores (#220)

* migration: Simplify tests
* migration: Add working allowlist for Alfajores

Adapt migration code to changes in StateDB

StateDB.CreateAccount used to copy existing balance, now it does not any more.

migration: Set fields correctly for migration block (#212)

migration: Enable Granite (#226)

Write genesis file in state migration (#219)

* squash of #167
* add writeGenesis
* open old freezer in readonly mode, fix locking error
* remove devAlloc
* Revert "open old freezer in readonly mode, fix locking error"
  This reverts commit e3fddea.
* fix locking error
* fix lint error, check errors, add comment
* remove comment
* filter extra genesis fields
* fix issue with genesis extra data
* update testdata

---------

Co-authored-by: Javier Cortejoso <[email protected]>

migration: Overwrite create2deployer code (#233)

migration: Allow 'createx' preinstall (#238)

The code already exists on Alfajores and matches the one that would be
deployed, therefore we just allow this address.

add migration-block-number flag (#245)

* add migration-block-number flag
* address feedback
* move migration-block-number flag out of state migration options

Fixes for re-running migration script on same destination db (#246)

* add reset flag
* add --checksum to rsync options
celo-org · Oct 15, 2024 · dfd6802
1 parent 6886911
commit dfd6802
Showing 21 changed files with 2,895 additions and 87 deletions.
diff --git a/.github/workflows/docker-build-scan.yaml b/.github/workflows/docker-build-scan.yaml
@@ -1,92 +1,90 @@
 name: Docker Build Scan
 on:
+  pull_request:
+    branches:
+      - 'master'
+      - 'celo*'
   workflow_dispatch:
 
 jobs:
-  Build-Scan-Container-op-ufm:
-    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
-    with:
-      dockerfile: op-ufm/Dockerfile
-
-  Build-Scan-Container-ops-bedrock-l1:
-    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
-    with:
-      dockerfile: ops-bedrock/Dockerfile.l1
-      context: ops-bedrock
-
-  Build-Scan-Container-ops-bedrock-l2:
-    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
-    with:
-      dockerfile: ops-bedrock/Dockerfile.l2
-      context: ops-bedrock
-
-  Build-Scan-Container-indexer:
-    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
-    with:
-      dockerfile: indexer/Dockerfile
-
-  Build-Scan-Container-op-heartbeat:
-    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
-    with:
-      dockerfile: op-heartbeat/Dockerfile
-
-  Build-Scan-Container-op-exporter:
-    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
-    with:
-      dockerfile: op-exporter/Dockerfile
-
-  Build-Scan-Container-op-program:
-    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
-    with:
-      dockerfile: op-program/Dockerfile
-
-  Build-Scan-Container-ops-bedrock:
-    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
-    with:
-      dockerfile: ops-bedrock/Dockerfile.stateviz
-
-  Build-Scan-Container-ci-builder:
-    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
-    with:
-      dockerfile: ops/docker/ci-builder/Dockerfile
-
-  Build-Scan-Container-proxyd:
-    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
-    with:
-      dockerfile: proxyd/Dockerfile
-
-  Build-Scan-Container-op-node:
-    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
-    with:
-      dockerfile: op-node/Dockerfile
-
-  Build-Scan-Container-op-batcher:
-    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
-    with:
-      dockerfile: op-batcher/Dockerfile
-
-  Build-Scan-Container-indexer-ui:
-    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
-    with:
-      dockerfile: indexer/ui/Dockerfile
-
-  Build-Scan-Container-op-proposer:
-    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
-    with:
-      dockerfile: op-proposer/Dockerfile
-
-  Build-Scan-Container-op-challenger:
-    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
-    with:
-      dockerfile: op-challenger/Dockerfile
-
-  Build-Scan-Container-endpoint-monitor:
-    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
-    with:
-      dockerfile: endpoint-monitor/Dockerfile
-
-  Build-Scan-Container-opwheel:
-    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
-    with:
-      dockerfile: op-wheel/Dockerfile
-
+  detect-files-changed:
+    runs-on: ubuntu-latest
+    outputs:
+      files-changed: ${{ steps.detect-files-changed.outputs.all_changed_files }}
+    steps:
+      - uses: actions/checkout@v4
+      - name: Detect files changed
+        id: detect-files-changed
+        uses: tj-actions/changed-files@v44
+        with:
+          separator: ','
+
+  build-cel2-migration-tool:
+    runs-on: ubuntu-latest
+    needs: detect-files-changed
+    if: |
+      contains(needs.detect-files-changed.outputs.files-changed, 'go.sum') ||
+      contains(needs.detect-files-changed.outputs.files-changed, 'op-chain-ops/cmd/celo-migrate') ||
+      contains(needs.detect-files-changed.outputs.files-changed, 'op-chain-ops/Dockerfile')
+    permissions:
+      contents: read
+      id-token: write
+      security-events: write
+    steps:
+      - uses: actions/checkout@v4
+      - name: Login at GCP Artifact Registry
+        uses: celo-org/reusable-workflows/.github/actions/[email protected]
+        with:
+          workload-id-provider: 'projects/1094498259535/locations/global/workloadIdentityPools/gh-optimism/providers/github-by-repos'
+          service-account: '[email protected]'
+          docker-gcp-registries: us-west1-docker.pkg.dev
+      - name: Build and push container
+        uses: celo-org/reusable-workflows/.github/actions/[email protected]
+        with:
+          platforms: linux/amd64
+          registry: us-west1-docker.pkg.dev/devopsre/dev-images/cel2-migration-tool
+          tags: ${{ github.sha }}
+          context: ./
+          dockerfile: ./op-chain-ops/Dockerfile
+          push: true
+          trivy: false
+
+  # Build op-node op-batcher op-proposer using docker-bake
+  build-op-stack:
+    runs-on: ubuntu-latest
+    needs: detect-files-changed
+    if: |
+      contains(needs.detect-files-changed.outputs.files-changed, 'go.sum') ||
+      contains(needs.detect-files-changed.outputs.files-changed, 'ops/docker') ||
+      contains(needs.detect-files-changed.outputs.files-changed, 'op-node/') ||
+      contains(needs.detect-files-changed.outputs.files-changed, 'op-batcher/') ||
+      contains(needs.detect-files-changed.outputs.files-changed, 'op-proposer/') ||
+      contains(needs.detect-files-changed.outputs.files-changed, 'op-service/')
+    permissions:
+      contents: read
+      id-token: write
+      security-events: write
+    env:
+      GIT_COMMIT: ${{ github.sha }}
+      GIT_DATE: ${{ github.event.head_commit.timestamp }}
+      IMAGE_TAGS: ${{ github.sha }},latest
+      REGISTRY: us-west1-docker.pkg.dev
+      REPOSITORY: blockchaintestsglobaltestnet/dev-images
+    steps:
+      - uses: actions/checkout@v4
+      - name: Login at GCP Artifact Registry
+        uses: celo-org/reusable-workflows/.github/actions/[email protected]
+        with:
+          workload-id-provider: 'projects/1094498259535/locations/global/workloadIdentityPools/gh-optimism/providers/github-by-repos'
+          service-account: '[email protected]'
+          docker-gcp-registries: us-west1-docker.pkg.dev
+      # We need custom steps here because this build uses docker bake
+      - name: Set up Docker Buildx
+        uses: docker/setup-buildx-action@v3
+      - name: Build and push
+        uses: docker/bake-action@v5
+        with:
+          push: true
+          source: .
+          files: docker-bake.hcl
+          targets: op-node,op-batcher,op-proposer
diff --git a/.gitignore b/.gitignore
@@ -46,3 +46,6 @@ __pycache__
 
 # Ignore echidna artifacts
 crytic-export
+
+# vscode
+.vscode/
diff --git a/op-chain-ops/Dockerfile b/op-chain-ops/Dockerfile
@@ -0,0 +1,29 @@
+FROM golang:1.21.1-alpine3.18 AS builder
+
+RUN apk --no-cache add make
+
+COPY ./go.mod /app/go.mod
+COPY ./go.sum /app/go.sum
+
+WORKDIR /app
+
+RUN go mod download
+
+COPY ./op-service /app/op-service
+COPY ./op-node /app/op-node
+COPY ./op-plasma /app/op-plasma
+COPY ./op-chain-ops /app/op-chain-ops
+WORKDIR /app/op-chain-ops
+RUN make celo-migrate
+
+FROM alpine:3.18
+RUN apk --no-cache add ca-certificates bash rsync
+
+# RUN addgroup -S app && adduser -S app -G app
+# USER app
+WORKDIR /app
+
+COPY --from=builder /app/op-chain-ops/bin/celo-migrate /app
+ENV PATH="/app:${PATH}"
+
+ENTRYPOINT ["/app/celo-migrate"]
diff --git a/op-chain-ops/Makefile b/op-chain-ops/Makefile
@@ -32,6 +32,9 @@ ecotone-scalar:
 receipt-reference-builder:
 	go build -o ./bin/receipt-reference-builder ./cmd/receipt-reference-builder/*.go
 
+celo-migrate:
+	go build -o ./bin/celo-migrate ./cmd/celo-migrate/*.go
+
 test:
 	go test ./...
 

diff --git a/op-chain-ops/cmd/celo-migrate/README.md b/op-chain-ops/cmd/celo-migrate/README.md
@@ -0,0 +1,136 @@
+# Celo L2 Migration Script
+
+## Overview
+
+This script migrates a Celo L1 database (old datadir) into a new database compatible with Celo L2 (new datadir). It consists of 3 main processes that respectively migrate ancient blocks, non-ancient blocks and state. Migrated data is copied into a new datadir, leaving the old datadir unchanged.
+
+To minimize migration downtime, the script is designed to run in two stages:
+1. The `pre migration` stage can be run ahead of the `full migration` and will process as much of the migration as possible up to that point.
+2. The `full migration` can then be run to finish migrating new blocks that were created after the `pre migration` and apply necessary state changes on top of the migration block.
+
+### Pre migration
+
+The `pre migration` consists of two parts that are run in parallel:
+- Copy and transform the ancient / frozen blocks (i.e. all blocks before the last 90000).
+- Copy over the rest of the database using `rsync`.
+
+The ancients db is migrated sequentially because it is append-only, while the rest of the database is copied and then transformed in-place. We use `rsync` because it has flags for ignoring the ancients directory, skipping any already copied files and deleting any extra files in the new db, ensuring that we can run the script multiple times and only copy over actual updates.
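
The `rsync` behaviour described above can be sketched as an argument list built in Go. This is a minimal illustration, not the script's actual invocation: the helper name `rsyncArgs`, the freezer directory name `ancient`, and the exact flag set are assumptions. The flags themselves (`--archive`, `--delete`, `--exclude`) are standard `rsync` options.

```go
package main

import (
	"fmt"
	"path/filepath"
	"strings"
)

// rsyncArgs builds an rsync argument list that copies everything except the
// ancients directory from oldDB into newDB. The helper name, the directory
// name "ancient", and the chosen flags are assumptions for illustration.
func rsyncArgs(oldDB, newDB string) []string {
	return []string{
		"--archive",          // preserve timestamps so unchanged files are skipped on re-runs
		"--delete",           // delete files in newDB that no longer exist in oldDB
		"--exclude=/ancient", // leading slash anchors the pattern at the transfer root
		filepath.Clean(oldDB) + "/", // trailing slash: copy the directory's contents
		filepath.Clean(newDB) + "/",
	}
}

func main() {
	args := rsyncArgs("./data/alfajores_old", "./data/alfajores_new")
	fmt.Println("rsync " + strings.Join(args, " "))
}
```

Because `rsync` skips files whose size and modification time already match, repeated runs with an argument list like this only transfer what changed in the old db.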
+
+The `pre migration` step is still run during a `full migration` but it will be much quicker as only newly frozen blocks and recent file changes need to be migrated.
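
As a rough sketch of how the two parts of the `pre migration` could run in parallel, the Go snippet below starts each task in its own goroutine and waits for both. The names `runParallel` and `errStandIn` and the placeholder tasks are hypothetical; the real script may coordinate its work differently.

```go
package main

import (
	"errors"
	"fmt"
	"sync"
)

// errStandIn is a placeholder failure used only in this sketch.
var errStandIn = errors.New("stand-in failure")

// runParallel runs all tasks concurrently, waits for every one to finish,
// and returns their errors joined together (nil if all succeeded).
func runParallel(tasks ...func() error) error {
	errs := make([]error, len(tasks))
	var wg sync.WaitGroup
	for i, task := range tasks {
		wg.Add(1)
		go func(i int, task func() error) {
			defer wg.Done()
			errs[i] = task() // each goroutine writes to its own slot
		}(i, task)
	}
	wg.Wait()
	return errors.Join(errs...)
}

func main() {
	migrateAncients := func() error { return nil }  // stand-in for the ancients migration
	copyRest := func() error { return errStandIn } // stand-in for the rsync copy
	if err := runParallel(migrateAncients, copyRest); err != nil {
		fmt.Println("pre migration failed:", err)
	}
}
```

Waiting for both tasks before returning means neither is left half-finished when the other fails, at the cost of not cancelling early.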
+
+### Full migration
+
+During the `full migration`, we re-run the `pre migration` step to capture any updates since the last `pre migration` and then apply in-place changes to non-ancient blocks and state. While this is happening, the script also checks for any stray ancient blocks that have remained in leveldb despite being frozen and removes them from the new db. Non-ancient blocks are then transformed to ensure compatibility with the L2 codebase.
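
The split between ancient and non-ancient blocks is easy to get wrong by one. The sketch below assumes the freezer holds blocks `0` through `numAncients-1`, so the first non-ancient block is number `numAncients`. The function names are hypothetical and this is not the script's code.

```go
package main

import "fmt"

// nonAncientRange returns the first and last block numbers stored outside the
// freezer, assuming the freezer holds blocks 0..numAncients-1 and head is the
// latest block number. ok is false when every block is already frozen.
func nonAncientRange(numAncients, head uint64) (first, last uint64, ok bool) {
	if numAncients > head {
		return 0, 0, false
	}
	return numAncients, head, true
}

// isStrayAncient reports whether a block still present in the key-value db
// has already been frozen and can be removed from the new db.
func isStrayAncient(number, numAncients uint64) bool {
	return number < numAncients
}

func main() {
	first, last, ok := nonAncientRange(100, 150)
	fmt.Println(first, last, ok)
	fmt.Println(isStrayAncient(99, 100), isStrayAncient(100, 100))
}
```

With `numAncients == 0` the range starts at block 0, which avoids computing `numAncients-1` on an empty freezer.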
+
+Finally, after all blocks have been migrated, the script performs a series of modifications to the state db:
+1. First, it deploys the L2 smart contracts by iterating through the genesis allocs passed to the script and setting the nonce, balance, code and storage for each address accordingly, overwriting existing data if necessary.
+2. Finally, these changes are committed to the state db to produce a new state root and create the first Celo L2 block.
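
The alloc step above can be pictured with a simplified in-memory model. The `account` type, the `applyAllocs` function, and the use of plain maps are assumptions made for illustration; the real script works against the state db and 256-bit balances.

```go
package main

import "fmt"

// account is a simplified, hypothetical stand-in for a state account.
type account struct {
	Nonce   uint64
	Balance uint64            // real balances are 256-bit integers
	Code    []byte
	Storage map[string]string // slot -> value, both hex-encoded
}

// applyAllocs writes every alloc into state, replacing any account that
// already exists at the same address, and returns how many were replaced.
func applyAllocs(state, allocs map[string]account) int {
	overwritten := 0
	for addr, alloc := range allocs {
		if _, exists := state[addr]; exists {
			overwritten++
		}
		state[addr] = alloc
	}
	return overwritten
}

func main() {
	state := map[string]account{"0x01": {Nonce: 7, Balance: 100}}
	allocs := map[string]account{
		"0x01": {Nonce: 1, Code: []byte{0x60}},
		"0x02": {Balance: 5},
	}
	fmt.Println("overwritten accounts:", applyAllocs(state, allocs))
}
```

The commit history mentions later changes that fail the migration when an account would be overwritten unless it is on an allowlist; this sketch does not model that check.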
+
+### Notes
+
+> [!TIP]
+> See `--help` for how to run each portion of the script individually, along with other configuration options.
+
+The longest running section of the script is the ancients migration, followed by the `rsync` command. By running these together in a `pre migration` we greatly reduce how long they will take during the `full migration`. Changes made to non-ancient blocks and state during a `full migration` are erased by the next `rsync` command.
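
Since the ancients db is append-only, a batched migration has to write blocks in order. The sketch below shows one way to do that with a three-stage pipeline over unbuffered channels, using block numbers as stand-ins for block data. The names `batch`, `batches`, and `migrate` are hypothetical, and this is not the script's implementation.

```go
package main

import "fmt"

// batch is a half-open range [start, end) of block numbers.
type batch struct{ start, end uint64 }

// batches splits [from, to) into consecutive ranges of at most size blocks.
func batches(from, to, size uint64) []batch {
	if size == 0 {
		size = 1 // guard against a zero step
	}
	var out []batch
	for start := from; start < to; start += size {
		end := start + size
		if end > to {
			end = to
		}
		out = append(out, batch{start, end})
	}
	return out
}

// migrate reads batches, transforms each block, and appends the results in
// order. One goroutine per stage keeps the output sequential.
func migrate(from, to, size uint64, transform func(uint64) uint64) []uint64 {
	read := make(chan batch)
	transformed := make(chan []uint64)
	go func() {
		defer close(read)
		for _, b := range batches(from, to, size) {
			read <- b
		}
	}()
	go func() {
		defer close(transformed)
		for b := range read {
			out := make([]uint64, 0, b.end-b.start)
			for n := b.start; n < b.end; n++ {
				out = append(out, transform(n))
			}
			transformed <- out
		}
	}()
	var written []uint64
	for out := range transformed {
		written = append(written, out...) // stand-in for appending to the new freezer
	}
	return written
}

func main() {
	double := func(n uint64) uint64 { return n * 2 }
	fmt.Println(migrate(0, 10, 3, double))
}
```

Larger batches reduce per-batch overhead but hold more blocks in memory at once, which is the trade-off behind a configurable batch size.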
+
+The script outputs a `rollup-config.json` file that is passed to the sequencer in order to start the L2 network.
+
+### Running the script
+
+> [!NOTE]
+> This script requires `rsync`. Install it first if it is not already available on your system.
+
+From the `op-chain-ops` directory, first build the script by running:
+
+```bash
+make celo-migrate
+```
+
+You can then run the built binary at `./bin/celo-migrate`, or run the script directly with `go run`:
+
+```bash
+go run ./cmd/celo-migrate --help
+```
+
+#### Running with local test setup (Alfajores / Holesky)
+
+To test the script locally, we can migrate an Alfajores database and use Holesky as our L1. The input files needed for this can be found in `./testdata`. The necessary smart contracts have already been deployed on Holesky.
+
+##### Pull down the latest Alfajores database snapshot
+
+```bash
+gcloud alpha storage cp gs://celo-chain-backup/alfajores/chaindata-latest.tar.zst alfajores.tar.zst
+```
+
+Decompress the snapshot and rename the extracted directory:
+
+```bash
+tar --use-compress-program=unzstd -xvf alfajores.tar.zst
+# Make sure the target directory exists before moving the extracted chaindata
+mkdir -p ./data
+mv chaindata ./data/alfajores_old
+```
+
+##### Generate test allocs file
+
+The state migration takes in an allocs file that specifies the L2 state changes to be made during the migration. This file can be generated from the deploy config and L1 contract addresses by running the following from the `contracts-bedrock` directory:
+
+```bash
+CONTRACT_ADDRESSES_PATH=../../op-chain-ops/cmd/celo-migrate/testdata/deployment-l1-dango.json \
+DEPLOY_CONFIG_PATH=../../op-chain-ops/cmd/celo-migrate/testdata/deploy-config-dango.json \
+STATE_DUMP_PATH=../../op-chain-ops/cmd/celo-migrate/testdata/l2-allocs-dango.json \
+forge script ./scripts/L2Genesis.s.sol:L2Genesis \
+--sig 'runWithStateDump()'
+```
+
+This should output the allocs file to `./testdata/l2-allocs-dango.json`. If you encounter difficulties with this and want to just continue testing the script, you can alternatively find the allocs file [here](https://storage.googleapis.com/cel2-rollup-files/alfajores-mvp/l2-allocs.json).
+
+##### Run script with test configuration
+
+```bash
+go run ./cmd/celo-migrate pre \
+--old-db ./data/alfajores_old \
+--new-db ./data/alfajores_new
+```
+
+Running the pre-migration script should take ~5 minutes. This script copies and transforms ancient blocks and, in parallel, copies over all other chaindata without transforming it. It can be re-run multiple times leading up to the full migration, and each re-run should only migrate changes made to the old db since the previous run.
+
+```bash
+go run ./cmd/celo-migrate full \
+--deploy-config ./cmd/celo-migrate/testdata/deploy-config-dango.json \
+--l1-deployments ./cmd/celo-migrate/testdata/deployment-l1-dango.json \
+--l1-rpc https://ethereum-holesky-rpc.publicnode.com  \
+--l2-allocs ./cmd/celo-migrate/testdata/l2-allocs-dango.json \
+--outfile.rollup-config ./cmd/celo-migrate/testdata/rollup-config-dango.json \
+--old-db ./data/alfajores_old \
+--new-db ./data/alfajores_new
+```
+
+Running the full migration script re-runs the pre-migration script once to migrate any new changes to the old db that have occurred since the last pre-migration. It then performs in-place transformations on the non-ancient blocks and performs the state migration as well.
+
+#### Running for Cel2 migration
+
+##### Generate allocs file
+
+You can generate the allocs file needed to run the migration with the following script in `contracts-bedrock`:
+
+```bash
+CONTRACT_ADDRESSES_PATH=<PATH_TO_CONTRACT_ADDRESSES> \
+DEPLOY_CONFIG_PATH=<PATH_TO_MY_DEPLOY_CONFIG> \
+STATE_DUMP_PATH=<PATH_TO_WRITE_L2_ALLOCS> \
+forge script scripts/L2Genesis.s.sol:L2Genesis \
+--sig 'runWithStateDump()'
+```
+
+##### Dry-run / pre-migration
+
+To minimize downtime caused by the migration, node operators can prepare their Cel2 databases by running the pre-migration command a day ahead of the actual migration. This will pre-populate the new database with most of the ancient blocks needed for the final migration and copy over other chaindata without transforming it.
+
+If node operators would like to practice a `full migration` they can do so and reset their databases to the correct state by running another `pre migration` afterward.
+
+> [!IMPORTANT]
+> The pre-migration should be run using a chaindata snapshot, rather than a db that is being used by a node. To avoid network downtime, we recommend that node operators do not stop any nodes in order to perform the pre-migration.
+
+Node operators should inspect their migration logs after the dry-run to ensure the migration completed successfully and direct any questions to the Celo developer community on Discord before the actual migration.
+
+##### Final migration
+
+On the day of the actual Cel2 migration, the `full migration` script can be run using the datadir of a Celo L1 node that has halted on the migration block. Far in advance of the migration, a version of `celo-blockchain` will be distributed where a flag can specify a block to halt on. When the Celo community aligns on a migration block, node operators will start / restart their nodes with this flag specifying the migration block. Their nodes will halt when this block is reached, at which point they will be able to run `full migration` and begin syncing with the Celo L2 network.