Add celo-migrate script
This works by loading the database of a celo
node. It then removes all existing blocks and
generates a new genesis block including the
existing state tree.

Migrate to urfave/cli/v2

Update op-chain-ops/cmd/op-migrate/main.go

Co-authored-by: Karl Bartel <[email protected]>

Combine Cel2 migration scripts (#148)

* Initial script to play with celo DB history migration

* Can Read All the headers

Co-authored-by: Alec Schaefer <[email protected]>

* Adds new command to migrate ancients db

* Adds comment

* Adds extension methods for transformation

* Implements Transform CeloBody

* Adds impl that runs steps in a concurrent pipeline

* Adds transformHead, verify hashing works

cleanup

* add migration for non-frozen blocks

* copy over entire db and modify in place, works with op-geth at piersy/minimal-data-migration

* remove unnecessary copying, cleanup code

* close and reopen DBs

* migrate newdb in place

* saving progress

Co-authored-by: Mariano Cortesi <[email protected]>

* Refactor code to improve database migration process

* better logging

* refactor: inline parMigrateAncientRange

* Remove frozen blocks from nonAncient DB

* check hash matches on nonAncients migration

* clean up branch

Removes unused code, move code for better separation of concerns.

* decode into new types

* fix transformHeader

* make old freezer not readonly so that .meta files are created

* add configurable memory limit

* add comment about memory

* Added celo-dbmigrate Makefile target

* Added dockerfile for celo-dbmigrate and celo-migrate tools

* Workflow for running cel2-migration-tool

* Update cel2-migration-tool image registry

* update op-geth to point to https://github.com/celo-org/op-geth/commits/piersy/for-use-with-migrated-celo-datadir-use-gas-limit-differentiation-rebased-celo6/

* add celo6 logging

* rename scripts to celo-migrate-state and celo-migrate-blocks

* first pass at combining scripts

* saving progress on testing

* fix lint error, use %w to fmt errors

* add updated state migration input files to testdata

* add ability to run block and state migration separately or together

* add option for migrating only frozen blocks

* remove old scripts

* minor logging improvements in block migrations

* invert clearNonAncients flag logic --> keepNonAncients, make dry-run flag only apply to state migration

* adds README, improves logging

* fix lint err

* Fix Makefile and Dockerfile

* move createNewDbIfNotExists

* rename keep-non-ancients

* update TODO to add more context and state changes

* Remove channel buffers from ancients migration

Co-authored-by: Valentin Rodygin <[email protected]>

* bump default batch size to 100000

* add back extended usage string

* add info on state migration to README

* remove --state-dry-run flag

* update default batch size to 50k

* Adding building for op images

* Setting our values for image registry and repository

* update README

* fix logging when newAncients > oldAncients

* fix return value when skipping ancients

* skip transforming block bodies that have already been transformed

* misc. fixes to get re-runs with --keep-non-ancients working

* adds TODO

* addresses cosmetic feedback

* add flag for specifying a buffer

* Show progress on rsync

* Update to latest op-geth

* state-migration: Refactor subtask

* state-migration: Use EIP1559 settings from deploy config

Fixes #135

* state-migration: Enable Fjord hardfork during migration

Fixes #160

* state-migration: Deterministically set migration block timestamp

Fixes #157

Sets the timestamp to be 5s after the last block.

* state-migration: Set WithdrawalsHash in Cel2 migration block

* fixup! Fix Makefile and Dockerfile

* add note to README about using snapshots for pre-migration

* Set blob gas header fields for transition block

These are now required to be set since cancun was activated.

* Use InitialBaseFee for pre-gingerbread transitionb

* Fix warnings about capitalized error strings

* Output chain config as marshalled JSON

* state-migration: Handle accounts with existing balance

Fixes #158

* remove allocs file, add instructions for how to generate allocs file to README, update TODOs

---------

Co-authored-by: Mariano Cortesi <[email protected]>
Co-authored-by: Alec Schaefer <[email protected]>
Co-authored-by: Mariano Cortesi <[email protected]>
Co-authored-by: Javier Cortejoso <[email protected]>
Co-authored-by: Paul Lange <[email protected]>
Co-authored-by: Valentin Rodygin <[email protected]>
Co-authored-by: Piers Powlesland <[email protected]>

Set balance of `CeloDistributionSchedule` contract (#162)

* state-migration: Initialize CeloDistributionSchedule

Fixes #155

* state-migration: Don't fail when distribution schedule update errors

* Review comments

state-migration: Set ParentBeaconRoot (#176)

This allows header validation to pass during snap sync

state-migration: Set address of distribution schedule (#177)

state-migration: Read total supply directly from state (#182)

* state-migration: Read totalSupply directly from storage

* Added trigger for updated dependencies

* Removen token bindings

---------

Co-authored-by: Javier Cortejoso <[email protected]>

Fix l2 block older than l1 origin error (#184) (#187)

* Revert to using time.Now() for migration block

Instead of simply adding 5 to the parent block time.

We really do need a deterministic time for the migration block so that
all parties that run the migration arrive at the same migration block
but the problem is that op-geth requires that the L2 migration block
(aka l2 origin) occurs after the l1 origin (I guess the point where you
deploy the bridge contracts to the l1). When we migrate a partially
synced datadir the block before the transition block will be very old,
up to 4 years old! So of course it occurs before the l1 origin. So a fix
just to get things working is to use time.Now(), but probably we should
make this a configurable parameter.

* add flag to specify timestamp

* Update op-chain-ops/cmd/celo-migrate/main.go

---------

Co-authored-by: piersy <[email protected]>

Migration script fixes (#179)

* Fixed migration for datadirs without ancients

The script was assuming that ancients would have been migrated and was
considering the numAncients-1 to be the next block to migrate but when
numAncients is zero that's a problem.

Also removed logic for picking up where db migration left off for the
level db since it was complicating the logic and that process takes a
few seconds, which is nothing compared with the minutes taken to migrate
the ancients.

* Ensure that we set gas limit if migrating at pre-gingerbread point

Fix migration script gap in migrated blocks (#189)

* Fix migration script gap in migrated blocks

The range of ancient blocks to remove from the non ancients database was
off by one and resulted in a gap between ancients and non ancients.

Also corrected some log statements that were off by one.

Add pre-migration command to migration script (#192)

* add pre-migration command, rsync and ancients run in parallel, remove onlyAncients flag

* remove block and state migration sub-commands

* make non ancient migration its own step, add flag to measure time

* add more granular timers

* open db without freezer in state migration, remove clearAll

* fix error

* remove update flag from rsync command, add rsync comments

* delete commented out versions of checkForPrevFullMigration

* remove aliases

* remove clearNonAncients flag

* remove measureTime flag, always log time measurements

* remove logging from help text

* remove db reset

* move scan for extra ancients into pre-migration

* update README

* rename extraAncientNumHashes to strayAncientBlocks

state-migration: Fail if account would be overwritten (#202)

* state-migration: Fail if account would be overwritten

* Review changes

* Review changes 2

* Fail in unclear state

* more changes

* Use whitelist to decide if nonce and state are overwritten

Cosmetic changes to the migration script

- Use more lists for added readability
- Capitalize Alfajores and Celo
- Reorder scripting instructions to fit the actual order of operations
- Use GitHub callouts

migration: Add tests (#217)

* migration: Add tests for state migration

* migration: Fix issues shown by tests

* migration: pass allowlist into state migration

Allows for easier testing

* migration: Add test with allowlist

* Correct overwrite counter

* Use in memory DB

migration: Add working allowlist for Alfajores (#220)

* migration: Simplify tests

* migration: Add working allowlist for Alfajores

Adapt migration code to changes in StateDB

StateDB.CreateAccount used to copy existing balance, now it does not any
more.

migration: Set fields correctly for migration block (#212)

migration: Enable Granite (#226)

Write genesis file in state migration (#219)

* squash of #167

* add writeGenesis

* open old freezer in readonly mode, fix locking error

* remove devAlloc

* Revert "open old freezer in readonly mode, fix locking error"

This reverts commit e3fddea.

* fix locking error

* fix lint error, check errors, add comment

* remove comment

* filter extra genesis fields

* fix issue with genesis extra data

* update testdata

---------

Co-authored-by: Javier Cortejoso <[email protected]>

migration: Overwrite create2deployer code (#233)

migration: Allow 'createx' preinstall (#238)

The code already exists on Alfajores and matches the one that would be
deployed, therefore we just allow this address.

add migration-block-number flag (#245)

* add migration-block-number flag

* address feedback

* move migration-block-number flag out of state migration options

Fixes for re-running migration script on same destination db  (#246)

* add reset flag

* add --checksum to rsync options
palango authored and alecps committed Oct 15, 2024
1 parent 6886911 commit dfd6802
Showing 21 changed files with 2,895 additions and 87 deletions.
172 changes: 85 additions & 87 deletions .github/workflows/docker-build-scan.yaml
@@ -1,92 +1,90 @@
name: Docker Build Scan
on:
  pull_request:
    branches:
      - 'master'
      - 'celo*'
  workflow_dispatch:

jobs:
  Build-Scan-Container-op-ufm:
    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
    with:
      dockerfile: op-ufm/Dockerfile

  Build-Scan-Container-ops-bedrock-l1:
    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
    with:
      dockerfile: ops-bedrock/Dockerfile.l1
      context: ops-bedrock

  Build-Scan-Container-ops-bedrock-l2:
    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
    with:
      dockerfile: ops-bedrock/Dockerfile.l2
      context: ops-bedrock

  Build-Scan-Container-indexer:
    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
    with:
      dockerfile: indexer/Dockerfile

  Build-Scan-Container-op-heartbeat:
    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
    with:
      dockerfile: op-heartbeat/Dockerfile

  Build-Scan-Container-op-exporter:
    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
    with:
      dockerfile: op-exporter/Dockerfile

  Build-Scan-Container-op-program:
    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
    with:
      dockerfile: op-program/Dockerfile

  Build-Scan-Container-ops-bedrock:
    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
    with:
      dockerfile: ops-bedrock/Dockerfile.stateviz

  Build-Scan-Container-ci-builder:
    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
    with:
      dockerfile: ops/docker/ci-builder/Dockerfile

  Build-Scan-Container-proxyd:
    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
    with:
      dockerfile: proxyd/Dockerfile

  Build-Scan-Container-op-node:
    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
    with:
      dockerfile: op-node/Dockerfile

  Build-Scan-Container-op-batcher:
    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
    with:
      dockerfile: op-batcher/Dockerfile

  Build-Scan-Container-indexer-ui:
    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
    with:
      dockerfile: indexer/ui/Dockerfile

  Build-Scan-Container-op-proposer:
    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
    with:
      dockerfile: op-proposer/Dockerfile

  Build-Scan-Container-op-challenger:
    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
    with:
      dockerfile: op-challenger/Dockerfile

  Build-Scan-Container-endpoint-monitor:
    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
    with:
      dockerfile: endpoint-monitor/Dockerfile

  Build-Scan-Container-opwheel:
    uses: celo-org/reusable-workflows/.github/workflows/[email protected]
    with:
      dockerfile: op-wheel/Dockerfile

  detect-files-changed:
    runs-on: ubuntu-latest
    outputs:
      files-changed: ${{ steps.detect-files-changed.outputs.all_changed_files }}
    steps:
      - uses: actions/checkout@v4
      - name: Detect files changed
        id: detect-files-changed
        uses: tj-actions/changed-files@v44
        with:
          separator: ','

  build-cel2-migration-tool:
    runs-on: ubuntu-latest
    needs: detect-files-changed
    if: |
      contains(needs.detect-files-changed.outputs.files-changed, 'go.sum') ||
      contains(needs.detect-files-changed.outputs.files-changed, 'op-chain-ops/cmd/celo-migrate') ||
      contains(needs.detect-files-changed.outputs.files-changed, 'op-chain-ops/Dockerfile')
    permissions:
      contents: read
      id-token: write
      security-events: write
    steps:
      - uses: actions/checkout@v4
      - name: Login at GCP Artifact Registry
        uses: celo-org/reusable-workflows/.github/actions/[email protected]
        with:
          workload-id-provider: 'projects/1094498259535/locations/global/workloadIdentityPools/gh-optimism/providers/github-by-repos'
          service-account: '[email protected]'
          docker-gcp-registries: us-west1-docker.pkg.dev
      - name: Build and push container
        uses: celo-org/reusable-workflows/.github/actions/[email protected]
        with:
          platforms: linux/amd64
          registry: us-west1-docker.pkg.dev/devopsre/dev-images/cel2-migration-tool
          tags: ${{ github.sha }}
          context: ./
          dockerfile: ./op-chain-ops/Dockerfile
          push: true
          trivy: false

  # Build op-node op-batcher op-proposer using docker-bake
  build-op-stack:
    runs-on: ubuntu-latest
    needs: detect-files-changed
    if: |
      contains(needs.detect-files-changed.outputs.files-changed, 'go.sum') ||
      contains(needs.detect-files-changed.outputs.files-changed, 'ops/docker') ||
      contains(needs.detect-files-changed.outputs.files-changed, 'op-node/') ||
      contains(needs.detect-files-changed.outputs.files-changed, 'op-batcher/') ||
      contains(needs.detect-files-changed.outputs.files-changed, 'op-proposer/') ||
      contains(needs.detect-files-changed.outputs.files-changed, 'op-service/')
    permissions:
      contents: read
      id-token: write
      security-events: write
    env:
      GIT_COMMIT: ${{ github.sha }}
      GIT_DATE: ${{ github.event.head_commit.timestamp }}
      IMAGE_TAGS: ${{ github.sha }},latest
      REGISTRY: us-west1-docker.pkg.dev
      REPOSITORY: blockchaintestsglobaltestnet/dev-images
    steps:
      - uses: actions/checkout@v4
      - name: Login at GCP Artifact Registry
        uses: celo-org/reusable-workflows/.github/actions/[email protected]
        with:
          workload-id-provider: 'projects/1094498259535/locations/global/workloadIdentityPools/gh-optimism/providers/github-by-repos'
          service-account: '[email protected]'
          docker-gcp-registries: us-west1-docker.pkg.dev
      # We need a custom steps as it's using docker bake
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3
      - name: Build and push
        uses: docker/bake-action@v5
        with:
          push: true
          source: .
          files: docker-bake.hcl
          targets: op-node,op-batcher,op-proposer
3 changes: 3 additions & 0 deletions .gitignore
@@ -46,3 +46,6 @@ __pycache__

# Ignore echidna artifacts
crytic-export

# vscode
.vscode/
29 changes: 29 additions & 0 deletions op-chain-ops/Dockerfile
@@ -0,0 +1,29 @@
FROM golang:1.21.1-alpine3.18 as builder

RUN apk --no-cache add make

COPY ./go.mod /app/go.mod
COPY ./go.sum /app/go.sum

WORKDIR /app

RUN go mod download

COPY ./op-service /app/op-service
COPY ./op-node /app/op-node
COPY ./op-plasma /app/op-plasma
COPY ./op-chain-ops /app/op-chain-ops
WORKDIR /app/op-chain-ops
RUN make celo-migrate

FROM alpine:3.18
RUN apk --no-cache add ca-certificates bash rsync

# RUN addgroup -S app && adduser -S app -G app
# USER app
WORKDIR /app

COPY --from=builder /app/op-chain-ops/bin/celo-migrate /app
ENV PATH="/app:${PATH}"

ENTRYPOINT ["/app/celo-migrate"]
3 changes: 3 additions & 0 deletions op-chain-ops/Makefile
@@ -32,6 +32,9 @@ ecotone-scalar:
receipt-reference-builder:
	go build -o ./bin/receipt-reference-builder ./cmd/receipt-reference-builder/*.go

celo-migrate:
	go build -o ./bin/celo-migrate ./cmd/celo-migrate/*.go

test:
	go test ./...

136 changes: 136 additions & 0 deletions op-chain-ops/cmd/celo-migrate/README.md
@@ -0,0 +1,136 @@
# Celo L2 Migration Script

## Overview

This script migrates a Celo L1 database (old datadir) into a new database compatible with Celo L2 (new datadir). It consists of three main processes that migrate ancient blocks, non-ancient blocks, and state, respectively. Migrated data is copied into a new datadir, leaving the old datadir unchanged.

To minimize migration downtime, the script is designed to run in two stages:
1. The `pre migration` stage can be run ahead of the `full migration` and will process as much of the migration as possible up to that point.
2. The `full migration` can then be run to finish migrating new blocks that were created after the `pre migration` and apply necessary state changes on top of the migration block.

### Pre migration

The `pre migration` consists of two parts that are run in parallel:
- Copy and transform the ancient / frozen blocks (i.e. all blocks before the last 90000).
- Copy over the rest of the database using `rsync`.

The ancients db is migrated sequentially because it is append-only, while the rest of the database is copied and then transformed in-place. We use `rsync` because it has flags for ignoring the ancients directory, skipping any already copied files and deleting any extra files in the new db, ensuring that we can run the script multiple times and only copy over actual updates.
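
As a rough sketch of the copy step described above, an `rsync` invocation along the following lines would skip the ancients directory, leave identical files alone, and delete files that no longer exist in the source. The exact options the script passes are not shown in this README, so the flag set, the `ancient` directory name, and the helper `build_rsync_cmd` are assumptions for illustration only. The command is printed rather than executed so it can be reviewed first.

```bash
#!/usr/bin/env bash
set -euo pipefail

# Hypothetical helper (not part of celo-migrate): prints an rsync command
# that copies OLD_DB into NEW_DB. The flag set is illustrative only.
build_rsync_cmd() {
  local old_db="$1" new_db="$2"
  # -a          archive mode (recursive, preserves times and permissions)
  # --exclude   skip the ancients directory (assumed to be named "ancient")
  # --delete    remove destination files that are gone from the source
  # --checksum  compare file contents rather than size and mtime
  printf 'rsync -a --exclude=ancient --delete --checksum %s/ %s/\n' "$old_db" "$new_db"
}

build_rsync_cmd ./data/alfajores_old ./data/alfajores_new
```

The trailing slashes matter: with `rsync`, `old/` copies the contents of the directory, whereas `old` would copy the directory itself into the destination.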

The `pre migration` step is still run during a `full migration` but it will be much quicker as only newly frozen blocks and recent file changes need to be migrated.

### Full migration

During the `full migration`, we re-run the `pre migration` step to capture any updates since the last `pre migration` and then apply in-place changes to non-ancient blocks and state. While this is happening, the script also checks for any stray ancient blocks that have remained in leveldb despite being frozen and removes them from the new db. Non-ancient blocks are then transformed to ensure compatibility with the L2 codebase.

Finally after all blocks have been migrated, the script performs a series of modifications to the state db:
1. First, it deploys the L2 smart contracts by iterating through the genesis allocs passed to the script and setting the nonce, balance, code and storage for each address accordingly, overwriting existing data if necessary.
2. Finally, these changes are committed to the state db to produce a new state root and create the first Celo L2 block.

### Notes

> [!TIP]
> See `--help` for how to run each portion of the script individually, along with other configuration options.

The longest running section of the script is the ancients migration, followed by the `rsync` command. By running these together in a `pre migration` we greatly reduce how long they will take during the `full migration`. Changes made to non-ancient blocks and state during a `full migration` are erased by the next `rsync` command.

The script outputs a `rollup-config.json` file that is passed to the sequencer in order to start the L2 network.

### Running the script

> [!NOTE]
> This script requires `rsync`. Install it first if it is not already available on your system.

From the `op-chain-ops` directory, first build the script by running:

```bash
make celo-migrate
```

You can then run the script as follows:

```bash
go run ./cmd/celo-migrate --help
```
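
Because the migration depends on `rsync`, a quick preflight check can catch a missing binary before a long run starts. The helper below, `require_cmd`, is a hypothetical convenience and not part of the script; it is only a sketch of one way to check.

```bash
#!/usr/bin/env bash

# Hypothetical preflight helper (not part of celo-migrate):
# returns 0 if the named command is on PATH, 1 otherwise.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    return 0
  fi
  echo "missing required command: $1" >&2
  return 1
}

# Report on rsync without aborting the shell if it is absent.
if require_cmd rsync; then
  echo "rsync found"
else
  echo "install rsync before running celo-migrate"
fi
```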

#### Running with local test setup (Alfajores / Holesky)

To test the script locally, we can migrate an Alfajores database and use Holesky as our L1. The input files needed for this can be found in `./testdata`. The necessary smart contracts have already been deployed on Holesky.

##### Pull down the latest Alfajores database snapshot

```bash
gcloud alpha storage cp gs://celo-chain-backup/alfajores/chaindata-latest.tar.zst alfajores.tar.zst
```

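Decompress the snapshot and rename the extracted directory:

```bash
tar --use-compress-program=unzstd -xvf alfajores.tar.zst
# Create the target directory first so that the move does not fail
mkdir -p ./data
mv chaindata ./data/alfajores_old
```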

##### Generate test allocs file

The state migration takes in an allocs file that specifies the L2 state changes to be made during the migration. This file can be generated from the deploy config and L1 contract addresses by running the following from the `contracts-bedrock` directory:

```bash
CONTRACT_ADDRESSES_PATH=../../op-chain-ops/cmd/celo-migrate/testdata/deployment-l1-dango.json \
DEPLOY_CONFIG_PATH=../../op-chain-ops/cmd/celo-migrate/testdata/deploy-config-dango.json \
STATE_DUMP_PATH=../../op-chain-ops/cmd/celo-migrate/testdata/l2-allocs-dango.json \
forge script ./scripts/L2Genesis.s.sol:L2Genesis \
--sig 'runWithStateDump()'
```

This should output the allocs file to `./testdata/l2-allocs-dango.json`. If you encounter difficulties with this and want to just continue testing the script, you can alternatively find the allocs file [here](https://storage.googleapis.com/cel2-rollup-files/alfajores-mvp/l2-allocs.json).

##### Run script with test configuration

```bash
go run ./cmd/celo-migrate pre \
--old-db ./data/alfajores_old \
--new-db ./data/alfajores_new
```

Running the pre-migration script should take ~5 minutes. This script copies and transforms ancient blocks and, in parallel, copies over all other chaindata without transforming it. This can be re-run multiple times leading up to the full migration, and should only migrate updates to the old db between re-runs.

```bash
go run ./cmd/celo-migrate full \
--deploy-config ./cmd/celo-migrate/testdata/deploy-config-dango.json \
--l1-deployments ./cmd/celo-migrate/testdata/deployment-l1-dango.json \
--l1-rpc https://ethereum-holesky-rpc.publicnode.com \
--l2-allocs ./cmd/celo-migrate/testdata/l2-allocs-dango.json \
--outfile.rollup-config ./cmd/celo-migrate/testdata/rollup-config-dango.json \
--old-db ./data/alfajores_old \
--new-db ./data/alfajores_new
```

Running the full migration script re-runs the pre-migration script once to migrate any new changes to the old db that have occurred since the last pre-migration. It then performs in-place transformations on the non-ancient blocks and performs the state migration as well.

#### Running for Cel2 migration

##### Generate allocs file

You can generate the allocs file needed to run the migration by running the following from the `contracts-bedrock` directory:

```bash
CONTRACT_ADDRESSES_PATH=<PATH_TO_CONTRACT_ADDRESSES> \
DEPLOY_CONFIG_PATH=<PATH_TO_MY_DEPLOY_CONFIG> \
STATE_DUMP_PATH=<PATH_TO_WRITE_L2_ALLOCS> \
forge script scripts/L2Genesis.s.sol:L2Genesis \
--sig 'runWithStateDump()'
```

##### Dry-run / pre-migration

To minimize downtime caused by the migration, node operators can prepare their Cel2 databases by running the pre-migration command a day ahead of the actual migration. This will pre-populate the new database with most of the ancient blocks needed for the final migration and copy over other chaindata without transforming it.

If node operators would like to practice a `full migration` they can do so and reset their databases to the correct state by running another `pre migration` afterward.

> [!IMPORTANT]
> The pre-migration should be run using a chaindata snapshot, rather than a db that is being used by a node. To avoid network downtime, we recommend that node operators do not stop any nodes in order to perform the pre-migration.

Node operators should inspect their migration logs after the dry-run to ensure the migration completed successfully and direct any questions to the Celo developer community on Discord before the actual migration.
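
One simple way to make the logs easier to inspect is to save them to a file and scan for error lines afterward. The sketch below assumes that problems appear in the output as lines containing the word "error"; the actual log format is not documented here, so the pattern, the log file name, and the helper `count_error_lines` are all hypothetical.

```bash
#!/usr/bin/env bash

# Hypothetical helper (not part of celo-migrate): prints the number of
# lines in a saved log that mention "error", case-insensitively.
# grep exits non-zero when nothing matches, so "|| true" keeps the
# helper from failing on a clean log.
count_error_lines() {
  grep -ci 'error' "$1" || true
}

# Example usage (paths are placeholders). With "set -o pipefail" the
# pipeline also fails if the migration command itself fails:
#   set -o pipefail
#   go run ./cmd/celo-migrate pre \
#     --old-db ./data/alfajores_old \
#     --new-db ./data/alfajores_new 2>&1 | tee pre-migration.log
#   count_error_lines pre-migration.log
```

A count of zero is not proof of success on its own, so the end of the log should still be read to confirm that the run finished.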

##### Final migration

On the day of the actual Cel2 migration, the `full migration` script can be run using the datadir of a Celo L1 node that has halted on the migration block. Far in advance of the migration, a version of `celo-blockchain` will be distributed where a flag can specify a block to halt on. When the Celo community aligns on a migration block, node operators will start / restart their nodes with this flag specifying the migration block. Their nodes will halt when this block is reached, at which point they will be able to run `full migration` and begin syncing with the Celo L2 network.