This guide provides step-by-step instructions for setting up a fresh PostgreSQL database and importing Mirror Node data into it using the bootstrap.sh script and the bootstrap.env configuration file. The process involves initializing the database, configuring environment variables, and running the import script. The data import is a long-running process, so it is important to ensure it keeps running even if your SSH session is terminated.
- Prerequisites
- Database Initialization and Data Import
- Handling Failed Imports
- Additional Notes
- Troubleshooting
-
PostgreSQL 16 installed and running.
-
Access to a machine where you can run the initialization and import scripts and connect to the PostgreSQL database.
-
Ensure the following tools are installed on your machine:
- psql
- gunzip
- realpath
- flock
- curl
-
Install the Google Cloud SDK, then authenticate:
gcloud auth login
-
A Google Cloud Platform (GCP) account with a valid billing account attached (required for downloading data from a Requester Pays bucket). For detailed instructions on obtaining the necessary GCP information, refer to Hedera's documentation.
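The prerequisite check can be sketched as a small shell helper. This is an illustration only, not part of bootstrap.sh; the function name missing_tools is hypothetical, and it tests only that each command is on PATH, not its version.

```shell
#!/bin/sh
# Hypothetical helper (not part of bootstrap.sh): print the name of every
# given command that cannot be found on PATH. Prints nothing when all of
# the commands are available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || printf '%s\n' "$tool"
  done
}

# Example invocation with the tools named in this guide:
#   missing_tools psql gunzip realpath flock curl gcloud gsutil
```

An empty result means every listed tool was found; any name that is printed still needs to be installed.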
Download the bootstrap.sh script and the bootstrap.env configuration file. The bootstrap.env file comes with default values and must be edited to set your specific configuration.
Steps:
-
Download bootstrap.sh and bootstrap.env, then make the script executable:
curl -O https://raw.githubusercontent.com/hashgraph/hedera-mirror-node/main/hedera-mirror-importer/src/main/resources/db/scripts/bootstrap.sh \
     -O https://raw.githubusercontent.com/hashgraph/hedera-mirror-node/main/hedera-mirror-importer/src/main/resources/db/scripts/bootstrap.env
chmod +x bootstrap.sh
Edit the bootstrap.env file to set your own credentials and passwords for database users during initialization.
Instructions:
-
Set PostgreSQL Environment Variables:
# PostgreSQL environment variables
export PGUSER="postgres"
export PGPASSWORD="your_postgres_password"
export PGDATABASE="postgres"
export PGHOST="127.0.0.1"
export PGPORT="5432"
- Replace your_postgres_password with the password for the PostgreSQL superuser (postgres). PGHOST should be set to the IP address or hostname of your PostgreSQL server.
-
Set the IS_GCP_CLOUD_SQL variable to true if you are using a GCP Cloud SQL database:
# Is the DB a GCP Cloud SQL instance?
export IS_GCP_CLOUD_SQL="true"
- Otherwise, leave it as false.
-
Set Database User Passwords:
# Set DB users' passwords
export GRAPHQL_PASSWORD="SET_PASSWORD"
export GRPC_PASSWORD="SET_PASSWORD"
export IMPORTER_PASSWORD="SET_PASSWORD"
export OWNER_PASSWORD="SET_PASSWORD"
export REST_PASSWORD="SET_PASSWORD"
export REST_JAVA_PASSWORD="SET_PASSWORD"
export ROSETTA_PASSWORD="SET_PASSWORD"
export WEB3_PASSWORD="SET_PASSWORD"
- Replace each SET_PASSWORD with a strong, unique password for the respective database user.
-
Save and Secure the bootstrap.env File:
- After editing, save the file.
- Ensure that the bootstrap.env file is secured and not accessible to unauthorized users, as it contains sensitive information:
chmod 600 bootstrap.env
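Before running the import, it may help to confirm that no default placeholder was left in bootstrap.env. The following is a minimal sketch; check_env_passwords is a hypothetical helper name, and it looks only for the two placeholder strings shown above (SET_PASSWORD and your_postgres_password).

```shell
#!/bin/sh
# Hypothetical helper (not part of bootstrap.sh): succeed only if the given
# env file is readable and contains neither of the placeholder strings
# shown in this guide. Offending lines are printed with their line numbers.
check_env_passwords() {
  env_file="$1"
  if [ ! -r "$env_file" ]; then
    echo "Cannot read $env_file" >&2
    return 2
  fi
  if grep -n -E 'SET_PASSWORD|your_postgres_password' "$env_file"; then
    echo "Placeholder values remain in $env_file" >&2
    return 1
  fi
  return 0
}

# Example invocation:
#   check_env_passwords bootstrap.env
```

A zero exit status only means the placeholders are gone; it does not say anything about password strength.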
The Mirror Node database export data is available in a Google Cloud Storage (GCS) bucket:
- Bucket URL: gs://mirrornode-db-export
Important Notes:
- The bucket is read-only to the public.
- It is configured as Requester Pays, meaning you need a GCP account with a valid billing account attached to download the data. For detailed instructions, refer to Hedera's documentation on GCS.
- You will be billed for the data transfer fees incurred during the download.
Set your default GCP project:
gcloud config set project YOUR_GCP_PROJECT_ID
- Replace YOUR_GCP_PROJECT_ID with your actual GCP project ID.
To see the available versions of the database export, list the contents of the bucket:
gsutil -m ls gs://mirrornode-db-export/
This will display the available version directories.
-
Select the latest available version from the output of the previous command.
- Legacy versions will be removed from the bucket shortly after a newer version's export data becomes available.
-
Ensure Compatibility:
- The mirror node must be initially deployed and started against the same version of the database export.
- Be aware that using mismatched versions may lead to compatibility issues and schema mismatches.
Create a directory to store the data and download all files and subdirectories for the selected version:
mkdir -p /path/to/db_export
gsutil -m cp -r gs://mirrornode-db-export/<VERSION_NUMBER>/* /path/to/db_export/
- Replace /path/to/db_export with your desired directory path.
- Replace <VERSION_NUMBER> with the version you selected (e.g., 0.111.0).
- Ensure all files and subdirectories are downloaded into this single parent directory.
- Note: The -m flag enables parallel downloads to speed up the process.
After downloading the data, it's crucial to ensure version compatibility between the database export and the Mirror Node you're setting up.
Steps:
-
Locate the MIRRORNODE_VERSION File:
- The downloaded data should include a file named MIRRORNODE_VERSION in the root of the /path/to/db_export directory.
-
Check the Mirror Node Version:
cat /path/to/db_export/MIRRORNODE_VERSION
-
Ensure Version Compatibility:
- The version number in the MIRRORNODE_VERSION file should match the name of the directory from which you downloaded the data, and should also be the version of the Mirror Node you are initializing with this export's data.
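The version comparison described above can be sketched as follows. This is illustrative only: check_export_version is a hypothetical helper, and it assumes that MIRRORNODE_VERSION contains just the version string (for example 0.111.0), possibly followed by a newline.

```shell
#!/bin/sh
# Hypothetical helper (not part of bootstrap.sh): compare the contents of
# MIRRORNODE_VERSION in the export directory with the expected version.
# Assumption: the file holds only the version string plus optional whitespace.
check_export_version() {
  export_dir="$1"
  expected="$2"
  version_file="$export_dir/MIRRORNODE_VERSION"
  if [ ! -r "$version_file" ]; then
    echo "Missing $version_file" >&2
    return 2
  fi
  # Strip all whitespace, including a trailing newline or carriage return.
  actual=$(tr -d '[:space:]' < "$version_file")
  if [ "$actual" = "$expected" ]; then
    echo "OK: export version $actual"
  else
    echo "Mismatch: export is '$actual', expected '$expected'" >&2
    return 1
  fi
}

# Example invocation:
#   check_export_version /path/to/db_export 0.111.0
```

If the file turns out to use a different layout, such as a leading "v", the comparison would need to be adjusted accordingly.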
The bootstrap.sh script initializes the database and imports the data. It is designed to be a one-stop solution for setting up your Mirror Node database.
Instructions:
-
Ensure You Have bootstrap.sh and bootstrap.env in the Same Directory:
ls -l bootstrap.*
# Should list bootstrap.sh and bootstrap.env
-
Run the Bootstrap Script Using nohup and Redirect Output to bootstrap.log:
To ensure the script continues running even if your SSH session is terminated, run it in the background with nohup and setsid. In the command below, stdout is discarded because the script writes its own log, and stderr is appended to bootstrap.log:
nohup setsid ./bootstrap.sh 8 /path/to/db_export > /dev/null 2>> bootstrap.log &
- The script handles logging internally to bootstrap.log, and the command above also appends the script's own stderr to that file.
- 8 is the number of CPU cores to use for parallel processing. Adjust this number based on your system's resources.
- /path/to/db_export is the directory where you downloaded the database export data.
- bootstrap.pid stores the PID of the running script for later use.
- Important: The SKIP_DB_INIT flag file is created automatically by the script after a successful database initialization. Do not create or delete it manually during normal operation. Remove it only if you need to force the script to reinitialize the database on the next run:
rm -f SKIP_DB_INIT
-
Verify the Script is Running:
tail -f bootstrap.log
- Monitor the progress and check for any errors.
-
Disconnect Your SSH Session (Optional):
You can safely close your SSH session. The script will continue running in the background.
-
Check the Log File:
tail -f bootstrap.log
- The script logs all activity to bootstrap.log.
- Note that the script processes files in parallel and asynchronously. Activities are logged as they occur, so log entries may appear in an arbitrary order.
-
Check the Tracking File:
cat bootstrap_tracking.txt
- This file tracks the status of each file being imported.
If you need to stop the script before it completes:
-
Gracefully Terminate the Script and All Child Processes:
kill -TERM -- -$(cat bootstrap.pid)
- Sends the SIGTERM signal to the entire process group.
- Allows the script and all its background processes to perform cleanup and exit gracefully.
-
If the Script Doesn't Stop, Force Termination of the Process Group:
kill -KILL -- -$(cat bootstrap.pid)
- Sends the SIGKILL signal to the entire process group.
- Terminates the script immediately, but may leave some background jobs running. Prefer the first method where possible.
Note: Ensure that bootstrap.sh is designed to handle termination signals and clean up its child processes appropriately.
-
Re-run the Bootstrap Script:
nohup setsid ./bootstrap.sh 8 /path/to/db_export > /dev/null 2>> bootstrap.log &
- The script will resume where it left off, skipping files that have already been imported successfully.
- Once the bootstrap process completes without errors, you may start the Mirror Node Importer.
During the import process, the script generates a file named bootstrap_tracking.txt, which logs the status of each file import. Each line in this file contains the path and name of a file, followed by its import status: NOT_STARTED, IN_PROGRESS, IMPORTED, or FAILED_TO_IMPORT.
Example of bootstrap_tracking.txt:
/path/to/db_export/record_file.csv.gz IMPORTED
/path/to/db_export/transaction/transaction_part_1.csv.gz IMPORTED
/path/to/db_export/transaction/transaction_part_2.csv.gz FAILED_TO_IMPORT
/path/to/db_export/account.csv.gz NOT_STARTED
Notes on Data Consistency:
-
Automatic Retry: When you re-run the bootstrap.sh script, it will automatically attempt to import files marked as NOT_STARTED, IN_PROGRESS, or FAILED_TO_IMPORT.
-
Data Integrity: The script ensures that no partial data is committed in case of an import failure.
-
Concurrent Write Safety: The script uses file locking (flock) to safely handle concurrent writes to bootstrap_tracking.txt.
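The tracking file can be summarized with a few lines of shell. The sketch below assumes the layout shown in the example above: one entry per line, with the status as the last whitespace-separated field and a path that contains no spaces. The helper names count_status and list_failed are hypothetical and are not part of bootstrap.sh.

```shell
#!/bin/sh
# Hypothetical helpers (not part of bootstrap.sh) for inspecting
# bootstrap_tracking.txt. Assumption: each line is "<path> <STATUS>",
# with the status as the last whitespace-separated field.

# Print how many entries in the tracking file have the given status.
count_status() {
  awk -v status="$2" '$NF == status { n++ } END { print n + 0 }' "$1"
}

# Print the path of every entry marked FAILED_TO_IMPORT.
list_failed() {
  awk '$NF == "FAILED_TO_IMPORT" { print $1 }' "$1"
}

# Example invocations:
#   count_status bootstrap_tracking.txt IMPORTED
#   list_failed bootstrap_tracking.txt
```

These helpers read the file without taking the flock lock, so a count taken while the import is running is only a snapshot.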
-
System Resources:
- Adjust the number of CPU cores used (8 in the example) based on your system's capabilities.
- Monitor system resources during the import process to ensure optimal performance.
-
Security Considerations:
- Secure your bootstrap.env file and any other files containing sensitive information.
-
Environment Variables:
- Ensure bootstrap.env is in the same directory as bootstrap.sh.
-
Connection Errors:
- Confirm that PGHOST in bootstrap.env is correctly set.
- Ensure that the database server allows connections from your client machine.
- Verify that the database port (PGPORT) is correct and accessible.
-
Import Failures:
- Review bootstrap.log for detailed error messages.
- Check bootstrap_tracking.txt to identify which files failed to import.
- Re-run the bootstrap.sh script to retry importing failed files.
-
Permission Denied Errors:
- Ensure that the user specified in PGUSER has the necessary permissions to create databases and roles.
- Verify that file permissions allow the script to read and write to the necessary directories and files.
-
Environment Variable Issues:
- Double-check that all required variables in bootstrap.env are correctly set and exported.
- Ensure there are no typos or missing variables.
-
Script Does Not Continue After SSH Disconnect:
- Ensure you used nohup when running the script.
- Confirm that the script is running by checking the process list:
ps -p $(cat bootstrap.pid)
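The process check above fails with a confusing error when bootstrap.pid is missing or empty. A slightly more defensive variant might look like the sketch below; bootstrap_status is a hypothetical helper, and it assumes that bootstrap.pid contains a single PID.

```shell
#!/bin/sh
# Hypothetical helper (not part of bootstrap.sh): report whether the
# process recorded in the PID file is still alive.
# Assumption: the PID file contains a single process ID.
bootstrap_status() {
  pid_file="${1:-bootstrap.pid}"
  if [ ! -s "$pid_file" ]; then
    echo "no-pid-file"
    return 1
  fi
  pid=$(cat "$pid_file")
  # kill -0 sends no signal; it only checks that the process exists and
  # that the current user is allowed to signal it.
  if kill -0 "$pid" 2>/dev/null; then
    echo "running"
  else
    echo "not-running"
    return 1
  fi
}

# Example invocation:
#   bootstrap_status bootstrap.pid
```

Because kill -0 also fails when the process belongs to another user, run the check as the same user that started the script.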