Internal Validator - sequencer and stake manager error #2208

Open
BlockgenStudio opened this issue Nov 5, 2024 · 0 comments

Description:

For the past three weeks, we have been experiencing recurring errors on one of our validators, validator-002, in our production environment. We identified three specific errors in the logs, detailed below:

Error Logs:

  1. Panic Error
    panic: runtime error: slice bounds out of range [:32] with capacity 0

    • Frequency: ~10 occurrences per week

  2. Failed to Run Sequence Error
    failed to run sequence - validator manager init: height=17657995 error="getting voting power failed - backend is not initialized for height 17657995, fsm height 17657994"

    • Frequency: ~20 occurrences per week

  3. Post Block in Stake Manager Error
    polygon.server.polybft.consensus_runtime: failed to post block in stake manager: err="not found"

    • Frequency: Appears on every block sequence

These logs are from validator-002 for the time period from October 28th to November 4th.
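
For anyone trying to quantify the same errors on their own nodes, here is a minimal Python sketch for tallying the three messages from exported log lines and extracting the height gap from the "backend is not initialized" message. The function names (`tally`, `height_gaps`) and the `sample` list are hypothetical and only for illustration; the matched substrings are copied from the log messages quoted above, and real log lines may carry extra prefixes such as timestamps.

```python
import re
from collections import Counter

# Substrings copied from the three log messages quoted in this issue.
SIGNATURES = {
    "panic": "slice bounds out of range",
    "run_sequence": "failed to run sequence",
    "stake_manager": "failed to post block in stake manager",
}

# Captures the requested height and the FSM height from the second error.
HEIGHT_RE = re.compile(
    r"backend is not initialized for height (\d+), fsm height (\d+)"
)


def tally(lines):
    """Count how many lines contain each error signature."""
    counts = Counter()
    for line in lines:
        for name, needle in SIGNATURES.items():
            if needle in line:
                counts[name] += 1
    return counts


def height_gaps(lines):
    """Return (requested height - fsm height) for each matching line."""
    gaps = []
    for line in lines:
        match = HEIGHT_RE.search(line)
        if match:
            gaps.append(int(match.group(1)) - int(match.group(2)))
    return gaps


# Hypothetical sample built from the messages quoted above.
sample = [
    "panic: runtime error: slice bounds out of range [:32] with capacity 0",
    'failed to run sequence - validator manager init: height=17657995 '
    'error="getting voting power failed - backend is not initialized '
    'for height 17657995, fsm height 17657994"',
    "polygon.server.polybft.consensus_runtime: failed to post block in "
    'stake manager: err="not found"',
]

if __name__ == "__main__":
    print(dict(tally(sample)))
    print(height_gaps(sample))
```

In the message quoted above, the requested height (17657995) is one block ahead of the FSM height (17657994); checking whether that gap is always 1 across occurrences may help narrow down the cause.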

How to Reproduce the Issue:
Below are the setup and resource details used to set up our Polygon Supernet, along with relevant environment details:

Infrastructure Setup:

  • Total Nodes: 7 Validators, 3 Non-validators on an internal network
  • Validator Configuration:
    • 5 Validators in a private subnet (genesis validators)
    • 1 Validator in a public subnet
    • 1 External Validator hosted outside the VPC (connected via an RPC from a publicly exposed RPC node to an internal genesis validator)
  • Non-Validator Configuration:
    • 2 Non-validators connected to a load balancer, used as RPC nodes
    • 1 Non-validator connected to a block explorer

Resource Details:

  • Validator Instance Type: c6i.large
  • Non-validator Instance Type (RPC nodes): c6i.xlarge
  • Non-validator Instance Type (Block explorer): c6i.2xlarge
  • Operating System: Ubuntu 22.04 LTS
  • Polygon Edge Version: v1.0.0

Impact and Urgency:
These errors are impacting our production environment. The "failed to run sequence" error occurs roughly 20 times a week and may affect validator stability. The "failed to post block in stake manager" error appears on every block sequence, which is a significant operational concern.

Request for Assistance:
Could you provide any guidance on troubleshooting or potential fixes for these issues? If additional logs or specific configurations are needed, please let us know.
Thank you for your assistance!
