Skip to content

Latest commit

 

History

History
279 lines (225 loc) · 8.68 KB

06-monitoring.adoc

File metadata and controls

279 lines (225 loc) · 8.68 KB

Logging & Monitoring

Dshackle can log all requests to a file in JSON format. Or JSON Lines to be more precise, i.e., a test file where each line is a JSON.

It includes:

  • Access Log - requests made to Dshackle from a client

  • Request Log - requests made by Dshackle to an upstream

Because some requests may produce multiple responses, or multiple internal requests the log is not related as one-to-one. But instead maintains a reference a request.id in both logs, that can be used to join multiple access and request logs into a single trace.

Access Log

Note
By default, the Access log is disabled.

To enable access log add following configuration:

access-log:
  enabled: true
  filename: /var/log/dshackle/access_log.jsonl

filename is optional, and the default value is access_log.jsonl (i.e., in the current directory).

Or if you use a centralized log processor listening a socket:

access-log:
  enabled: true
  type: socket
  host: 127.0.0.1
  port: 9000

In this case all logs sent to the 127.0.0.1:9000 socket via TCP, and each line is prefixed with a 32 bit length.

Since a single request may contain multiple replies (ex., a batch call, or subscribe to the head blocks) the Dshackle logging is based on replies. The access log file contains details per each response send from the server, and each of them refers to original request details.

The log contains the JSON lines similar to:

{
  "version":"accesslog/v1beta2",
  "id":"578d83db-cf53-4ef8-b73e-3f1cc0a67e96",
  "channel":"DSHACKLE",
  "ts":"2021-07-20T01:53:33.174645Z",
  "method":"NativeCall",
  "blockchain":"ETHEREUM",
  "total":2,
  "index":0,
  "succeed":true,
  "request":{
    "id":"513b9b49-b472-4c83-b4b7-58dd2aabe9f6",
    "start":"2021-07-20T01:53:33.086946Z",
    "remote":{
      "ips":["127.0.0.1", "10.0.5.102", "172.217.8.78"],
      "ip":"172.217.8.78",
      "userAgent":"grpc-node-js/1.1.8"
    }
  },
  "nativeCall":{
     "method":"eth_blockNumber",
     "id":2,
     "payloadSizeBytes":2
  }
}
Where:
  • version version of the JSON format (note it may change until v1, i.e., a version with alpha, beta, etc., is not a final format)

  • id uniq id of the reply

  • channel access channel (DSHACKLE for native Dshackle calls, JSONRPC for JSON RPC HTTP Proxy)

  • ts timestamp of the reply

  • method Dshackle method which was called (i.e., not a Blockchain API method, see nativeCall details)

  • blockchain blockchain code

  • total how many requests in the batch (available only for a NativeCall call)

  • index current index (i.e. count) of the reply to the original request

  • succeed if call succeeded, in terms of Blockchain API

  • request original request details

    • id uniq id of the request; all replied to the same request have same id

    • start when request was received

    • remote remote details

      • ips list of all recognized IPs (including headers such as X-Real-IP and X-Forwarded-For)

      • ip a single ip, that likely represent a real IP of the remote

      • userAgent user agent

  • nativeCall details of the individual Native Call request

    • method method name terms of Blockchain API

    • id request id provided in the original request

    • payloadSizeBytes size of the original individual request (for JSON RPC it’s size of the params value)

Request Log

Note
By default, the request log is disabled.

To enable request log add following configuration:

request-log:
  enabled: true
  filename: /var/log/dshackle/request_log.jsonl

Could use the same config as Access Log for sockets:

request-log:
  enabled: true
  type: socket
  host: 127.0.0.1
  port: 9000

The log contains the JSON lines similar to:

{
  "version":"requestlog/v1alpha",
  "id":"c23a5391-1b8c-40b0-a32f-fa736b8175a2",
  "success":true,
  "upstream":{
    "id":"sepolia",
    "channel":"WSJSONRPC",
    "type":"JSONRPC"
  },
  "request":{
    "source":"INTERNAL",
    "id":"f2ae5096-b7d0-4006-99c5-0108bd295df4",
    "start":"2022-11-09T01:13:25.652663Z"
  },
  "jsonrpc":{
    "method":"eth_getBlockByHash",
    "id":218
  },
  "blockchain":"TESTNET_SEPOLIA",
  "execute":"2022-11-09T01:13:25.540592Z",
  "complete":"2022-11-09T01:13:25.652683Z",
  "responseSize":6243,
  "queueTime":0,
  "requestTime":112
}
Where:
  • version version of the JSON format (note it may change until v1, i.e., a version with alpha, beta, etc., is not a final format)

  • id uniq id of the request

  • success if it returns a result or an error

  • upstream upstream reference

    • id id as defined in upstreams config

    • channel type of the connection (WSJSONRPC for a Websocket JSON RPC, JSONRPC for HTTP JSON RPC, DSHACKLE for Dshackle protocol)

    • type type of the request (DSHACKLE for Dshackle gRPC request, JSONRPC as JSON RPC, and WSSUBSCRIBE Websocket subscription request)

  • request reference to original request

    • source is REQUEST when is caused by external call, or INTERNAL if it’s an internal request such as a health check

    • id for a REQUEST source it’s the same id as in access log in request.id field. For an INTERNAL source it’s just an internal id.

    • start time when the request was initiated (note that the execution time may differ, see below)

  • jsonrpc details about the JSON RPC request (optional)

  • blockchain blockchain associated with the request (optional)

  • execute time when the request started to execute. Some request may consist of multiple additional requests, or a request may wait until other requests are returned. So it may be after the request.start time.

  • complete time when the response was received from upstream.

  • responseSize response payload size in bytes (payload here means it’s not the whole json but only the result part).

  • queueTime - the time between start and execute moments, in milliseconds

  • requestTime - the time between execute and complete moments, in milliseconds

Prometheus Metrics

By default, Dshackle provides Prometheus metrics on http://127.0.0.1:8081/metrics.

To configure the metrics use:

monitoring:
  enabled: true
  jvm: false
  extended: false
  prometheus:
    enabled: true
    bind: 192.168.0.1
    port: 8000
    path: /status/prometheus

Where jvm options enabled monitoring of the JVM internals, such as memory, GC, threads, etc. And extended enables additional metrics for query selectors, etc.

Grafana Dashboard

Simple Grafana dashboard available here

This dashboard contains:

  • Upstreams Availability

  • Upstreams Lag

  • JSON RPC total request / failed requests

  • GRPC total request / failed requests

  • JSON RPC Response time

  • Upstreams Errors

  • JSON RPC upstream conn seconds 50,75,90,99 percentiles

Health Checks

Dshackle provides a http endpoint to check status of the servers. This check is compatible with Kubernetes Liveness and Readiness Probes.

By default, it’s disabled, and you have to set up which blockchain are required to be available to consider Dshackle alive.

Example config:
health:
  port: 8082 # (1)
  host: 127.0.0.1 # (2)
  path: /health # (3)
  blockchains: # (4)
    - chain: ethereum # (5)
      min-available: 2 # (6)
    - chain: bitcoin
      min-available: 1
  1. (optional) port to bind the Health server. Default: 8082

  2. (optional) host to bind the Health server. Default: 127.0.0.1

  3. (optional) path on the server. Default: /health. I.e., http://127.0.0.1:8082/health with default config

  4. list of blockchain to check availability

  5. a Blockchain to check

  6. minimum available (i.e., fully synced) Upstreams for that blockchain

With the config above the server is considered healthy if:

  • Dshackle has connected to at least two valid Ethereum upstreams

  • and at least one valid Bitcoin upstream.

When the server is healthy is responds with OK and 200 as HTTP Status Code. When any of the checks failed, it responds with a short description and 503 as HTTP Status Code.

Example of a response for an unhealthy server that doesn’t have enough upstreams for a Ethereum Classic Blockchain.

ETHEREUM_CLASSIC UNAVAILABLE

Optionally, the server can be called with ?detailed query, which provides a more detailed response:

ETHEREUM_CLASSIC UNAVAILABLE
BITCOIN AVAILABLE
  local-btc-1 OK with lag=0
ETHEREUM AVAILABLE
  local-eth-1 OK with lag=0
  local-eth-2 OK with lag=0