Visualizing Node Metrics

Introduction

Recent versions of Substrate expose metrics, such as how many peers your node is connected to, how much memory your node is using, etc. To visualize these metrics, you can use tools like Prometheus and Grafana. In this tutorial you will learn how to use Grafana and Prometheus to scrape and visualize node metrics.

Note

In the past Substrate exposed a Grafana JSON endpoint directly. This has been replaced with a Prometheus metric endpoint.

A possible architecture could look like:

+-----------+                     +-------------+                                                              +---------+
| Substrate |                     | Prometheus  |                                                              | Grafana |
+-----------+                     +-------------+                                                              +---------+
      |               -----------------\ |                                                                          |
      |               | Every 1 minute |-|                                                                          |
      |               |----------------| |                                                                          |
      |                                  |                                                                          |
      |        GET current metric values |                                                                          |
      |<---------------------------------|                                                                          |
      |                                  |                                                                          |
      | `substrate_peers_count 5`        |                                                                          |
      |--------------------------------->|                                                                          |
      |                                  | --------------------------------------------------------------------\    |
      |                                  |-| Save metric value with corresponding time stamp in local database |    |
      |                                  | |-------------------------------------------------------------------|    |
      |                                  |                                         -------------------------------\ |
      |                                  |                                         | Every time user opens graphs |-|
      |                                  |                                         |------------------------------| |
      |                                  |                                                                          |
      |                                  |       GET values of metric `substrate_peers_count` from time-X to time-Y |
      |                                  |<-------------------------------------------------------------------------|
      |                                  |                                                                          |
      |                                  | `substrate_peers_count (1582023828, 5), (1582023847, 4) [...]`           |
      |                                  |------------------------------------------------------------------------->|
      |                                  |                                                                          |

Reproduce diagram

Go to: https://textart.io/sequence

object Substrate Prometheus Grafana
note left of Prometheus: Every 1 minute
Prometheus->Substrate: GET current metric values
Substrate->Prometheus: `substrate_peers_count 5`
note right of Prometheus: Save metric value with corresponding time stamp in local database
note left of Grafana: Every time user opens graphs
Grafana->Prometheus: GET values of metric `substrate_peers_count` from time-X to time-Y
Prometheus->Grafana: `substrate_peers_count (1582023828, 5), (1582023847, 4) [...]`

What you will be doing

1. Install Prometheus and Grafana

2. Configure Prometheus to scrape your Substrate node

3. Visualize Prometheus Metrics with Grafana

Learning outcomes

Learn how to do a time-series scrape for a Substrate node using Prometheus
Learn how to use Grafana and Prometheus to visualize node metrics

Install Prometheus and Grafana

Install Prometheus here.
Install Grafana here.

Advice

We suggest for testing that you download the compiled bin programs for these apps as opposed to fully installing them or using them in docker. Just download for your architecture, and run it from the working directory that is convenient for you. The preceding links provide instructions for downloading the compiled programs, and this guide assumes you follow those instructions.

Start a Substrate Template Node

Information

Before you continue here, you should complete the create your first substrate chain tutorial. The same substrate version, conventions for directory structure, and bin names are used here. You can of course use your own custom Substrate node instead of the template, just edit the commands shown as needed.

Substrate exposes an endpoint which serves metrics in the Prometheus exposition format available on port 9615. You can change the port with --prometheus-port <PORT> and enable it to be accessed over an interface other than local host with --prometheus-external.

# clear the dev database
./target/release/node-template purge-chain --dev -y
# start the template node  in dev & tmp mode to experiment
# optionally add the `--prometheus-port <PORT>`
# or `--prometheus-external` flags
./target/release/node-template --dev

Configure Prometheus to scrape your Substrate node

In the working directory where you installed Prometheus, you will find a prometheus.yml configuration file. Let's modify this (or create a custom new on) to configure Prometheus to scrape the exposed endpoint by adding it to the targets array. If you modify the default, here is what will be different:

# --snip--

# A scrape configuration containing exactly one endpoint to scrape:
# Here it's Prometheus itself.
scrape_configs:
  # The job name is added as a label `job=<job_name>` to any timeseries scraped from this config.
  - job_name: 'substrate_node'

    # metrics_path defaults to '/metrics'
    # scheme defaults to 'http'.

    # Override the global default and scrape targets from this job every 5 seconds.
    # ** NOTE: you want to have this *LESS THAN* the block time in order to ensure
    # ** that you have a data point for every block!
    scrape_interval: 5s

    static_configs:
      - targets: ['localhost:9615']

Note

You want to have scrape_interval less than the block time in order to ensure that you have a data point for every block!

Now we can start a Prometheus instance with the prometheus.yml config file. Presuming you downloaded the binary, cd into the install directory and run:

# specify a custom config file instead if you made one here:
./prometheus --config.file prometheus.yml

leave this process running.

Check All Prometheus Metrics

In a new terminal, we can do a quick status check on prometheus:

curl localhost:9615/metrics

Which should return a similar output to:

# HELP substrate_block_height Block height info of the chain
# TYPE substrate_block_height gauge
substrate_block_height{status="best"} 7
substrate_block_height{status="finalized"} 4
# HELP substrate_build_info A metric with a constant '1' value labeled by name, version
# TYPE substrate_build_info gauge
substrate_build_info{name="available-vacation-6791",version="2.0.0-4d97032-x86_64-linux-gnu"} 1
# HELP substrate_database_cache_bytes RocksDB cache size in bytes
# TYPE substrate_database_cache_bytes gauge
substrate_database_cache_bytes 0
# HELP substrate_finality_grandpa_precommits_total Total number of GRANDPA precommits cast locally.
# TYPE substrate_finality_grandpa_precommits_total counter
substrate_finality_grandpa_precommits_total 31
# HELP substrate_finality_grandpa_prevotes_total Total number of GRANDPA prevotes cast locally.
# TYPE substrate_finality_grandpa_prevotes_total counter
substrate_finality_grandpa_prevotes_total 31
#
# --snip--
#

Alternatively in a browser open that same URL (http://localhost:9615/metrics) to view all available metric data.

Tip

Here you can see the HELP fields for each metric that is exposed for monitoring via Grafana.

Visualizing Prometheus Metrics with Grafana

Once you have Grafana running, navigate to it in a browser (the default is https://localhost:3000/). Log in (default user admin and password admin) and navigate to the data sources page at localhost:3000/datasources.

You then need to select a Prometheus data source type and specify where Grafana needs to look for it.

Tip

The Prometheus port Grafana needs is NOT the one you set in the prometheus.yml file (https://localhost:9615) for where your node is publishing it's data.

With your substrate node and Prometheus are running, configure Grafana to look for Prometheus on it's default port: https://localhost:9090 (unless you customized it).

Hit Save & Test to ensure that you have the data source set correctly. Now you can configure a new dashboard!

Template Grafana Dashboard

If you would like a basic dashboard to start here is a template example that you can Import in Grafana to get basic information about your node:

Grafana Dashboard

If you create your own, the prometheus docs for Grafana may be helpful.

If you do create one, consider uploading it to the community list of dashboards and letting the Substrate builder community know it exists by listing in on Awesome Substrate! Here is ours on the Grafana public dashboard.

Next Steps

Continue learning more:

Learn how to set up a private Substrate network.
Further configuration, notification services, and permanent installation of Substrate/Polkadot monitoring tools.
Visit the source code for Substrate Prometheus Exporter.
See the docs for Prometheus in Substrate.

Have a look at some node visualization examples:

The Grafana dashboard configuration for the Polkadot network.
Grafana Template for a Substrate node template.