Tutorial
September 17, 2022

HAQQ node monitoring solution for Zabbix

A complete log-file-based Cosmos gaiad monitoring solution adapted for the HAQQ network. It consists of the shell script nodemonitor.sh, which generates log files on the host, and the template zbx__template_nodemonitorgaiad.xml for the Zabbix server (version 4.x or 5.x).

Monitoring in Zabbix server example:

Telegram alert example:

Quick start

1) Install Zabbix Server 5 LTS

2) Install the template on your Zabbix server.

On the host side, the Zabbix agent must be installed and configured for active mode (this is not the default).
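
Active mode is normally enabled with the ServerActive parameter in the agent configuration file, and the agent's Hostname must match the host name configured on the Zabbix server. The sketch below only checks that an uncommented ServerActive line is present. The helper name has_server_active and the path /etc/zabbix/zabbix_agentd.conf are assumptions; the file location can differ between distributions.

```shell
# has_server_active is a hypothetical helper, not part of Zabbix or nodemonitor.sh.
# It succeeds if the given agent config contains an uncommented ServerActive=<value> line.
has_server_active() {
    grep -Eq '^[[:space:]]*ServerActive=[^[:space:]]+' "$1"
}

# Example call (the path is an assumption, adjust it to your installation):
# has_server_active /etc/zabbix/zabbix_agentd.conf && echo "active checks configured"
```

If the check fails, set ServerActive to the address of your Zabbix server and restart the agent.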

3) Install the nodemonitor.sh script on the server that runs the node.

wget -qO /home/nodemonitor.sh https://raw.githubusercontent.com/starnodes/nodemonitorgaiad/master/nodemonitor.sh
chmod +x /home/nodemonitor.sh

4) Change the path in the CONFIG variable of nodemonitor.sh (CONFIG="$HOME/.node/config/config.toml") so that it points to your node's config.toml.
EXAMPLE: CONFIG="$HOME/.haqqd/config/config.toml"

nano /home/nodemonitor.sh
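
If you prefer not to edit the file by hand, the same change can be scripted. This is a minimal sketch that assumes GNU sed and that the script contains a line beginning with CONFIG=, as in the example above. The function name set_config_path is hypothetical.

```shell
# set_config_path is a hypothetical helper, not part of nodemonitor.sh.
# It rewrites every line that starts with CONFIG= in the given script.
# Assumes GNU sed (-i edits in place) and a path without '|' or '&' characters.
set_config_path() {
    local script="$1" new_path="$2"
    # '|' is used as the sed delimiter because the path contains slashes.
    sed -i "s|^CONFIG=.*|CONFIG=\"$new_path\"|" "$script"
}

# Example call. Single quotes keep $HOME literal, so the script expands it at run time:
# set_config_path /home/nodemonitor.sh '$HOME/.haqqd/config/config.toml'
```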

5) Run script:

rm -f /home/nodemonitor-root*
screen -d -m -S nodemonitor /home/nodemonitor.sh
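
To confirm that the script is actually writing its log after start, you can check how recently the log file was modified. The sketch below assumes GNU stat and the log location /home/nodemonitor-root.log used later in this guide. The helper name log_is_fresh and the 120-second default are illustrative choices, not part of nodemonitor.sh.

```shell
# log_is_fresh is a hypothetical helper, not part of nodemonitor.sh.
# It succeeds if the file exists and was modified within the last max_age seconds.
log_is_fresh() {
    local file="$1" max_age="${2:-120}" now mtime
    [ -f "$file" ] || return 1
    now=$(date +%s)
    mtime=$(stat -c %Y "$file")   # GNU stat: modification time as epoch seconds
    [ $(( now - mtime )) -le "$max_age" ]
}

# Example call (path taken from step 7 of this guide):
# log_is_fresh /home/nodemonitor-root.log 120 && tail -n 1 /home/nodemonitor-root.log
```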

6) Stop script:

screen -S nodemonitor -X quit

7) In Zabbix, change the host macro to the location of your log file: /home/nodemonitor-root.log

Concept

nodemonitor.sh generates logs that look like:

2020-04-02 01:15:24+00:00 status=synced blockheight=1557201 tfromnow=10 npeers=13 npersistentpeersoff=0 isvalidator=yes pctprecommits=.95 pcttotcommits=.99
2020-04-02 01:15:54+00:00 status=synced blockheight=1557207 tfromnow=7 npeers=12 npersistentpeersoff=1 isvalidator=yes pctprecommits=1.00 pcttotcommits=1.0
2020-04-02 01:16:25+00:00 status=synced blockheight=1557212 tfromnow=9 npeers=13 npersistentpeersoff=0 isvalidator=yes pctprecommits=1.00 pcttotcommits=1.0

On the Zabbix server there is a log module for analyzing the log data.

The log line entries are:

  • status: one of {scriptstarted | error | catchingup | synced}. 'error' can have various causes; typically the gaiad process is down
  • blockheight: block height from the LCD call
  • tfromnow: time in seconds since blockheight
  • npeers: number of connected peers
  • npersistentpeersoff: number of disconnected persistent peers
  • isvalidator: if validator metrics are enabled, one of {yes | no}
  • pctprecommits: if validator metrics are enabled, percentage of the last n precommits from blockheight, as configured in nodemonitor.sh
  • pcttotcommits: if validator metrics are enabled, percentage of total commits of the validator set at blockheight
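
Because every entry after the timestamp is a space-separated key=value pair, single fields are easy to pull out on the command line when checking the log by hand. The following is a minimal sketch. get_field is a hypothetical helper name, and the sample line is copied from the log excerpt above.

```shell
# get_field is a hypothetical helper, not part of nodemonitor.sh.
# It prints the value of one key=value field from a log line
# and returns non-zero if the key is not present.
get_field() {
    local line="$1" key="$2" token
    # Fields are space separated; the timestamp tokens contain no '=' and are skipped.
    for token in $line; do
        if [ "${token%%=*}" = "$key" ] && [ "$token" != "$key" ]; then
            printf '%s\n' "${token#*=}"
            return 0
        fi
    done
    return 1
}

sample='2020-04-02 01:15:54+00:00 status=synced blockheight=1557207 tfromnow=7 npeers=12 npersistentpeersoff=1 isvalidator=yes pctprecommits=1.00 pcttotcommits=1.0'
get_field "$sample" status   # prints: synced
get_field "$sample" npeers   # prints: 12
```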

Note

To monitor multiple gaiad instances on the same host, clone the Cosmos Gaiad template in the template section of the Zabbix server using the clone function.

@starnodes_ru - our Telegram channel with news and guides on testnets.

@starnodes_chat - the channel where you can ask for help.