Performance Monitoring
Monitoring TON server performance
Tools like htop, iotop, iftop, dstat, nmon and others are good for measuring real-time performance, but they fall short when it comes to troubleshooting past performance.
This guide recommends and explains how to use the Linux sar (System Activity Report) utility for TON server performance monitoring.
This guide helps you identify whether your server is running short of resources, not whether validator-engine itself is performing badly.
Installation
SAR Installation
sudo apt-get install sysstat
Enable automatic statistics gathering
sudo sed -i 's/false/true/g' /etc/default/sysstat
Enable the service
sudo systemctl enable sysstat sysstat-collect.timer sysstat-summary.timer
Start the service
sudo systemctl start sysstat sysstat-collect.timer sysstat-summary.timer
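After the steps above, it can be useful to confirm that collection is actually switched on. The following is a minimal sketch, assuming the Debian/Ubuntu layout where /etc/default/sysstat contains an ENABLED="true" line; the function name sysstat_enabled is hypothetical and only serves this example.

```shell
# Hypothetical helper: succeeds if collection is enabled in the defaults file.
# Assumes the Debian/Ubuntu format, i.e. a line such as ENABLED="true".
# An alternative file path can be passed as the first argument.
sysstat_enabled() {
  grep -Eq '^[[:space:]]*ENABLED="?true"?' "${1:-/etc/default/sysstat}"
}

# Usage on the server:
#   sysstat_enabled && echo "collection enabled"
#   systemctl is-active sysstat-collect.timer   # should print "active"
#   systemctl list-timers 'sysstat-*'           # shows the next scheduled run
```

If the timer is active, the first data points should show up in the plain sar output after the next collection interval.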
Usage
By default, sar gathers statistics every 10 minutes and shows statistics for the current day, starting at midnight. You can verify this by running sar without parameters:
sar
If you want to see statistics of the previous day or two days before, pass the number as an option:
sar -1 # previous day
sar -2 # two days ago
For an exact date, use the -f option to point to the sa file of the given day of the month. For example, for September 23rd it would be:
sar -f /var/log/sysstat/sa23
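The file name can also be derived from a date instead of typed by hand. This is a small sketch that assumes GNU date and the /var/log/sysstat directory used above; the helper name sa_file_for is hypothetical.

```shell
# Hypothetical helper: print the sa file path for a given date.
# Relies on GNU date (-d) and on the /var/log/sysstat/saDD naming used above.
sa_file_for() {
  date -d "$1" '+/var/log/sysstat/sa%d'
}

# Usage on the server:
#   sar -f "$(sa_file_for 2024-09-23)"
#   sar -f "$(sa_file_for '3 days ago')"
```

Because the files are named by day of the month only, this sketch only makes sense for dates that are still within the retention period configured for sysstat.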
Which sar reports to run and how to read them to identify performance issues
Below is the list of sar commands that can be used to gather different system statistics. You can supplement them with the above options to quickly get the reports for the required date.
Memory report
sar -rh
Because the TON validator-engine uses jemalloc, it caches a lot of data. This is why sar -rh usually reports a low value in the %memused column and a consistently high value in the kbcached column. For the same reason, a low amount of free RAM in the kbmemfree column is not a cause for concern. The important indicator is the %memused column.
If it goes above 90%, consider adding more RAM and check whether your validator engine is stopping abnormally due to OOM (out of memory). The best way to check is to grep the /var/ton-work/log file for Signal messages.
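The 90% rule can be checked automatically. The sketch below reads sar -r output (without -h, so the values stay plain numbers) and extracts the daily average of %memused. The function names memused_avg and mem_over are hypothetical, and the column is located by its header name because the exact column layout differs between sysstat versions.

```shell
# Hypothetical helper: print the "Average:" value of %memused from
# `sar -r` output read on stdin. The column is found by header name and
# counted from the right, so the width of the timestamp does not matter.
memused_avg() {
  awk '
    /%memused/ {
      for (i = 1; i <= NF; i++) if ($i == "%memused") off = NF - i
      found = 1
      next
    }
    /^Average:/ && found { print $(NF - off) }
  '
}

# Hypothetical helper: succeeds if value $1 is above the limit $2 (default 90).
mem_over() {
  awk -v v="$1" -v lim="${2:-90}" 'BEGIN { exit !(v + 0 > lim + 0) }'
}

# Usage on the server (LC_ALL=C keeps a decimal point in the numbers):
#   avg=$(LC_ALL=C sar -r | memused_avg)
#   mem_over "$avg" && echo "average %memused is $avg, consider adding RAM"
```

The same pipeline can be combined with the date options shown earlier, for example LC_ALL=C sar -r -1 for the previous day.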
Swap usage
sar -Sh
If you notice that swap is being used, you should consider adding more RAM. The general recommendation from the TON Core team is to keep swap disabled.
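To see whether any swap area is configured at all, the kernel's /proc/swaps table can be inspected. This is a minimal sketch; the function name swap_in_use is hypothetical, and the file argument exists only so the check can be tried against a sample file.

```shell
# Hypothetical helper: succeeds if at least one swap area is active.
# /proc/swaps starts with one header line; every further non-empty line
# describes an active swap area.
swap_in_use() {
  awk 'NR > 1 && NF { n++ } END { exit !(n > 0) }' "${1:-/proc/swaps}"
}

# Usage on the server:
#   if swap_in_use; then echo "swap is enabled"; else echo "swap is disabled"; fi
```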
CPU report
sar -u
If your server's average CPU utilization stays at or below 70% (see the %user column), this is considered healthy.
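Since an average can hide short spikes, it may help to list the individual intervals in which %user exceeded the threshold. The following is a sketch under the assumption that sar -u prints one row per interval with a %user column; the function name cpu_busy_intervals is hypothetical.

```shell
# Hypothetical helper: read `sar -u` output on stdin and print the timestamp
# and %user value of every interval above the limit $1 (default 70).
# The %user column is located by header name and counted from the right.
cpu_busy_intervals() {
  awk -v lim="${1:-70}" '
    /%user/ {
      for (i = 1; i <= NF; i++) if ($i == "%user") off = NF - i
      found = 1
      next
    }
    /^Average:/ { next }   # skip the summary row, only intervals are listed
    found && NF > off && $(NF - off) ~ /^[0-9]+(\.[0-9]+)?$/ && $(NF - off) + 0 > lim + 0 {
      print $1, $(NF - off)
    }
  '
}

# Usage on the server (LC_ALL=C gives 24-hour timestamps and a decimal point):
#   LC_ALL=C sar -u | cpu_busy_intervals 70
```

An empty result means no collected interval went above the limit for that day.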
Disk Usage report
sar -dh