CPU Monitoring
Last modified on March 24, 2023
Once you’ve set up your relays and/or gateways, you may wish to set up automated monitoring. This page describes two major methods of verifying the health and capacity of your StrongDM relays and gateways.
Functionality/Liveness Check
The StrongDM binary includes a configurable ‘liveness’ URL that you can use to verify that the relay/gateway is alive and functioning properly. To enable this URL:
- Docker: Add -e SDM_ORCHESTRATOR_PROBES=:9090 to the invocation. 9090 is the default port; you can replace it with any port.
- Kubernetes: The liveness check is already enabled in the Kubernetes configuration.
- Direct configuration: Add the SDM_ORCHESTRATOR_PROBES environment variable when starting the relay/gateway process, setting it to :9090 or whichever port you prefer.
Once configured, you can check http://ip-of-relay:9090/liveness, replacing 9090 with the port you configured in the environment variable. If it returns HTTP status 200, then the relay/gateway is in good health.
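For automated monitoring, the check above can be scripted. The following is a minimal sketch in Python using only the standard library; the function name relay_is_alive and the 5-second timeout are illustrative choices, not part of StrongDM. It assumes the relay is reachable over plain HTTP at the host and port you pass in.

```python
import urllib.error
import urllib.request


def relay_is_alive(host: str, port: int = 9090, timeout: float = 5.0) -> bool:
    """Return True if the liveness URL answers with HTTP status 200.

    `host` is the address of the relay/gateway; `port` must match the
    value given in SDM_ORCHESTRATOR_PROBES. The timeout is an assumption.
    """
    url = f"http://{host}:{port}/liveness"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or an HTTP error status.
        return False
```

Calling relay_is_alive("ip-of-relay") returns True for a healthy relay; any other outcome, including a timeout, is reported as False so that a monitoring job can alert on it.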
Relay/Gateway Capacity
The StrongDM binary is carefully designed to use a relatively constant amount of RAM, so its memory utilization should not change significantly through the process lifecycle. Because of this, StrongDM recommends watching the CPU load of the underlying machine to assess the need for additional capacity.
Load Average
The StrongDM binary will use all available CPUs. If you note that more than 50% of your CPU cores are constantly saturated, then this is a good indication that it is time to scale up.
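One way to approximate this check on a Unix host is to compare the 15-minute load average with the number of CPU cores. The sketch below is an assumption about how to apply the 50% guideline, not a StrongDM tool; the function names are hypothetical. Note that on Linux the load average also counts tasks waiting on disk I/O, so it is only a rough proxy for CPU saturation.

```python
import os


def load_per_core(load_average: float, cores: int) -> float:
    """Return the load average divided by the number of CPU cores."""
    return load_average / cores


def needs_more_capacity(threshold: float = 0.5) -> bool:
    """Return True if the 15-minute load per core exceeds `threshold`.

    The default of 0.5 mirrors the 50% guideline; the choice of the
    15-minute window to represent "constantly" is an assumption.
    """
    cores = os.cpu_count() or 1
    load_15min = os.getloadavg()[2]  # Unix only: (1 min, 5 min, 15 min)
    return load_per_core(load_15min, cores) > threshold
```

For example, a 15-minute load average of 6.0 on an 8-core machine gives a load per core of 0.75, which is above the 0.5 threshold.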
CPU Time of sdm Process
If you notice that the CPU time of the sdm process is increasing faster than real time (for instance, if it uses 30 hours of CPU time in 15 hours of real time), then this is another indication that it is time to scale up capacity.
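The comparison can be expressed as a ratio of CPU time to real time, where a value above 1.0 means CPU time is growing faster than real time. The following is a minimal sketch with hypothetical function names. The helper that reads /proc is Linux-specific and assumes you already know the PID of the sdm process.

```python
import os


def cpu_to_real_time_ratio(cpu_seconds: float, real_seconds: float) -> float:
    """Return CPU time consumed per unit of real (wall-clock) time."""
    return cpu_seconds / real_seconds


def process_cpu_seconds(pid: int) -> float:
    """Return total user + system CPU time of a process (Linux only)."""
    with open(f"/proc/{pid}/stat") as stat_file:
        stat = stat_file.read()
    # The command name (field 2) may contain spaces, so split after the
    # last ")". The remaining fields start at field 3 (process state).
    fields = stat.rsplit(")", 1)[1].split()
    utime_ticks = int(fields[11])  # field 14: user-mode clock ticks
    stime_ticks = int(fields[12])  # field 15: kernel-mode clock ticks
    return (utime_ticks + stime_ticks) / os.sysconf("SC_CLK_TCK")


# The example from the text: 30 hours of CPU time in 15 hours of real time.
print(cpu_to_real_time_ratio(30 * 3600, 15 * 3600))  # prints 2.0
```

To monitor a running process, sample process_cpu_seconds twice, divide the difference by the real time elapsed between the samples, and treat a sustained ratio above 1.0 as the signal described above.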