Aug 25, 2014
 

An earlier post on this blog, “Monitoring Linux systems with Munin”, gave an overview of the functionality available in this system monitoring tool.

This post gives some more details on how to properly set up alerts triggered by the different Munin plugins when the system parameters being monitored reach the configured thresholds.

1. Defining the contact email address

The command that Munin will run to send alert emails is specified in the main configuration file “/etc/munin/munin.conf”. The recipient of the emails will usually be specified as part of that command.

Example:
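A minimal sketch of such a definition, assuming a contact named “admin” (a hypothetical name), a placeholder recipient address, and a working “mail” command on the Munin server:

```
# /etc/munin/munin.conf
# "admin" is a hypothetical contact name and the address is a placeholder.
# ${var:host} is replaced by Munin with the name of the host that raised the alert.
contact.admin.command mail -s "Munin notification ${var:host}" admin@example.com
```

Munin pipes the text of the alert to this command, so any program that reads a message from standard input could be used in place of “mail”.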

2. Finding out the names of the parameters retrieved by plugins

The munin-node client executes the active plugins each time the munin server requests an update.

Each plugin retrieves the value of one or several parameters and passes them to the munin node, which in turn sends the collected information to the munin server.

The easiest way to find out the names of the parameters retrieved by a plugin is to run the plugin interactively from the command line with the “munin-run” utility.

For instance, to find out the parameters monitored by the “vmstat” plugin:
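A minimal sketch of that invocation, assuming munin-node is installed on the host and the command is run with enough privileges (typically as root). The “list_fields” helper is a hypothetical addition, not part of Munin; it only reduces the plugin output to the parameter names:

```shell
# Hypothetical helper (not part of Munin): print the parameter names found
# in plugin output lines of the form "<name>.value <number>".
list_fields() {
    sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)\.value .*$/\1/p'
}

# Run the plugin once by hand, but only if munin-run exists on this host.
if command -v munin-run >/dev/null 2>&1; then
    munin-run vmstat                 # raw output: one "<name>.value <number>" line per parameter
    munin-run vmstat | list_fields   # only the parameter names
fi
```

The numbers printed after each “.value” depend on the state of the system at the moment the plugin runs; only the names in front of “.value” matter here.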

The output from munin-run shows that vmstat retrieves the values of a parameter named “wait”, and another parameter named “sleep”.

Additional information about these parameters can be obtained by adding the “config” option to the munin-run command:
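A minimal sketch, under the same assumptions as before (munin-node installed, sufficient privileges). The “list_labels” helper is again hypothetical and not part of Munin; it keeps only the “<name>.label” lines, which describe what each parameter measures:

```shell
# Hypothetical helper (not part of Munin): print "<name>: <label>" for every
# "<name>.label <text>" line in the plugin's configuration output.
list_labels() {
    sed -n 's/^\([A-Za-z_][A-Za-z0-9_]*\)\.label \(.*\)$/\1: \2/p'
}

# Ask the plugin to describe itself, but only if munin-run exists on this host.
if command -v munin-run >/dev/null 2>&1; then
    munin-run vmstat config                 # full description: graph settings plus per-parameter attributes
    munin-run vmstat config | list_labels   # only the parameter names and their labels
fi
```

Lines that start with “graph_” describe the graph as a whole, while lines that start with a parameter name describe that parameter.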

From the output obtained with the “config” option, we can see that the “wait” parameter reports the number of processes waiting for execution (“wait.label running”), and the “sleep” parameter reports the number of processes waiting for an I/O operation to complete.

3. Defining alerts

Two thresholds can be defined for each of the parameters retrieved by a plugin:

  • A “warning” level. If the value of the parameter falls outside of the range defined for this level, a warning message is sent to the configured recipient.
  • A “critical” level. If the value of the parameter falls outside of the range defined for this level, a critical alert message is sent to the configured recipient.

These thresholds are defined in the munin.conf configuration file, inside the section specific to each node. Therefore, the thresholds specified for a given parameter can be different for each node.

For instance, to generate a warning if there are more than 10 processes in the run queue on the local node, and a critical alert if there are more than 50 processes in that queue, edit the [localhost.localdomain] section in munin.conf and add those thresholds to the “wait” parameter monitored by the “vmstat” plugin:
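A minimal sketch of how that section could look, assuming the node is already declared as [localhost.localdomain] with its usual settings. The thresholds are written here in Munin's explicit “min:max” range form with the lower bound left open, so “:10” means that any value up to 10 is acceptable:

```
# /etc/munin/munin.conf
[localhost.localdomain]
    # Existing settings for this node (address and so on) stay unchanged.
    # Format: <plugin>.<parameter>.<level> <min>:<max>
    vmstat.wait.warning :10
    vmstat.wait.critical :50
```

With these two lines, a value of 11 to 50 for “wait” falls outside the warning range only, and a value above 50 falls outside the critical range as well.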


 Posted by at 10:47 am
