Wednesday, 15 February 2017

Decoupled Puppet Monitoring Module With Automated Discovery

Deploying monitoring with Puppet can be a tedious task. A very common approach is to write a very opinionated module which hard-codes which services, ports and metrics to collect. Some write this into the service modules themselves, like rolling out Sensu configuration in the Apache module. Please don't do this.

Using this pattern, you're essentially blocked from using upstream modules that don't do monitoring this way. You're also only able to monitor services deployed using Puppet.

Another approach is to assume that Apache should be installed, or to separate the config into a subclass of the monitoring module, or to manually include the Apache monitoring code whenever Apache gets installed.

With this approach, you will only monitor services you manually specified, meaning you have to manually add monitoring for each specific service required.

But why hard-code which services exist on any given host?

My solution to this problem is to use custom Puppet facts to discover whether a certain service exists, or whether a piece of functionality is enabled.

This approach allows Apache to be installed completely decoupled from the monitoring logic, meaning Apache can be deployed by arbitrary logic, or even manually. In the Apache case, the fact checks whether it can find an Apache status page and, if one is present, metrics collection with CollectD is started.
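The status-page check described above could be sketched as a custom fact along these lines. This is a minimal sketch, not the module's actual fact: the fact name `apache_status_page`, the `ApacheStatusProbe` helper, and the default URL are all assumptions made for illustration.

```ruby
require 'net/http'
require 'uri'

# Hypothetical helper; the names here are illustrative and not taken
# from the monitoring module itself.
module ApacheStatusProbe
  # Assumed location of the machine-readable mod_status output.
  DEFAULT_URL = 'http://127.0.0.1/server-status?auto'.freeze

  # True when the body looks like mod_status "?auto" output.
  def self.status_page?(body)
    !body.nil? && body.match?(/^Total Accesses:|^BusyWorkers:/)
  end

  # Fetch the page with short timeouts so a missing service does not
  # slow down fact resolution; any failure simply means "not found".
  def self.detect(url = DEFAULT_URL)
    uri = URI.parse(url)
    response = Net::HTTP.start(uri.host, uri.port,
                               open_timeout: 2, read_timeout: 2) do |http|
      http.get(uri.request_uri)
    end
    response.is_a?(Net::HTTPSuccess) && status_page?(response.body)
  rescue StandardError
    false
  end
end

# Register the fact only when this file is loaded by Facter,
# for example from a module's lib/facter/ directory.
if defined?(Facter)
  Facter.add(:apache_status_page) do
    setcode { ApacheStatusProbe.detect }
  end
end
```

Because the fact only looks at what is actually running on the host, it gives the same answer whether Apache was installed by a Puppet module, by another tool, or by hand.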

Using the Puppet monitoring module, your node definition may now look like this:

There is one challenge in particular with this design: Puppet facts are resolved before any manifest is applied. So in a scenario where you want to both install and configure Apache to serve the server status page, and enable metrics collection from it, you have a bit of a chicken-and-egg problem.

Initially I wanted to utilize Puppet stages. By default everything happens in the "main" stage, but by adding a custom stage that runs after the main stage, you can deploy monitoring after the installation of the actual service.

The problem is that stages don't currently support refreshing facts or nested resources, so putting the monitoring module in a stage after "main" simply doesn't work.

Another issue is that facts are populated before any manifest code runs, so you'd need to refresh these facts somehow. Luckily Puppet comes with a very powerful plugin interface which allows us to play with Puppet internals through a custom library. This is then exposed to the manifest as a function to reload facts, which lets us refresh the fact values during runtime! This logic has been implemented in the puppet-refacter module, which allows for a re-run of the manifest if any facts changed during the run.
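To illustrate the idea of re-running only when relevant facts changed, here is a small standalone sketch. It is not the puppet-refacter implementation; the `changed_facts` helper and the sample fact names are hypothetical, and the fact snapshots are plain hashes rather than real Facter output.

```ruby
# Hypothetical helper: returns the names of facts that match one of the
# given patterns and whose value differs between two snapshots.
def changed_facts(before, after, patterns)
  names = before.keys | after.keys
  names.select do |name|
    patterns.any? { |pattern| pattern.match?(name) } &&
      before[name] != after[name]
  end
end

# Snapshots taken before and after the service was installed
# (sample data, for illustration only).
before = { 'apache_status_page' => false, 'kernel' => 'Linux' }
after  = { 'apache_status_page' => true,  'kernel' => 'Linux' }

changed = changed_facts(before, after, [/^apache/])
puts "re-run needed for: #{changed.join(', ')}" unless changed.empty?
```

Restricting the comparison to a set of patterns keeps unrelated facts that change on every run, such as uptime, from triggering a re-run.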

The puppet-monitoring module's alternative to stages is to use a Puppet resource collector to ensure that the monitoring tools are applied last. This is done by injecting every other available resource as a requirement, which effectively applies the monitoring code last. Example:

This code will reload facts if any service matching the Apache regex changes state, and ensure that the CollectD module gets applied afterwards.

The Puppet Monitoring module supports monitoring and metrics from a number of services already. Check it out on GitHub!

As an added bonus, feel free to use the preconfigured dashboards for the CollectD metrics supported by this module at: https://grafana.net/jskarpet