Nagios. So familiar. I feel like I’ve run Nagios at every job I have ever had.
Talk to most ops people, even at really big places, and they will probably admit to using it.
Let’s try it out, but gosh, I am SO lazy. I cannot be bothered to read the installation instructions. All I want to do is install the puppet module, add a couple of lines to my manifest, and let puppet do the rest. Then I can run puppet agent in debug mode so when my boss comes by it looks like I’m REALLY busy.
Step 1: Game plan
I’ve got a test server I know I want to be my sensu server. I know I’m going to have to enable the sensu client on the servers I want monitored. Here are my goals:
- Have sensu-server configured on my server (call it mon1)
- Have sensu-client configured on my client (call it client1)
- I want a dashboard
- I want an email alert
- I don’t want to have to ssh to my clients to do anything. (I have puppet to do that for me, duh.)
Step 2: Puppet Module
My puppet master is not mon1, but it doesn’t matter. I install the module on the puppetmaster.
Ok, good start. So… the “For Real” part in the blog post title is about those other things that most howto’s don’t mention. Unless you already have RabbitMQ and Redis installed, you will need those modules. Don’t know how to run Redis or configure RabbitMQ? It’s ok, neither do I.
Step 2A: SSL Certs
Yea, I know what you are thinking. Kyle, I already have SSL certs for my infrastructure, do I have to make another set? Yes. I think so. I’m not smart enough to use existing certs.
Joe Miller has made a pretty easy script to generate some. For RabbitMQ you can basically use a single client and server key and let puppet distribute them.
I just stick all the files in my “files/sensu” directory for puppet to distribute for me.
Step 2B: Puppet config
Here is the configuration I needed to get a full system running:
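A minimal sketch of what a server manifest along these lines might look like, assuming the sensu-puppet and puppetlabs-rabbitmq modules plus some Redis module that provides a `redis` class. The `/etc/sensu-ssl` directory, the certificate file names, the `/sensu` vhost, the `changeme` password, and the `puppet:///files/sensu/...` mount are placeholders of mine, and parameter names can differ between module versions, so check them against what you have installed.

```puppet
node 'mon1' {
  # The RabbitMQ module only takes local paths, so ship the server-side
  # certs with plain file resources first. The directory is a placeholder;
  # tighten ownership and modes so only RabbitMQ can read the key.
  file { '/etc/sensu-ssl':
    ensure => directory,
  }
  file { '/etc/sensu-ssl/cacert.pem':
    source => 'puppet:///files/sensu/cacert.pem',
  }
  file { '/etc/sensu-ssl/server_cert.pem':
    source => 'puppet:///files/sensu/server_cert.pem',
  }
  file { '/etc/sensu-ssl/server_key.pem':
    source => 'puppet:///files/sensu/server_key.pem',
  }

  class { 'rabbitmq':
    ssl        => true,
    ssl_cacert => '/etc/sensu-ssl/cacert.pem',
    ssl_cert   => '/etc/sensu-ssl/server_cert.pem',
    ssl_key    => '/etc/sensu-ssl/server_key.pem',
    require    => File[
      '/etc/sensu-ssl/cacert.pem',
      '/etc/sensu-ssl/server_cert.pem',
      '/etc/sensu-ssl/server_key.pem'
    ],
  }

  # A vhost and user for Sensu to talk to RabbitMQ with.
  rabbitmq_vhost { '/sensu':
    ensure => present,
  }
  rabbitmq_user { 'sensu':
    password => 'changeme',
  }
  rabbitmq_user_permissions { 'sensu@/sensu':
    configure_permission => '.*',
    read_permission      => '.*',
    write_permission     => '.*',
  }

  # Assumes whichever Redis module you picked exposes a 'redis' class.
  class { 'redis': }

  class { 'sensu':
    server                   => true,
    purge_config             => true,
    rabbitmq_host            => 'mon1',
    rabbitmq_vhost           => '/sensu',
    rabbitmq_password        => 'changeme',
    # The Sensu module accepts puppet:/// sources for the client certs.
    rabbitmq_ssl_private_key => 'puppet:///files/sensu/client_key.pem',
    rabbitmq_ssl_cert_chain  => 'puppet:///files/sensu/client_cert.pem',
  }
}
```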
Take note that the Sensu module lets you stick in a puppet:/// url for the certs, but the RabbitMQ module does not. Distributing them using the “file” directive is pretty easy though.
I personally believe that purge_config should default to true. We are using puppet here. If you are hand placing json, you are doing it wrong.
Step 3: Clients
With your SSL certs in place, adding clients is pretty easy:
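A client node could be sketched roughly like this, again assuming the sensu-puppet module. The `sensu-test` subscription name, the password, the vhost, and the cert paths are illustrative, and I am assuming the module enables the client by default when `server` is not set.

```puppet
node 'client1' {
  class { 'sensu':
    purge_config             => true,
    # Point the client at the RabbitMQ instance on the monitoring server.
    rabbitmq_host            => 'mon1',
    rabbitmq_vhost           => '/sensu',
    rabbitmq_password        => 'changeme',
    rabbitmq_ssl_private_key => 'puppet:///files/sensu/client_key.pem',
    rabbitmq_ssl_cert_chain  => 'puppet:///files/sensu/client_cert.pem',
    # Hypothetical subscription; checks published to it run on this host.
    subscriptions            => 'sensu-test',
  }
}
```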
Not too bad. Notice that there is nothing server-side to generate the config for this host.
After your puppet runs converge, you should be able to access the Sensu dashboard. By default it is on the sensu server; in this example it would be http://sensu:secret@mon1:8080.
If all of this is working, you should see client1 in the clients list.
Step 4: Handlers
Sensu handlers are scripts that are called with event data. For getting started I use the simplest example:
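As a rough illustration, a bare-bones email handler could be declared on the server like this, assuming the `sensu::handler` defined type from sensu-puppet and a working `mail` command on mon1. The address is a placeholder. Sensu pipes the event JSON to the command on stdin, which is why the raw JSON ends up as the message body.

```puppet
# Declared inside node 'mon1'. The recipient address is hypothetical.
sensu::handler { 'default':
  command => 'mail -s "sensu alert" ops@example.com',
}
```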
You are going to get json in your body, but we can make it pretty later.
Step 5A: Your first client-side check
This type of check is what you might consider an NRPE check; it runs on the client:
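A sketch of a cron check in this style, assuming the `sensu::check` defined type, the Nagios plugins installed under `/usr/lib/nagios/plugins` on the client, and a process that is actually named `cron` (on some distributions it is `crond`). The check is defined once on the server and published to the hypothetical `sensu-test` subscription, so the command itself executes on every subscribed client.

```puppet
# Declared inside node 'mon1'; runs on clients subscribed to 'sensu-test'.
sensu::check { 'check_cron':
  # Critical when fewer than one cron process is running.
  command     => '/usr/lib/nagios/plugins/check_procs -C cron -c 1:',
  handlers    => 'default',
  subscribers => 'sensu-test',
}
```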
Run puppet, stop cron, you should get an email.
Step 5B: Your first server-side check
Sometimes you need to have the servers do the checking. Not everything can be a client-side check. Sometimes you really do want your monitor server to be able to ping your clients (or check http, etc).
In this case, the @@ in front of the sensu check tells puppet not to actually make it on the client, just export (store) it. Then the <<| |>> collector on the server side takes those stored configs and makes them there.
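The export-and-collect pattern might be sketched like this, assuming stored configs (for example PuppetDB) are enabled on the puppet master. The tag, the `monitor-server` subscription, and the ping thresholds are made up for illustration; for the collected checks to run, the Sensu client on mon1 would also need to subscribe to `monitor-server`.

```puppet
# Inside node 'client1': export a ping check about this host, without
# creating it here. The title includes the hostname to stay unique.
@@sensu::check { "check_ping_${::hostname}":
  command     => "/usr/lib/nagios/plugins/check_ping -H ${::fqdn} -w 100,20% -c 500,60%",
  handlers    => 'default',
  subscribers => 'monitor-server',
  tag         => 'server_side_ping',
}

# Inside node 'mon1': collect every exported check carrying that tag
# and realize it on the monitoring server.
Sensu::Check <<| tag == 'server_side_ping' |>>
```

Each new client that exports a check shows up on the server after the next pair of puppet runs, with no hand editing on mon1.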
Sensu is still new, but it shows a lot of promise. It is built from the ground up to be configured by machines, not by humans. It is also designed to scale, allowing you to grow your RabbitMQ cluster and your Sensu-servers at will.
Absent from Sensu (at the time of this writing) is the infrastructure for complicated time periods, escalations, etc. Maybe it is better that way? It does feel a little more unixy, with each individual Sensu piece handling a very particular function.
Not mentioned in this post: managing subscriptions, making new handlers, adding mutators, supplementing the checks with metrics and having Sensu ship them off to a metric system, sensu-admin, having Sensu automatically detect downed AWS nodes and not alert on them, etc.
In the brave new elastic-compute-config-management-controlled world, Sensu looks like a much better option than Nagios in my opinion.