Nagios + DNX
From Nagios Wiki
Why Use DNX?
Nagios uses active and passive checks. Once you have tens of thousands of active checks or more, you have to distribute them across several servers, and configuring that by hand can be rather cumbersome.
DNX addresses this issue by letting you configure one main Nagios machine and as many worker boxes as needed, so that you can easily distribute these active checks.
Installing Nagios
There are plenty of guides on how to set up and configure Nagios itself, so this quick how-to will not cover that. Note, however, that this setup was tested on 32-bit CentOS 5.2 with Nagios 3.0.3. I will also provide a sample rpm spec file in the files section so that you can build your own rpm or, if you are on another OS, see the configure options.
Installing DNX
You can find the rpm spec file for DNX 0.18 in the files section of this wiki page. The prerequisites are listed, so it should not be too much trouble to build and install DNX.
Configuring DNX Server
The DNX server must run on the same machine as your Nagios server. You should read through the documentation on how best to configure DNX for your environment, but in its simplest form the configuration file, /etc/dnx/dnxServer.cfg, will look something like this:
authWorkerNodes = 192.168.100.45,192.168.110.45
syncScript = /usr/lib/nagios/plugins/sync_plugins.pl -h 192.168.100.45,192.168.110.45
There are more options to play around with, but these are the only two lines you must edit in your config file. The first line tells DNX where its worker nodes are located, and the second line syncs the plugins from your DNX server to the DNX workers. The sync script assumes that you have set up SSH keys between the server and the workers, so make sure to create those. As you can see, there are two worker nodes in this configuration, each on a different subnet. I found that one DNX worker per colocation site was plenty for our needs, but your mileage may vary depending on hardware.
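Because the same worker list appears on both lines, it is easy for the two to drift apart when a worker is added. The following is a minimal sketch, not part of DNX, that generates both lines from one list. The function name `dnx_server_lines` is hypothetical; the option names, script path, and `-h` flag are taken from the example above.

```python
def dnx_server_lines(workers, sync_script="/usr/lib/nagios/plugins/sync_plugins.pl"):
    """Build the two dnxServer.cfg lines from a single list of worker IPs.

    The helper itself is illustrative; only the option names and the
    script path come from the example configuration in this how-to.
    """
    if not workers:
        raise ValueError("at least one worker node is required")
    node_list = ",".join(workers)
    return [
        f"authWorkerNodes = {node_list}",
        f"syncScript = {sync_script} -h {node_list}",
    ]


# Print the lines for the two workers used in this how-to.
for line in dnx_server_lines(["192.168.100.45", "192.168.110.45"]):
    print(line)
```

Adding a worker then means changing one list rather than editing two comma-separated strings by hand.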
Configuring Nagios for DNX
Once Nagios is installed, you will need to edit /etc/nagios/nagios.cfg (assuming this is where your config files are).
To enable DNX, add or edit the following lines:
event_broker_options=-1
broker_module=/usr/lib/nagios/brokers/dnxServer.so
This will have Nagios send all of its checks to your DNX worker nodes.
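Before restarting Nagios, it can help to confirm that both settings are really present and not commented out. The sketch below is one way to do that, assuming the usual `key=value` layout of nagios.cfg with `#` comments. The function name `dnx_broker_enabled` is hypothetical, and it only inspects text you pass in; it does not read or modify any file.

```python
def dnx_broker_enabled(cfg_text):
    """Return True if the text sets event_broker_options=-1 and loads dnxServer.so.

    Assumes plain key=value lines with '#' comments; this is an
    illustrative check, not a full nagios.cfg parser.
    """
    settings = {}
    modules = []
    for raw in cfg_text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = (part.strip() for part in line.split("=", 1))
        if key == "broker_module":
            # broker_module may be listed more than once, so collect them all.
            modules.append(value)
        else:
            settings[key] = value
    # Compare only the module path, ignoring anything after the first space.
    has_module = any(m.split()[0].endswith("dnxServer.so") for m in modules if m)
    return settings.get("event_broker_options") == "-1" and has_module


sample = """\
event_broker_options=-1
broker_module=/usr/lib/nagios/brokers/dnxServer.so
"""
print(dnx_broker_enabled(sample))
```

To check a real installation, pass in the contents of /etc/nagios/nagios.cfg (or wherever your config lives).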
Installing DNX Worker Nodes
The rpm spec file provided in the files section will also create rpms for the worker nodes. Once you have built them, installing should be as simple as rpm -i.
Configuring DNX Worker Nodes
To configure DNX worker nodes, edit /etc/dnx/dnxClient.cfg:
channelDispatcher = udp://192.168.100.105:12480
channelCollector = udp://192.168.100.105:12481
Again, there are more options than this to play with, but these are the basics you need to get the DNX workers communicating with the DNX server. This example assumes that your Nagios+DNX server resides at 192.168.100.105.
After this, start DNX with the init script included with the rpm.
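A common slip is pointing one of the two channels at the wrong address. The sketch below parses the channel lines with Python's standard `urllib.parse.urlsplit` and checks that they all target the same server. The helper names `parse_channel` and `channels_point_to` are hypothetical; the option names and URLs come from the example above.

```python
from urllib.parse import urlsplit


def parse_channel(line):
    """Split 'name = scheme://host:port' into (name, scheme, host, port)."""
    key, _, value = line.partition("=")
    url = urlsplit(value.strip())
    return key.strip(), url.scheme, url.hostname, url.port


def channels_point_to(lines, server_ip):
    """Return True if every channel line targets the given server address."""
    return all(parse_channel(line)[2] == server_ip for line in lines)


client_cfg = [
    "channelDispatcher = udp://192.168.100.105:12480",
    "channelCollector = udp://192.168.100.105:12481",
]
for entry in client_cfg:
    print(parse_channel(entry))
print(channels_point_to(client_cfg, "192.168.100.105"))
```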
Finishing The Setup
If everything was configured correctly, you should be able to start Nagios using the init script provided with the rpm. Nagios will load the DNX module on startup, and you can enjoy the benefits of distributed checks.
Confirming Your Success
To confirm that DNX was loaded successfully by Nagios you can tail the log file located at /var/log/nagios/nagios.log. If everything worked as it should, you should see something along the lines of:
[1226122327] Nagios 3.0.3 starting... (PID=10724)
[1226122327] Local time is Sat Nov 08 05:32:07 UTC 2008
[1226122327] LOG VERSION: 2.0
[1226122327] Event broker module '/usr/lib/nagios/brokers/dnxServer.so' initialized successfully.
Also, if it doesn't work your checks will fail, which is a pretty good indicator that you did something wrong.
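If you would rather script this check than watch the log by eye, something like the following sketch could work. It assumes the log line has exactly the form shown in the sample above; the function name `loaded_broker_modules` is hypothetical.

```python
import re

# Matches the success line shown in the sample log output above.
BROKER_OK = re.compile(
    r"^\[(\d+)\] Event broker module '([^']+)' initialized successfully\."
)


def loaded_broker_modules(log_text):
    """Return the paths of broker modules the log reports as initialized."""
    paths = []
    for line in log_text.splitlines():
        match = BROKER_OK.match(line.strip())
        if match:
            paths.append(match.group(2))
    return paths


sample_log = """\
[1226122327] Nagios 3.0.3 starting... (PID=10724)
[1226122327] Local time is Sat Nov 08 05:32:07 UTC 2008
[1226122327] LOG VERSION: 2.0
[1226122327] Event broker module '/usr/lib/nagios/brokers/dnxServer.so' initialized successfully.
"""
print(loaded_broker_modules(sample_log))
```

An empty result for your real log would suggest that the module did not load, or that your Nagios version words the message differently.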
Gotchas
I had severe issues running Nagios + DNX on CentOS x86_64. I finally had to roll back to 32-bit, which, to be quite honest, is perfectly fine with me.
Also keep in mind that your plugins should be on every single server, DNX server and workers included.
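One way to spot a server that is missing plugins is to compare directory listings. The sketch below assumes that you have already collected the plugin file names from each machine (for example, by listing the plugin directory over SSH). The host names and the function name `missing_plugins` are hypothetical, and the plugin names are only sample data.

```python
def missing_plugins(inventory):
    """Map each host to the plugins that other hosts have but it lacks.

    inventory maps a host name to the set of plugin file names found there.
    Hosts that have every plugin are left out of the result.
    """
    all_plugins = set().union(*inventory.values())
    report = {}
    for host, have in inventory.items():
        missing = sorted(all_plugins - have)
        if missing:
            report[host] = missing
    return report


# Hypothetical listings gathered from the DNX server and two workers.
inventory = {
    "dnx-server": {"check_ping", "check_http", "check_disk"},
    "worker-1": {"check_ping", "check_http", "check_disk"},
    "worker-2": {"check_ping", "check_disk"},
}
print(missing_plugins(inventory))
```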
Files Section
The wiki wouldn't allow me to upload non image files, so here are links to the source RPMs:

