Auto-healing monitoring alert system: part 02

Sumudu Nissanka
Jan 4, 2019

(My Experience with Icinga2 and Capistrano3)

Hi, my friends,

Here is a new tutorial in my article series. Today I'll guide you through the main steps of building our system. I recommend using a Vagrant box so you can build a sample system yourself. For this tutorial I installed Icinga 2 and Capistrano on two instances.

Step 01

Create a master-agent setup using Icinga 2. If you are new to Icinga 2, follow its documentation; it will help you build the Icinga server. Here I used the Icinga 2 master to monitor two clients, which I named client1 and client02.

Step 02

Set up your Capistrano server. If you are new to Capistrano, use these steps:

  • Install Capistrano with gem install capistrano
  • Create the Capistrano project with the cap install command (this command is for Capistrano version 3)

Step 03

Create the Capistrano file. All of the code here is written in Ruby. The folder structure is shown below. autoheal.rb is the Ruby file that we are going to implement to do the work.

[Figure: terminal output showing the folder structure of the cap project]

Implement the main code for autoheal.rb and review it.

require 'net/http'
require 'uri'
require 'openssl'
require 'json'
require 'pp'
require 'net/smtp'
require 'thread'
# Function 1: send a notification via email
def sendemail(service, host, name)
  # Call the Icinga 2 API to get the service details after applying the solution
  uri = URI.parse("https://192.168.30.44:5665/v1/objects/services?service=#{name}")
  request = Net::HTTP::Get.new(uri)
  request.basic_auth("root", "icinga") # Icinga credentials for the API
  req_options = {
    use_ssl: uri.scheme == "https",
    verify_mode: OpenSSL::SSL::VERIFY_NONE,
  }
  response = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
    http.request(request)
  end
  res = JSON.parse(response.body)
  status = res["results"][0]["attrs"]["last_check_result"]["exit_status"]

  # Fill in the From and To addresses and give your credentials for the Gmail server.
  # The blank line after Subject separates the mail headers from the body.
  message = <<MESSAGE_END
From: from@gmail.com
To: toemail@gmail.com
Subject: Icinga 2 notifications

This is a test e-mail message for ruby
result exit status : #{status}
service : #{service},
host address : #{host} .
MESSAGE_END
  smtp = Net::SMTP.new 'smtp.gmail.com', 587
  smtp.enable_starttls
  smtp.start('gmail.com', 'username@gmail.com', 'password', :login) do |smtp|
    # Replace these with the relevant From and To email addresses
    smtp.send_message message, 'from@gmail.com', 'to@gmail.com'
  end
end

# Function 2
# Select the solution for the given service. This function handles checks that run through the NRPE check command.
def selectservice(service, hostip, servicename)
  if service == "check_apt"
    server "#{hostip}", roles: [:web]
    Rake::Task['captest:checkip'].invoke()
    Rake::Task['captest:checkip'].reenable
    sleep(60)
    sendemail(service, hostip, servicename)
  elsif service == "check_load"
    server "#{hostip}", roles: [:caphost]
    Rake::Task['captest2:hostname'].reenable
    Rake::Task['captest2:hostname'].invoke()
    sleep(60)
    sendemail(service, hostip, servicename)
  elsif service == "check_disk"
    server "#{hostip}", roles: [:cdisk]
    Rake::Task['capdisk:cleandisk'].reenable
    Rake::Task['capdisk:cleandisk'].invoke()
    # Thread.new do
    sleep(60)
    sendemail(service, hostip, servicename)
    # end
  else
    puts service
  end
end
# Function 3
# Select the solution for the given service when it does not run through the NRPE command.
def withoutnrpeservice(service, hostip, servicename)
  if service == "apt"
    # Put the namespace and task name in the format below:
    # Rake::Task['namespace:taskname']
    server "#{hostip}", roles: [:web]
    Rake::Task['captest:checkip'].invoke()
    Rake::Task['captest:checkip'].reenable
    sleep(60)
    sendemail(service, hostip, servicename)
  elsif service == "http"
    server "#{hostip}", roles: [:caphost]
    Rake::Task['captest2:hostname'].reenable
    Rake::Task['captest2:hostname'].invoke()
    sleep(60)
    sendemail(service, hostip, servicename)
  elsif service == "check_disk"
    server "#{hostip}", port: 22, roles: [:cdisk]
    Rake::Task['capdisk:cleandisk'].reenable
    Rake::Task['capdisk:cleandisk'].invoke()
    # Thread.new do
    sleep(60)
    puts "thread done and dusted"
    sendemail(service, hostip, servicename)
    # end
    puts service
    puts hostip
  else
    puts service
  end
end
#************************************** main code **************************************
# Call the Icinga 2 API to get the host names and IP addresses. Change the IP address to match your Icinga master.
uri = URI.parse("https://192.168.30.44:5665/v1/objects/hosts")
request = Net::HTTP::Get.new(uri)
request.basic_auth("root", "icinga")
req_options = {
use_ssl: uri.scheme == "https",
verify_mode: OpenSSL::SSL::VERIFY_NONE,
}
response = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
http.request(request)
end
res = JSON.parse(response.body)
# Create a dictionary called hosts to store each host name and its IP address
$hosts = {}
res["results"].each do |h|
  $hosts[h["attrs"]["__name"]] = h["attrs"]["address"]
end
# Get the services from the Icinga 2 API. Change the IP address to match your Icinga server.
uri = URI.parse("https://192.168.30.44:5665/v1/objects/services?filter=service.state!=ServiceOK&pretty=1")
request = Net::HTTP::Get.new(uri)
request.basic_auth("root", "icinga")
req_options = {
use_ssl: uri.scheme == "https",
verify_mode: OpenSSL::SSL::VERIFY_NONE,
}
response = Net::HTTP.start(uri.hostname, uri.port, req_options) do |http|
http.request(request)
end
res = JSON.parse(response.body)
res["results"].each do |x|
  $stdout.print x["attrs"]["__name"]
  $stdout.print " "
  #$stdout.print x["attrs"]["host_name"]
  $stdout.print " "
  $stdout.print x["attrs"]["last_check_result"]["output"]
  $stdout.print "\n"
  # Get the service name (it includes the host)
  $servicename = x["attrs"]["__name"]
  # Get the host name from the JSON result
  $hostname = x["attrs"]["host_name"]
  # Look up the IP address in the hosts dictionary
  $hostip = $hosts[$hostname]
  # Store the exit status in y (0 = OK, 1 = WARNING, 2 = CRITICAL)
  y = x["attrs"]["last_check_result"]["exit_status"]
  # The check command tells us whether the service runs through NRPE or not
  $command = x["attrs"]["check_command"]
  begin
    case y
    when 0
      $stdout.print "OK\n"
    # WARNING status
    when 1
      if $command == "nrpe"
        selectservice(x["attrs"]["vars"]["nrpe_command"], $hostip, $servicename)
      else
        withoutnrpeservice($command, $hostip, $servicename)
      end
    # CRITICAL status: uncomment the block below if you want to handle it too
    when 2
      # Continue the loop with the next service when the status is critical
      next
      # if $command == "nrpe"
      #   selectservice(x["attrs"]["vars"]["nrpe_command"], $hostip, $servicename)
      # else
      #   withoutnrpeservice($command, $hostip, $servicename)
      # end
    else
      # Raise an error the rescue below can catch (Exception would bypass it)
      raise IOError, "No matching status for exit status #{y}"
    end
  rescue IOError => msg
    puts msg
  end
end
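
The main loop only reads a handful of fields from the Icinga 2 responses. As a minimal sketch, that parsing could be pulled into two small functions that can be tried without a running Icinga server. The function names, the sample host address, and the sample JSON are hypothetical; only the field names are taken from the listing above.

```ruby
require 'json'

# Build a { host name => address } lookup from a parsed hosts response.
def host_addresses(hosts_response)
  hosts_response.fetch('results', []).each_with_object({}) do |host, lookup|
    attrs = host['attrs']
    lookup[attrs['__name']] = attrs['address']
  end
end

# Reduce a parsed services response to the fields the script acts on.
def failing_services(services_response, addresses)
  services_response.fetch('results', []).map do |svc|
    attrs = svc['attrs']
    command = attrs['check_command']
    {
      name: attrs['__name'],
      host_ip: addresses[attrs['host_name']],
      exit_status: attrs.dig('last_check_result', 'exit_status'),
      # NRPE checks carry the real check name in vars.nrpe_command.
      check: command == 'nrpe' ? attrs.dig('vars', 'nrpe_command') : command
    }
  end
end

# Hypothetical sample data, shaped like the fields read in autoheal.rb.
hosts_json = '{"results":[{"attrs":{"__name":"client1","address":"192.168.30.45"}}]}'
services_json = <<~SAMPLE
  {"results":[{"attrs":{"__name":"client1!disk","host_name":"client1",
    "check_command":"nrpe","vars":{"nrpe_command":"check_disk"},
    "last_check_result":{"exit_status":1,"output":"DISK WARNING"}}}]}
SAMPLE

addresses = host_addresses(JSON.parse(hosts_json))
services = failing_services(JSON.parse(services_json), addresses)
puts services.first[:check]
```

Keeping the parsing separate from the HTTP calls makes it possible to check the lookup logic against saved JSON before pointing the script at a real master.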

After creating the main Ruby file, you need to create a task file that contains the solution for each service.
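
Since every service gets its own task, one possible way to keep the service-to-task mapping in a single place is a lookup table instead of an if/elsif chain. This is only a sketch: REMEDIES and remedy_for are hypothetical names, and the task names and roles are copied from the branches of selectservice in autoheal.rb.

```ruby
# Hypothetical lookup table: check name => [Capistrano task name, role].
REMEDIES = {
  'check_apt'  => ['captest:checkip', :web],
  'check_load' => ['captest2:hostname', :caphost],
  'check_disk' => ['capdisk:cleandisk', :cdisk]
}.freeze

# Return [task_name, role] for a check, or nil when no fix is registered.
def remedy_for(check)
  REMEDIES[check]
end

task_name, role = remedy_for('check_disk')
puts "#{task_name} runs on role #{role}" # prints "capdisk:cleandisk runs on role cdisk"
puts remedy_for('check_http').inspect    # prints "nil"
```

With a table like this, adding a fix for a new service means adding one entry and one task file, while the code that registers the server and invokes the task stays the same.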

I named my task file clean_disk because it relates to the check_disk service.

[Figure: folder structure with the task file]
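
For orientation, here is a rough sketch of what such a task file might contain. The namespace, task name, and role match the Rake::Task['capdisk:cleandisk'] call and the :cdisk role used in autoheal.rb; the file path, the description, and the clean-up command are assumptions, not the contents of the original file.

```ruby
# lib/capistrano/tasks/clean_disk.rake (hypothetical path and contents)
require 'rake' # harmless under Capistrano; lets the file load on its own

namespace :capdisk do
  desc 'Free disk space on hosts that have the :cdisk role'
  task :cleandisk do
    # `on`, `roles` and `execute` come from Capistrano/SSHKit and are only
    # resolved when the task actually runs through cap.
    on roles(:cdisk) do
      # Assumed clean-up command; replace it with whatever suits your hosts.
      execute :sudo, 'apt-get', 'clean'
    end
  end
end
```

Rake runs a task at most once per process unless it is re-enabled, which is why autoheal.rb calls reenable alongside invoke: the same fix may be needed for more than one host in a single run.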

Stay tuned for my next article, where I will explain this autoheal.rb file so you can get a clear idea of how it works. Thank you and good luck, all :)

For part 01 click here


Sumudu Nissanka

Software Engineer @wso2 | Graduate @University of Colombo School of Computing | Former DevOps intern @wso2