Newsletter | Forum | Contact Us | News | Products | Downloads | Support | Partners |
Service Level Agreement management with LUA scripting

Introduction

The SLA management feature is available in the LoriotPro Extended Edition. With this set of capabilities, LoriotPro helps you check that the current quality of your information system (availability and performance counters) meets a predefined level of service, an SLA.

A quality of service indicator is expressed as a percentage (of availability or performance) over a defined time period, such as a day or a week. For example, 99% availability over a one-week period (604,800 seconds) means that the host was unreachable for a cumulative time of at most 6,048 seconds. Likewise, 99% performance over a one-week period means that the response time exceeded the value set by the agreement for at most 6,048 seconds.

The SLA support offered by LoriotPro provides predefined SLA reports. You can also define your own reports and automate report generation with scripts written in the LUA language embedded in the LoriotPro Extended Edition.

One of the goals of LoriotPro is to monitor the availability of the hosts defined in the directory. This task is performed by a LoriotPro embedded module called the poller process. The poller process sends packets (ICMP ping or SNMP ping) at regular intervals to each monitored host. If the host configuration allows SLA data collection, the results of this permanent polling are stored in a local database. This database contains two basic pieces of information: whether the host answers our requests and how long it takes to get the answers. These two values allow us to check whether the host availability and performance meet a predefined quality of service, our Service Level Agreement.

The SLA database is composed of simple daily text files that contain the returned values of the ping requests, each with its timestamp. Database file names are also time stamped to ease the analysis of data over a time period.
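As a quick check of the arithmetic behind the percentages given in the introduction, the sketch below converts an SLA target into the downtime budget it implies. It is plain Lua with no LoriotPro calls, and the function name allowed_downtime_seconds is a hypothetical helper, not part of the product.

```lua
-- Downtime budget implied by an SLA percentage over a reporting period.
-- Plain Lua; allowed_downtime_seconds is a hypothetical helper name.
local SECONDS_PER_DAY = 24 * 60 * 60

-- sla_percent: target such as 99; period_days: length of the period in days.
local function allowed_downtime_seconds(sla_percent, period_days)
  local period_seconds = period_days * SECONDS_PER_DAY
  return period_seconds * (100 - sla_percent) / 100
end

print("99% over one week:", allowed_downtime_seconds(99, 7), "seconds")
print("99% over one day:", allowed_downtime_seconds(99, 1), "seconds")
```

The same function applies to a performance target: the result is then the cumulative time during which the response time may exceed the agreed threshold.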
The LUA scripting language and the dedicated SLA script functions allow you to exploit the database and create your own SLA reports.

Starting the SLA data collection

The SLA data collection is started in the host configuration screen.
SLA Database architecture

If you plan to use the data and the script functions for SLA management, you should understand the structure of the SLA database set up by LoriotPro. LoriotPro uses a directory structure to store the SLA database files. The SLA directory is located in the /bin directory of the LoriotPro installation path.
By default LoriotPro is installed in C:\Program Files\LUTEUS\LoriotPro V4\ and the SLA database files are stored in \bin\SLA, so the default location is C:\Program Files\LUTEUS\LoriotPro V4\bin\SLA.

The first subdirectory level identifies the LoriotPro software that performed the data collection. Every installed LoriotPro software receives a unique ID. The level underneath contains subdirectories named after the IP addresses of the hosts that have SLA collection activated. SLA activation is performed host by host in each host's advanced parameters. By default the poller process of LoriotPro does not store SLA data.

In each directory named after an IP address, LoriotPro creates the database files, one for each day of data collection. These directories can also include other directories named after the port number of an application or a URL. This capability will be used in future development.
The file structure is always the same and allows fast and simple data analysis. Files do not contain a host reference, only data.

SLA database file structure

The database file structure follows strict rules.

File name coding

Database files use the current date as their name, one file per day of the year. They are plain text files named as follows:

Year_Month_Day.txt (Year 4 digits, Month 2 digits, Day 2 digits)
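The directory layout and file naming described above can be combined into a small path builder. This is a minimal sketch: the helper name sla_file_path is hypothetical, and the layout SLA root \ LoriotPro ID \ host IP \ Year_Month_Day.txt is inferred from the description, not taken from LoriotPro code. The sample values (ID 1007, host 123.1.1.1, 22 October 2005) are those of the document's own example.

```lua
-- Hypothetical helper: expected path of one daily SLA database file.
-- Assumed layout: <SLA root>\<LoriotPro ID>\<host IP>\<YYYY_MM_DD>.txt
local function sla_file_path(sla_root, loriotpro_id, host_ip, year, month, day)
  -- Year on 4 digits, month and day on 2 digits, separated by underscores.
  local file_name = string.format("%04d_%02d_%02d.txt", year, month, day)
  return table.concat({ sla_root, tostring(loriotpro_id), host_ip, file_name }, "\\")
end

-- Default installation path quoted in the text.
local root = "C:\\Program Files\\LUTEUS\\LoriotPro V4\\bin\\SLA"
print(sla_file_path(root, 1007, "123.1.1.1", 2005, 10, 22))
```

Because the file name sorts chronologically as text, a script can walk a date range by building one path per day and skipping the files that do not exist.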
In the example above, the file is the SLA database file of 22 October 2005 for the host with IP address 123.1.1.1, polled by ICMP/SNMP on the LoriotPro with ID 1007.

SLA database files description

A file is composed of successive lines; each line has the following format:

timestamp;polling type;response time or status information

1110629680;1;start

timestamp is the time of the sample (number of seconds elapsed since 1970).

Polling type defines what kind of packet or request is used for the polling (1 for ICMP ping, 2 for SNMP ping).
If both ping and SNMP are used for polling, types 1 and 2 may both appear in the file.

Response time or status information: if the polled host answers, the response time in milliseconds is stored. Otherwise a status message may be present.
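The line format described above is simple enough to parse with one Lua pattern. The sketch below assumes Lua 5.1 or later (for string.match); the function name parse_sla_line and the returned field names are hypothetical. It also assumes that a non-negative number is a response time and that anything else (text such as start, or a value such as -1) is status information, which the text suggests but does not state outright.

```lua
-- Parse one SLA record of the form "timestamp;polling type;value".
-- Returns a table, or nil when the line does not match the format.
local function parse_sla_line(line)
  local ts, ptype, value = string.match(line, "^%s*(%d+);(%d+);(.-)%s*$")
  if not ts then
    return nil
  end
  local record = { timestamp = tonumber(ts), polling_type = tonumber(ptype) }
  local rtt = tonumber(value)
  if rtt and rtt >= 0 then
    record.response_ms = rtt   -- host answered: response time in milliseconds
  else
    record.status = value      -- assumption: everything else is status information
  end
  return record
end

-- Sample line taken from the document.
local r = parse_sla_line("1110629680;1;start")
print(r.timestamp, r.polling_type, r.status)
```

Keeping the response time and the status in separate fields lets later code test record.response_ms directly instead of re-parsing the third column.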
How to estimate the polling interval

1129991441;1;start

In the example above, the SLA collection starts in types 1 and 2 (ICMP ping and SNMP ping) and the polling interval is:

1129991466 - 1129991450 = 16 seconds

In this case we take the timestamps of line 5 and line 4, which are good polling entries, to find the polling interval. The timestamp of status information such as start and stop is not linked to the polling period and cannot be used to calculate the polling interval.

In the next example, the host stops responding to SNMP polling, so the poller process switches to ICMP polling (double polling activated in the host properties). Double polling is one of the poller features. Polling can use either SNMP ping or ICMP ping. If both polling methods are turned on, only SNMP polling is used, but if the host fails to answer SNMP polling, ICMP polling is used instead. A host's SNMP agent can be stopped, so that SNMP requests are no longer answered, while the host itself is still working and available.

The host is answering SNMP ping requests; the polling type is 2.

1110103926;1;start

The host stops answering SNMP polling (2) but still answers ping polling (1). If the administrator stops the SLA collection for this host, a stop status message is recorded.

1129991866;2;-1

Abnormal collection, holes in the data collection

If records are missing at some time intervals, the cause is probably a process crash. Restart the SLA collection for this host.

1110103926;1;start

In the example, the SLA collection starts at 1110103926. The polling interval is 1110103960 - 1110103943 = 17 seconds, but the next values show that the polling interval varies. LoriotPro may be unable to perform the polling on time because of an overload.

Abnormal termination of LoriotPro

If LoriotPro stops because of a system crash, the database file is not closed correctly.

1110103926;1;start

In the example above there is a hole in the collection between 1110105908 (start) and 1110104432.
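The two checks described above, estimating the polling interval from good polling entries and spotting holes in the collection, can be sketched in plain Lua (5.1 or later). The function names, the record layout and the choice of the median gap are assumptions made for illustration, and the response times in the sample data are invented; only the timestamps 1129991441, 1129991450 and 1129991466 come from the document.

```lua
-- Each record is { timestamp = <seconds>, value = <third column as a string> }.

-- Assumption: a non-negative number in the third column is a good polling entry.
local function is_good_sample(record)
  local n = tonumber(record.value)
  return n ~= nil and n >= 0
end

-- Median gap between consecutive good samples, or nil if there are fewer than two.
local function estimate_interval(records)
  local gaps, previous = {}, nil
  for _, record in ipairs(records) do
    if is_good_sample(record) then
      if previous then gaps[#gaps + 1] = record.timestamp - previous end
      previous = record.timestamp
    else
      previous = nil   -- status entries (start, stop, -1) are not tied to the period
    end
  end
  if #gaps == 0 then return nil end
  table.sort(gaps)
  return gaps[math.floor((#gaps + 1) / 2)]
end

-- Gaps between consecutive records that exceed factor times the interval.
local function find_holes(records, interval, factor)
  local holes = {}
  for i = 2, #records do
    local gap = records[i].timestamp - records[i - 1].timestamp
    if gap > interval * factor then
      holes[#holes + 1] = { from = records[i - 1].timestamp,
                            to = records[i].timestamp, seconds = gap }
    end
  end
  return holes
end

-- Hypothetical sample data (response times are invented).
local records = {
  { timestamp = 1129991441, value = "start" },
  { timestamp = 1129991450, value = "15" },
  { timestamp = 1129991466, value = "16" },
  { timestamp = 1129991482, value = "14" },
  { timestamp = 1129991600, value = "15" },  -- 118 s after the previous record
}
local interval = estimate_interval(records)
print("estimated interval:", interval)
for _, hole in ipairs(find_holes(records, interval, 3)) do
  print("hole from", hole.from, "to", hole.to, "(" .. hole.seconds .. " s)")
end
```

The median is used instead of a single difference because, as noted above, the interval can drift when LoriotPro is overloaded; one slow poll then does not distort the estimate.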
The polling process can be stopped. In this case the start_polling status information is used.

1129993499;2;-1

Double polling can be stopped in the host properties. Example with one polling type stopped:

1129993675;1;-1

Summary

The SLA database file structure is more complex when double polling (ping and SNMP) is turned on. In the following example, various actions performed in LoriotPro stop and start the SLA data collection.

1130404775;1;start

Remark: After each start the provided value is not usable because the interruption time is totally random.

Using LUA script functions to exploit the SLA database

Introduction

The LoriotPro embedded LUA script language can be used to exploit the data stored in the SLA database. Among the functions provided, you will find:

A function that lists the LoriotPro software instances that are collecting SLA data.
A function that lists the hosts that have SLA data.
A function that computes the SLA indicator over a time period.

The functions are stored in the lpsla library, which should be included in every LUA script that uses them.

V400 b138 SP0-cf 31 May 2006: The lua_lp_sla.dll file is the library and should be loaded in each LUA script file that uses the SLA functions.

    if (lp.IsDebugMode()==1) then
        lib,init=lp.LoadLibrary(lp.GetPath().."/lua_lp_slad.dll","libinit");
    else
        lib,init=lp.LoadLibrary(lp.GetPath().."/lua_lp_sla.dll","libinit");
    end

List of SLA functions available in the LUA script language

number=lpsla.GetLoriotProIDList('array');

This function gets the list of the LoriotPro software instances involved in SLA data collection. Each LoriotPro software is identified by its unique ID.

'array' : an array of the available LoriotPro software IDs.
number : the number of directories available.
array[0] .. array[number-1] : the list itself.

number=lpsla.GetSLAList('LoriotProID','array');

This function provides the list of hosts, by IP address, that have SLA data available for a specific LoriotPro software ID.

'LoriotProID' : an ID (the ID is defined in the license information file /bin/licence.ini).
'array' : an array with the list of host IP addresses available for this LoriotPro software ID.
number : the number of available hosts with the SLA feature activated.
array[0] .. array[number-1] : the list itself.

value=lpsla.Compute('id','sla_rep',Syear,Smonth,Sday,Eyear,Emonth,Eday,STime,ETime,RTT_Threshold,Avaibility,Performance,'array')

This function calculates the current service level over a time period. The return value is a table with the calculated values.
'id' : a LoriotPro ID
'sla_rep' : the directory where the database files are stored (ID/SLA)
Syear : the starting year
Smonth : the starting month (1 - 12)
Sday : the starting day (1 - 31)
Eyear : the ending year
Emonth : the ending month
Eday : the ending day
STime : the timestamp of the beginning time (OS binary format: os.time{year=2006,month=5,day=30,hour=0})
ETime : the timestamp of the ending time (OS binary format: os.time{year=2006,month=6,day=30,hour=0})
RTT_Threshold : the limit defined by the agreement for the response time
Avaibility : the rate limit defined for the availability
Performance : the performance limit expectation
'array' : an array of calculated data

The array has the following fields, as read by the example scripts: ip, name, polling_type, periode, avaibility, performance, good_polling, total_collected, total_waited.
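The STime and ETime arguments are ordinary results of the standard os.time function, so a time window can be built without any LoriotPro call. The sketch below builds a one-day window; the helper name day_window is hypothetical, and the source does not say whether lpsla.Compute treats the ending time as inclusive or exclusive, so the choice of "start of the next day" as the end is an assumption.

```lua
-- Build start and end timestamps for one calendar day (local time).
local function day_window(year, month, day)
  local start_time = os.time{ year = year, month = month, day = day, hour = 0 }
  -- os.time normalises out-of-range fields, so day + 1 rolls over
  -- the end of a month or of a year.
  local end_time = os.time{ year = year, month = month, day = day + 1, hour = 0 }
  return start_time, end_time
end

local stime, etime = day_window(2006, 5, 30)
print(os.date("%Y-%m-%d %H:%M", stime), os.date("%Y-%m-%d %H:%M", etime))
```

The difference etime - stime is normally 86400 seconds, but it can be one hour shorter or longer on a day when daylight saving time changes, which matters if the expected number of samples is derived from the window length.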
Example of LUA script that performs the SLA calculation

    -- sample
    if (lp.IsDebugMode()==1) then
        lib,init=lp.LoadLibrary(lp.GetPath().."/lua_lp_slad.dll","libinit");
    else
        lib,init=lp.LoadLibrary(lp.GetPath().."/lua_lp_sla.dll","libinit");
    end
    if (lib) then
        init();
        id="1002";
        -- list every LoriotPro ID that has SLA data
        k=lpsla.GetLoriotProIDList("a");
        for l=0,k-1 do
            lp.Print(a[l]," LoriotPro ID \n");
            -- list every host with SLA data for this ID
            i=lpsla.GetSLAList(a[l],"aa");
            if i then
                for j=0,i-1 do
                    lp.Print("\t",aa[j]," SLA \n");
                    --Compute('id' ,'sla_rep', Syear,Smonth, Sday, Eyear, Emonth, Eday, STime, ETime, RTT_Threshold, Avaibility, Performance, 'array')
                    if lpsla.Compute(a[l],aa[j],2005,5,1,2006,6,30,os.time{year=2005,month=5,day=1,hour=0},os.time{year=2006,month=6,day=30,hour=0},50,90,90,'array') then
                        lp.Print("\t\tip : ",array.ip,"\n");
                        lp.Print("\t\tname : ",array.name,"\n");
                        lp.Print("\t\tpolling_type : ",array.polling_type,"\n");
                        lp.Print("\t\tperiode : ",array.periode,"%\n");
                        lp.Print("\t\tavaibility : ",array.avaibility,"%\n");
                        lp.Print("\t\tperformance : ",array.performance,"%\n");
                        lp.Print("\t\tgood_polling : ",array.good_polling,"\n");
                        lp.Print("\t\ttotal_collected : ",array.total_collected,"\n");
                        lp.Print("\t\ttotal_waited : ",array.total_waited,"\n");
                    end
                end
            end
        end
    end

To run this script on a set of hosts, use the Host Bulk configuration Plugin.
In the following example, 3 hosts are not set for SLA collection. The host 127.0.0.1 is correctly set up, but the number of expected samples (total_waited) is greater than the number of samples actually collected. There was no loss in the collection, however; the difference can be explained by a change (decrease) in the polling interval over this time range.

Script used

    -- Display SLA for DAY
    --
    -- To run correctly this file is located to bin/config/script
    -- Input values
    -- lp_index index for this script ".1"
    -- lp_oid SNMP OID for this script "ifnumber"
    -- lp_host default ip address for this script "127.0.0.1"
    -- Output Values
    lp_value = 0;
    lp_buffer ="error";
    -- use this to initialise the host selection
    dofile(lp.GetPath().."/config/script/bulk/selection/LP_Selection.lua")
    dofile(lp.GetPath().."/config/script/lib-audit/1-audit.lua");
    -----------------------------------------------------------------------------------------------
    -- Start program
    -----------------------------------------------------------------------------------------------
    --list the ip host to scan
    tabz={};
    hostnumber=LP_HostsSelection(tabz);
    if hostnumber==0 then error("Not host selected\n") end
    if (lp.IsDebugMode()==1) then
        lib,init=lp.LoadLibrary(lp.GetPath().."/lua_lp_slad.dll","libinit");
    else
        lib,init=lp.LoadLibrary(lp.GetPath().."/lua_lp_sla.dll","libinit");
    end
    if (lib==nil) then error("SLA Lib Not found or not loaded\n") end;
    init();
    lp.Print("Display SLA for day\n");
    temp=os.date("*t",os.time());
    --[[ temp.year temp.month temp.day temp.hour temp.min --]]
    lp.Print(string.format("\tyear %i month %i day %i\n",temp.year,temp.month,temp.day));
    for i=0,table.getn(tabz) do
        info={};
        rep=lp.GetIPInformation(tabz[i],"array");
        if rep then
            if array.sla==1 then
                lp.Print(string.format("------------------------------------------\nHost %s\nIP add : [%s]\n\n",array.name,tabz[i]));
                if lpsla.Compute(100001,tabz[i],temp.year,temp.month,temp.day,temp.year,temp.month,temp.day
                    ,os.time{year=temp.year,month=temp.month,day=temp.day,hour=0}
                    ,os.time{year=temp.year,month=temp.month,day=temp.day,hour=0}
                    ,50,90,90,'array') then
                    lp.Print("\t\tIP : ",array.ip,"\n");
                    lp.Print("\t\tName : ",array.name,"\n");
                    lp.Print("\t\tpolling_type : ",array.polling_type,"\n");
                    lp.Print("\t\tCollect for periode : ",array.periode,"%\n");
                    lp.Print("\t\t\tAvaibility : ",array.avaibility,"%\n");
                    lp.Print("\t\t\tPerformance : ",array.performance,"%\n");
                    lp.Print("\t\tGood_polling : ",array.good_polling,"\n");
                    lp.Print("\t\tTotal_collected : ",array.total_collected,"\n");
                    lp.Print("\t\tTotal_waited : ",array.total_waited,"\n");
                end
            else
                -- host known to LoriotPro but SLA collection not activated
                lp.Print("SLA no collected for host : ",array.ip," \n");
            end
        end
    end
    lp.Print("Scan Ended\n");
    lp_buffer ="ok";
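The remark made before this script, that total_waited can exceed the number of samples collected when the polling interval changes, is easier to see with a small calculation. The sketch below is one plausible way to derive such indicators from parsed samples; it is not the algorithm of lpsla.Compute, which the source does not describe. The function name sla_indicators, the formulas and the sample values are all assumptions; only the field names total_waited, total_collected and good_polling are borrowed from the scripts above.

```lua
-- Sketch only: hypothetical indicators computed from parsed samples.
-- samples: list of tables; response_ms is set when the host answered.
local function sla_indicators(samples, period_seconds, interval_seconds, rtt_threshold_ms)
  local collected, good, fast = 0, 0, 0
  for _, sample in ipairs(samples) do
    collected = collected + 1
    if sample.response_ms then
      good = good + 1
      if sample.response_ms <= rtt_threshold_ms then fast = fast + 1 end
    end
  end
  -- Expected number of samples if the interval had been constant over the period.
  local waited = math.floor(period_seconds / interval_seconds)
  local function percent(part, whole)
    if whole == 0 then return 0 end
    return 100 * part / whole
  end
  return {
    total_waited = waited,
    total_collected = collected,
    good_polling = good,
    availability = percent(good, collected),   -- answered polls / collected polls
    performance = percent(fast, collected),    -- polls within threshold / collected polls
    collection = percent(collected, waited),   -- collected polls / expected polls
  }
end

-- Invented sample data: three answers and one unanswered poll in an 80 s window.
local samples = { { response_ms = 12 }, { response_ms = 80 }, { response_ms = 20 }, {} }
local result = sla_indicators(samples, 80, 16, 50)
print(result.total_waited, result.total_collected, result.good_polling)
print(result.availability, result.performance, result.collection)
```

With a 16-second interval, 5 samples are expected over 80 seconds; if part of the window was polled at a longer interval, fewer samples are collected even though nothing was lost, which is the situation described for host 127.0.0.1.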