Nagios Bug and Feature Tracker

ID 0000184
Category [Nagios Core] Check Scheduling
Severity major
Reproducibility always
Date Submitted 2010-12-20 12:55
Last Update 2014-03-14 14:57
Reporter rodneyra
View Status public
Assigned To ageric
Priority normal
Resolution fixed
Status closed
Product Version
Summary 0000184: freshness_threshold problem in host checks
Description We realised that the automatically calculated freshness threshold does not use the retry_interval in host checks when freshness_threshold=0 and our collector goes down. It always uses the check_interval, so a host can take a long time to change to a HARD state.

On the other hand, service checks work as expected: they use the check_interval when the state type is HARD or the current state is OK, and the retry_interval in all other cases.

We can see this clearly when we look inside the base/checks.c source code:

[...]
/* tests whether or not a service's check results are fresh */
int is_service_result_fresh(service *temp_service, time_t current_time, int log_this){
[...]
  /* use user-supplied freshness threshold or auto-calculate a freshness threshold to use? */
  if(temp_service->freshness_threshold==0){
     if(temp_service->state_type==HARD_STATE || temp_service->current_state==STATE_OK)
        freshness_threshold=(temp_service->check_interval*interval_length)+temp_service->latency+additional_freshness_latency;
     else
        freshness_threshold=(temp_service->retry_interval*interval_length)+temp_service->latency+additional_freshness_latency;
     }
  else
     freshness_threshold=temp_service->freshness_threshold;
[...]
/* checks to see if a hosts's check results are fresh */
int is_host_result_fresh(host *temp_host, time_t current_time, int log_this){
[...]
  /* use user-supplied freshness threshold or auto-calculate a freshness threshold to use? */
  if(temp_host->freshness_threshold==0)
     freshness_threshold=(temp_host->check_interval*interval_length)+temp_host->latency+additional_freshness_latency;
  else
     freshness_threshold=temp_host->freshness_threshold;
[...]
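The difference between the two excerpts can be reproduced outside Nagios with a minimal sketch. The `check_object` struct and both function names below are hypothetical stand-ins, not the real Nagios types, and the values 60 for `interval_length` and 15 for `additional_freshness_latency` are assumed here rather than taken from this report.

```c
/* Simplified model of the pre-patch auto-calculation in base/checks.c.
   check_object is a hypothetical stand-in holding only the fields used. */
#define STATE_OK   0
#define SOFT_STATE 0
#define HARD_STATE 1

typedef struct {
    int    freshness_threshold; /* 0 means "auto-calculate" */
    double check_interval;      /* in units of interval_length */
    double retry_interval;      /* in units of interval_length */
    double latency;             /* seconds */
    int    state_type;          /* SOFT_STATE or HARD_STATE */
    int    current_state;       /* STATE_OK or a problem state */
} check_object;

/* Assumed values, not confirmed by this report. */
static const int interval_length = 60;
static const int additional_freshness_latency = 15;

/* Mirrors the service excerpt: check_interval for HARD or OK, retry_interval otherwise. */
static int service_auto_threshold(const check_object *s)
{
    double interval;
    if (s->state_type == HARD_STATE || s->current_state == STATE_OK)
        interval = s->check_interval;
    else
        interval = s->retry_interval;
    return (int)(interval * interval_length + s->latency + additional_freshness_latency);
}

/* Mirrors the original host excerpt: check_interval regardless of state. */
static int host_auto_threshold_original(const check_object *h)
{
    return (int)(h->check_interval * interval_length + h->latency + additional_freshness_latency);
}
```

For an object in a SOFT problem state with check_interval=5 and retry_interval=1, the service formula yields a much shorter threshold than the original host formula, which is the delay described above.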
Additional Information I changed the following lines in checks.c:

========================================================
FROM: Lines 2439 - 2440

        if(temp_host->freshness_threshold==0)
                freshness_threshold=(temp_host->check_interval*interval_length)+temp_host->latency+additional_freshness_latency;

TO:

        if(temp_host->freshness_threshold==0){
                if(temp_host->state_type==HARD_STATE || temp_host->current_state==STATE_OK)
                        freshness_threshold=(temp_host->check_interval*interval_length)+temp_host->latency+additional_freshness_latency;
                else
                        freshness_threshold=(temp_host->retry_interval*interval_length)+temp_host->latency+additional_freshness_latency;
                }
========================================================

It is working well, as expected. My retry interval is 1 minute and hosts are taking about 2 minutes to change SOFT states.
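As a rough illustration of the effect with a 1-minute retry interval, the patched selection can be sketched as a standalone helper. The `host_model` struct and `host_threshold_patched` name are hypothetical, and the check_interval of 5, `interval_length` of 60, and `additional_freshness_latency` of 15 are assumptions, not values from my configuration dump or from the final upstream fix.

```c
/* Standalone sketch of the patched host threshold selection.
   host_model is a hypothetical stand-in for the real host struct. */
#define STATE_OK   0
#define SOFT_STATE 0
#define HARD_STATE 1

typedef struct {
    int    freshness_threshold; /* 0 means "auto-calculate" */
    double check_interval;      /* in units of interval_length */
    double retry_interval;      /* in units of interval_length */
    double latency;             /* seconds */
    int    state_type;          /* SOFT_STATE or HARD_STATE */
    int    current_state;       /* STATE_OK (up) or a problem state */
} host_model;

/* Assumed values, not confirmed by this report. */
static const int interval_length = 60;
static const int additional_freshness_latency = 15;

static int host_threshold_patched(const host_model *h)
{
    double interval;

    /* A user-supplied threshold always wins. */
    if (h->freshness_threshold != 0)
        return h->freshness_threshold;

    /* HARD or OK: normal check_interval; SOFT problem state: retry_interval. */
    if (h->state_type == HARD_STATE || h->current_state == STATE_OK)
        interval = h->check_interval;
    else
        interval = h->retry_interval;

    return (int)(interval * interval_length + h->latency + additional_freshness_latency);
}
```

Under these assumptions a host in a SOFT problem state gets a threshold of 75 seconds plus latency instead of 315 seconds plus latency, while HARD and OK hosts keep the check_interval-based value.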

I'm attaching the diff between the original checks.c file and the modified checks.c file.
Tags No tags attached.
Nagios Version 3.2.3
OS Linux
OS Version Red Hat Enterprise Linux Server release 5.2 (Tikanga)
Attached Files checks_diff.txt (703 bytes) 2010-12-20 12:55

Notes
(0000292)
ageric (reporter)
2011-05-10 09:49

commit a50a0c87ac72e481879a5acb67aa4d440f32ef4a
Author: Andreas Ericsson <ae@op5.se>
Date: Tue May 10 14:48:49 2011 +0000

    Use retry_interval for host freshness (re)checking
    
    Previously the normal check_interval was used regardless of the
    state of the host. This contradicted the documented behaviour
    and the behaviour of services, so fix hosts to work as people
    expect them to.
    
    This fixes issue 0000184, which suggests a slightly different patch.
    
    Reported-by: Rodney Ramos <rodneyra@gmail.com>
    Original-patch-by: Rodney Ramos <rodneyra@gmail.com>
    Signed-off-by: Andreas Ericsson <ae@op5.se>

Issue History
Date Modified Username Field Change
2010-12-20 12:55 rodneyra New Issue
2010-12-20 12:55 rodneyra File Added: checks_diff.txt
2010-12-20 12:55 rodneyra Nagios Version => 3.2.3
2010-12-20 12:55 rodneyra OS => Linux
2010-12-20 12:55 rodneyra OS Version => Red Hat Enterprise Linux Server release 5.2 (Tikanga)
2011-05-10 09:49 ageric Status new => assigned
2011-05-10 09:49 ageric Assigned To => ageric
2011-05-10 09:49 ageric Note Added: 0000292
2011-05-10 09:49 ageric Status assigned => resolved
2011-05-10 09:49 ageric Resolution open => fixed
2014-03-14 14:57 estanley Status resolved => closed

