|Viewing Issue Simple Details|
|ID||Category||Severity||Reproducibility||Date Submitted||Last Update|
|0000343||[Nagios Core] Web Interface||minor||always||2012-06-04 10:53||2012-06-14 16:46|
|Summary||0000343: Service Groups / Summary page displays wrong totals under "Service Status Summary"|
When viewing the "Service Groups" / "Summary" page (titled "Status Summary For All Service Groups"), the counts listed under "Host Status Summary" are correct, but those under "Service Status Summary" are not. For example, a box may show "20 OK", but clicking on it lists only 10 services (with "Results 1 - 10 of 10 Matching Services" displayed at the bottom).
This appears to be very similar to http://tracker.nagios.org/view.php?id=72 - just in a different portion of the web UI.
This appears to be a bug in status.cgi, in the show_servicegroup_service_totals_summary function. This function is almost a clone of show_servicegroup_host_totals_summary, dealing with services instead of hosts. (This seems like a good opportunity for some refactoring.) The host function includes a check for "skip this if it isn't a new host", but no equivalent check exists in the service function. Patching the service function with the same check appears to fix the issue without causing any additional issues. I have not yet examined the code thoroughly enough to properly explain why this works, but I'm assuming it relies on the list of members always coming back in sorted order.
The patch I am using is attached.
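The idea behind the check can be sketched outside of Nagios. The following Python is only an illustration, not the code in status.cgi: the function name, the tuple layout, and the sample member list are all hypothetical. It shows a "skip if same as the previous entry" test, and why such a test would depend on duplicate members being adjacent (i.e. on sorted order), which is the assumption stated above.

```python
from collections import Counter

def count_statuses(members, skip_repeats):
    """Tally statuses over a member list.

    When skip_repeats is True, an entry identical to the one processed
    immediately before it is skipped (hypothetical analogue of the
    "skip this if it isn't a new host" check).
    """
    totals = Counter()
    last = None
    for member in members:
        if skip_repeats and member == last:
            continue  # same entry as the previous one: do not count again
        last = member
        totals[member[2]] += 1
    return totals

# Hypothetical member list: (host, service description, status).
# Each service appears twice, and duplicates are adjacent (sorted order).
members = [
    ("node1", "HTTP", "OK"),
    ("node1", "HTTP", "OK"),
    ("node1", "SSH", "OK"),
    ("node1", "SSH", "OK"),
]

unpatched = count_statuses(members, skip_repeats=False)
patched = count_statuses(members, skip_repeats=True)

# Same entries, but with the duplicates no longer adjacent.
interleaved = [members[0], members[2], members[1], members[3]]
patched_unsorted = count_statuses(interleaved, skip_repeats=True)

print(unpatched["OK"], patched["OK"], patched_unsorted["OK"])
```

With adjacent duplicates the check halves the inflated count; with the interleaved list it has no effect, which is why the sorted-order assumption matters.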
|Tags||No tags attached.|
status-show_servicegroup_service_totals_summary.patch (785 bytes) 2012-06-04 10:53
serviceCountTest.cfg (3,236 bytes) 2012-06-14 09:44
|I've been trying to reproduce this issue, but so far I have not been able to...|
Please review the attached serviceCountTest.cfg, which can be used to reproduce the issue. This is a Nagios configuration file that is as minimal as I can make it. As long as you have a "linux-server" host template, a "remote-service" service template, and a "check-host-alive" check command (defined in the default "template" configuration files) - even if they don't do anything - you should easily be able to include this file in a test instance using cfg_file.
This file contains a total of 8 services. It mimics 2 production environments, each with an application tier and a database tier, and 2 nodes per tier. There is one top-level service group so that the entire environment can easily have its notifications disabled, downtime scheduled, etc. - assuming there are multiple other similar configurations on the same Nagios instance.
An unpatched Nagios instance currently displays the following (incorrectly), showing both the host status summary count and the service status summary count:
|Service Group||Host Status Summary||Service Status Summary|
|serviceCountTest-prod (serviceCountTest-prod)||1 PENDING||16 PENDING|
|serviceCountTest-prod1 (serviceCountTest-prod1)||1 PENDING||8 PENDING|
|serviceCountTest-prod1-app (serviceCountTest-prod1-app)||1 PENDING||2 PENDING|
|serviceCountTest-prod1-db (serviceCountTest-prod1-db)||1 PENDING||2 PENDING|
|serviceCountTest-prod2 (serviceCountTest-prod2)||1 PENDING||8 PENDING|
|serviceCountTest-prod2-app (serviceCountTest-prod2-app)||1 PENDING||2 PENDING|
|serviceCountTest-prod2-db (serviceCountTest-prod2-db)||1 PENDING||2 PENDING|
Note that there are 3 incorrect service counts here. There are only a total of 8 services defined, so serviceCountTest-prod should be 8, not 16. Also, both serviceCountTest-prod1 and serviceCountTest-prod2 should be 4, not 8. The supplied patch fixes each of these.
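The expected totals can be checked with a little arithmetic. The short Python sketch below derives the expected service count per group from the layout described above (2 environments, 2 tiers each, 2 nodes per tier, which implies one service per node) and compares it with the counts reported by the unpatched instance. The constant and variable names are made up for this illustration; the group names and reported numbers are taken from the listing above.

```python
# Layout described in this note: 2 environments x 2 tiers x 2 nodes = 8 services.
ENVIRONMENTS, TIERS, NODES_PER_TIER = 2, 2, 2

# Expected service count per service group, derived from the layout.
expected = {"serviceCountTest-prod": ENVIRONMENTS * TIERS * NODES_PER_TIER}
for env in ("prod1", "prod2"):
    expected[f"serviceCountTest-{env}"] = TIERS * NODES_PER_TIER
    for tier in ("app", "db"):
        expected[f"serviceCountTest-{env}-{tier}"] = NODES_PER_TIER

# Service counts displayed by the unpatched instance (from the listing above).
reported = {
    "serviceCountTest-prod": 16,
    "serviceCountTest-prod1": 8,
    "serviceCountTest-prod1-app": 2,
    "serviceCountTest-prod1-db": 2,
    "serviceCountTest-prod2": 8,
    "serviceCountTest-prod2-app": 2,
    "serviceCountTest-prod2-db": 2,
}

# Groups whose displayed count differs from the expected one.
wrong = {
    group: (reported[group], expected[group])
    for group in expected
    if reported[group] != expected[group]
}

for group, (shown, should_be) in wrong.items():
    print(f"{group}: shown {shown}, expected {should_be}")
```

Only the three groups that contain other groups are affected, and each shows exactly double the expected count; the leaf groups are already correct.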
Note also that if I repeated the same configuration to exercise the host groups instead of the service groups, the host counts would be correct. This is because show_servicegroup_host_totals_summary already contains the logic contained within the patch, which is missing from show_servicegroup_service_totals_summary.
Please advise if there are any other details I can provide to assist in resolving this issue.
Although I wasn't able to recreate this particular bug, it looks like this bug can occur under certain circumstances. The code looks good, and I couldn't break anything with the patch applied.
Committed to SVN trunk for 3.4.2 release.
|2012-06-04 10:53||ziesemer||New Issue|
|2012-06-04 10:53||ziesemer||File Added: status-show_servicegroup_service_totals_summary.patch|
|2012-06-04 10:53||ziesemer||Nagios Version||=> 3.4.1|
|2012-06-04 10:53||ziesemer||OS||=> RHEL|
|2012-06-04 10:53||ziesemer||OS Version||=> 6|
|2012-06-04 10:53||ziesemer||Issue Monitored: ziesemer|
|2012-06-12 17:24||mguthrie||Note Added: 0000483|
|2012-06-14 09:44||ziesemer||File Added: serviceCountTest.cfg|
|2012-06-14 09:57||ziesemer||Note Added: 0000487|
|2012-06-14 16:46||mguthrie||Note Added: 0000488|
|2012-06-14 16:46||mguthrie||Status||new => resolved|
|2012-06-14 16:46||mguthrie||Resolution||open => fixed|
|2012-06-14 16:46||mguthrie||Assigned To||=> mguthrie|