Nagios Bug and Feature Tracker

ID 0000343
Category [Nagios Core] Web Interface
Severity minor
Reproducibility always
Date Submitted 2012-06-04 10:53
Last Update 2012-06-14 16:46
Reporter ziesemer
View Status public
Assigned To mguthrie
Priority normal
Resolution fixed
Status resolved
Product Version
Summary 0000343: Service Groups / Summary page displays wrong totals under "Service Status Summary"
Description When viewing the "Service Groups" / "Summary" page (titled "Status Summary For All Service Groups"), the counts listed under "Host Status Summary" are correct, but those under "Service Status Summary" are not. For example, a box may show "20 OK", but clicking on it lists only 10 services (with "Results 1 - 10 of 10 Matching Services" displayed at the bottom).

This appears to be very similar to http://tracker.nagios.org/view.php?id=72 - just in a different portion of the web UI.
Additional Information This appears to be a bug in status.cgi, in the show_servicegroup_service_totals_summary function. This function is almost a clone of show_servicegroup_host_totals_summary - just dealing with services instead of hosts. (Seems like a good opportunity for some re-factoring here...) The host function includes a check for "skip this if it isn't a new host" - but this check doesn't exist in the service function. Patching the service function with the same check appears to fix the issue, without causing any additional issues. I've not examined the code thoroughly enough yet to properly explain why this works - but I'm assuming it relies on the list of members always coming back in a sorted order.

The patch I am using is attached.
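
To illustrate the "skip this if it isn't a new one" check described above, here is a minimal Python sketch. It is not the status.cgi code (which is C) and not the attached patch; the function name, the tuple layout, and the sample data are all hypothetical. It only shows why comparing each member with the previous one removes repeats, and why that depends on the list being sorted.

```python
from collections import Counter


def count_service_states(members, skip_repeats=True):
    """Tally service states over a member list.

    members: iterable of (host, service, state) tuples (hypothetical layout).
    skip_repeats: when True, skip a member identical to the previous one,
    mirroring the "isn't a new host" style of check described in the report.
    """
    totals = Counter()
    last = None
    for host, service, state in members:
        key = (host, service)
        if skip_repeats and key == last:
            continue  # same service as the previous entry: already counted
        last = key
        totals[state] += 1
    return totals


# Hypothetical member list in which every service is listed twice.
members = sorted([
    ("node1", "app", "OK"), ("node1", "app", "OK"),
    ("node2", "app", "OK"), ("node2", "app", "OK"),
])

with_check = count_service_states(members)["OK"]
without_check = count_service_states(members, skip_repeats=False)["OK"]

# The check only compares neighbours, so an unsorted list defeats it.
unsorted_members = [("node1", "app", "OK"), ("node2", "app", "OK"), ("node1", "app", "OK")]
unsorted_count = count_service_states(unsorted_members)["OK"]
```

With the check enabled the sorted list yields 2 services, and without it 4. The unsorted list still yields 3 even with the check, which is why the approach would rely on members coming back in sorted order.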
Tags No tags attached.
Nagios Version 3.4.1
OS RHEL
OS Version 6
Attached Files status-show_servicegroup_service_totals_summary.patch (785 bytes) 2012-06-04 10:53
serviceCountTest.cfg (3,236 bytes) 2012-06-14 09:44

-  Notes
(0000483)
mguthrie (administrator)
2012-06-12 17:24

I've been trying to reproduce this issue, but so far I have not been able to...
(0000487)
ziesemer (reporter)
2012-06-14 09:57

Please review the attached serviceCountTest.cfg, which can be used to reproduce the issue. This is a Nagios configuration file that is as minimal as I can make it. As long as you have a "linux-server" host template, a "remote-service" service template, and a "check-host-alive" check command (defined in the default "template" configuration files), even if they don't do anything, you should easily be able to include this file in a test instance using cfg_file.

This file contains a total of 8 services. This mimics 2 production environments, each with an application tier and a database tier of 2 nodes each. There is one top-level service group to easily allow the entire environment to have its notifications disabled, downtime scheduled, etc. - assuming there are multiple other similar configurations on the same Nagios instance.

An unpatched Nagios instance currently displays this as follows (incorrectly), showing the host status summary count followed by the service status summary count:

serviceCountTest-prod (serviceCountTest-prod) 1 PENDING 16 PENDING
serviceCountTest-prod1 (serviceCountTest-prod1) 1 PENDING 8 PENDING
serviceCountTest-prod1-app (serviceCountTest-prod1-app) 1 PENDING 2 PENDING
serviceCountTest-prod1-db (serviceCountTest-prod1-db) 1 PENDING 2 PENDING
serviceCountTest-prod2 (serviceCountTest-prod2) 1 PENDING 8 PENDING
serviceCountTest-prod2-app (serviceCountTest-prod2-app) 1 PENDING 2 PENDING
serviceCountTest-prod2-db (serviceCountTest-prod2-db) 1 PENDING 2 PENDING

Note that there are 3 incorrect service counts here. There are only a total of 8 services defined, so serviceCountTest-prod should be 8, not 16. Also, both serviceCountTest-prod1 and serviceCountTest-prod2 should be 4, not 8. The supplied patch fixes each of these.
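
The expected counts can be checked with a small sketch. This is plain Python arithmetic, not Nagios code. The group names and reported counts are taken from the listing above; the parent/child nesting is an assumption inferred from the group names and the description of the test file, and the dictionary and function names are hypothetical.

```python
# Leaf service groups and their service counts (2 nodes per tier).
LEAF_SERVICES = {
    "serviceCountTest-prod1-app": 2,
    "serviceCountTest-prod1-db": 2,
    "serviceCountTest-prod2-app": 2,
    "serviceCountTest-prod2-db": 2,
}

# Assumed nesting: parent group -> child groups (inferred from the names).
CHILDREN = {
    "serviceCountTest-prod": ["serviceCountTest-prod1", "serviceCountTest-prod2"],
    "serviceCountTest-prod1": ["serviceCountTest-prod1-app", "serviceCountTest-prod1-db"],
    "serviceCountTest-prod2": ["serviceCountTest-prod2-app", "serviceCountTest-prod2-db"],
}

# Service counts shown by the unpatched instance, copied from the listing.
REPORTED = {
    "serviceCountTest-prod": 16,
    "serviceCountTest-prod1": 8,
    "serviceCountTest-prod1-app": 2,
    "serviceCountTest-prod1-db": 2,
    "serviceCountTest-prod2": 8,
    "serviceCountTest-prod2-app": 2,
    "serviceCountTest-prod2-db": 2,
}


def expected_count(group):
    """Number of distinct services in a group, counting each service once."""
    if group in LEAF_SERVICES:
        return LEAF_SERVICES[group]
    return sum(expected_count(child) for child in CHILDREN[group])


def mismatches():
    """Map each wrongly reported group to (reported, expected)."""
    return {
        group: (reported, expected_count(group))
        for group, reported in REPORTED.items()
        if reported != expected_count(group)
    }


if __name__ == "__main__":
    for group, (reported, expected) in mismatches().items():
        print(f"{group}: reported {reported}, expected {expected}")
```

Under that assumed nesting, only the three non-leaf groups differ from their expected values, and each reported count is exactly double the expected one.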

Note also that if I repeated the same configuration to exercise the host groups instead of the service groups, the host counts would be correct. This is because show_servicegroup_host_totals_summary already contains the logic contained within the patch, which is missing from show_servicegroup_service_totals_summary.

Please advise if there are any other details I can provide to assist in resolving this issue.

Thanks!
(0000488)
mguthrie (administrator)
2012-06-14 16:46

Although I wasn't able to recreate this particular bug, it looks like this bug can occur under certain circumstances. The code looks good, and I couldn't break anything with the patch applied.

Committed to SVN trunk for 3.4.2 release.

- Issue History
Date Modified Username Field Change
2012-06-04 10:53 ziesemer New Issue
2012-06-04 10:53 ziesemer File Added: status-show_servicegroup_service_totals_summary.patch
2012-06-04 10:53 ziesemer Nagios Version => 3.4.1
2012-06-04 10:53 ziesemer OS => RHEL
2012-06-04 10:53 ziesemer OS Version => 6
2012-06-04 10:53 ziesemer Issue Monitored: ziesemer
2012-06-12 17:24 mguthrie Note Added: 0000483
2012-06-14 09:44 ziesemer File Added: serviceCountTest.cfg
2012-06-14 09:57 ziesemer Note Added: 0000487
2012-06-14 16:46 mguthrie Note Added: 0000488
2012-06-14 16:46 mguthrie Status new => resolved
2012-06-14 16:46 mguthrie Resolution open => fixed
2012-06-14 16:46 mguthrie Assigned To => mguthrie

