Nagios Bug and Feature Tracker

ID       Category                          Severity  Reproducibility  Date Submitted    Last Update
0000308  [Nagios Plugins] Other / Unknown  major     always           2012-03-25 16:29  2012-09-18 09:12

Reporter calestyo
View Status public
Assigned To
Priority normal
Resolution open
Status closed
Summary 0000308: check_by_ssh doesn't work with SSH ControlMaster auto spawning
Description Hi.

I recently investigated whether NRPE can be fully replaced by SSH.
"Fully", of course, with respect to both security (well, NRPE has absolutely no security at all) and performance (in terms of speed).
It turns out that today's SSH, with the right configuration, can actually be as fast as or even faster than NRPE.
I'll post more on this later on the nagios mailing list.

Essential for speeding SSH up is the ControlMaster functionality, where a first SSH connection is multiplexed and reused by subsequent ones.

There is a feature (enabled by setting ControlMaster=auto AND(!!!) ControlPersist) that spawns the control master connection automatically.

This, however, does not work well with the check_by_ssh check.


The FIRST time (i.e. when the ControlMaster connection is created) the plugin always runs into the timeout and returns a critical error.
On subsequent runs it works (as the control master connection already exists).

I guess the reason is that check_by_ssh waits for all created/forked processes to exit.
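
To illustrate the kind of blocking guessed at above, here is a minimal sketch that does not involve ssh or check_by_ssh at all. It assumes (this is not verified against the plugin's source) that the plugin reads the child's output until end-of-file, and that the auto-spawned control master inherits that output pipe. The background `sleep` stands in for the persistent master, and `timed_capture` is a hypothetical helper name.

```shell
#!/bin/sh
# Run a command under command substitution (which reads the child's stdout
# until EOF) and report how many whole seconds the capture took.
timed_capture() {
    start=$(date +%s)
    out=$(sh -c "$1")
    end=$(date +%s)
    elapsed=$((end - start))
    echo "output='$out' elapsed=${elapsed}s"
}

# The background sleep inherits stdout, so the pipe stays open for about
# 3 seconds even though "echo OK" finishes immediately.
timed_capture 'sleep 3 & echo OK'
held=$elapsed

# Same child, but the background process is detached from the pipe,
# so the capture returns as soon as "echo OK" is done.
timed_capture 'sleep 3 >/dev/null 2>&1 & echo OK'
released=$elapsed
```

If the assumption holds, a control master that persists indefinitely would keep the reader waiting until the plugin's own timeout fires, which would match the behaviour on the first invocation.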


So, how to reproduce:
- I'm using SSH keys to connect to the remote host (I guess anyone can set this up themselves).
- The ~/.ssh/config looks roughly like this:
Host *
  ControlPath ~/.ssh/master-%l-%r@%h:%p
  ControlMaster auto
  ControlPersist yes

The command is:
/usr/lib/nagios/plugins/check_by_ssh -q -o RequestTTY=yes -H host.example.org -l nagtest -C "/usr/lib/nagios/plugins/check_load -w 15,10,5 -c 30,25,20"

On the first invocation one gets:
CRITICAL - Plugin timed out after 10 seconds

Subsequent ones work:
OK - load average: 0.00, 0.01, 0.00|load1=0.000;15.000;30.000;0; load5=0.010;10.000;25.000;0; load15=0.000;5.000;20.000;0;


A note:
The control master and its auto spawning do NOT make ssh block, i.e. if you call it manually:
# ssh -tq nagtest@host.example.org /usr/lib/nagios/plugins/check_load -w 15,10,5 -c 30,25,20
OK - load average: 0.00, 0.00, 0.00|load1=0.000;15.000;30.000;0; load5=0.000;10.000;25.000;0; load15=0.000;5.000;20.000;0;
root@lcg-lrz-monitoring:~#
it returns, while the control master stays:
# ps ax | grep mux
18300 ? Ss 0:00 ssh: /root/.ssh/master-lcg-lrz-monitoring-nagtest@lcg-lrz-dc20.grid.lrz-muenchen.de:22 [mux]
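
Since checks succeed once the master exists, one possible workaround (not something proposed in this report, and untested against check_by_ssh) might be to make sure the master is running before the check starts. The sketch below relies on the ControlPath from the ~/.ssh/config shown earlier and uses the standard OpenSSH options `-O check`, `-M`, `-N` and `-f`. The function name `ensure_master` and the `SSH_BIN` override are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: make sure a control master exists before a check runs.
# SSH_BIN is an illustrative override so the function can be tried with a stub.
SSH_BIN=${SSH_BIN:-ssh}

ensure_master() {
    target=$1   # e.g. nagtest@host.example.org

    # "-O check" asks the master found via ControlPath whether it is alive.
    if "$SSH_BIN" -O check "$target" >/dev/null 2>&1; then
        return 0
    fi

    # No master yet: start one (-M master mode, -N no remote command,
    # -f go to background after authentication) and detach it from our
    # stdin/stdout/stderr so it does not keep any caller's pipes open.
    "$SSH_BIN" -M -N -f "$target" </dev/null >/dev/null 2>&1
}
```

It could be called as `ensure_master nagtest@host.example.org` from whatever runs before the first check, for example a boot-time or cron job of the monitoring user.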


Another note:
check_by_ssh's "-f" option doesn't help at all; in fact, it completely breaks creating the control master.


Cheers,
Chris.

-  Notes
(0000495)
calestyo (reporter)
2012-07-01 15:14

This one is actually a bug in Nagios/Icinga and has been moved to:
http://tracker.nagios.org/view.php?id=321
respectively:
https://dev.icinga.org/issues/2546

Therefore, please close the issue.

- Issue History
Date Modified Username Field Change
2012-03-25 16:29 calestyo New Issue
2012-07-01 15:14 calestyo Note Added: 0000495
2012-09-18 09:12 ageric Status new => closed

