Docker - check_docker not working since upgrade

Posted: Tue Jun 18, 2019 10:41 am
by NMFSTeam
We upgraded from Nagios XI 5.5.x to 5.6.2 (and now to 5.6.3). Ever since the upgrade to 5.6.x, the Docker checks have stopped working and fail with the following error:

(No output on stdout) stderr: Traceback (most recent call last):
  File "/usr/local/nagios/libexec/check_docker.py", line 988, in <module>
    _ = main()
  File "/usr/local/nagios/libexec/check_docker.py", line 959, in main
    checks = choose_checks(options)
  File "/usr/local/nagios/libexec/check_docker.py", line 226, in choose_checks
    check_data = get_threshold_maps(options.warning, options.critical, selection)
  File "/usr/local/nagios/libexec/check_docker.py", line 329, in get_threshold_maps
    for triplet in zip(attrs.keys(), warning_list, critical_list):
AttributeError: 'list' object has no attribute 'keys'
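
For anyone hitting the same traceback: the last frame shows .keys() being called on a value that is a list rather than a dict. Below is a minimal Python sketch of that failure mode and one defensive way around it. It is not the plugin's actual code; pair_thresholds is a hypothetical helper, and the thresholds 50 and 75 are only illustrative values.

```python
def pair_thresholds(attrs, warning_list, critical_list):
    """Zip attribute names with their warning and critical thresholds.

    Hypothetical helper (not from check_docker.py): accepts either a dict,
    whose keys are used, or a plain list of attribute names.
    """
    names = list(attrs.keys()) if isinstance(attrs, dict) else list(attrs)
    return list(zip(names, warning_list, critical_list))


# The failing pattern from the traceback: calling .keys() on a list.
try:
    list(zip(["total_usage"].keys(), [50], [75]))
except AttributeError as err:
    error_message = str(err)

print(error_message)
print(pair_thresholds(["total_usage"], [50], [75]))
print(pair_thresholds({"total_usage": []}, [50], [75]))
```

Whether the plugin should receive a dict or a list at that point is something only the plugin source can answer; the sketch just shows why the two types behave differently.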

Re: Docker - check_docker not working since upgrade

Posted: Tue Jun 18, 2019 11:40 am
by mcapra

Re: Docker - check_docker not working since upgrade

Posted: Tue Jun 18, 2019 11:50 am
by npolovenko
@mblower, please download the patched version of the plugin from another thread and let us know if it fixes the issue.

Re: Docker - check_docker not working since upgrade

Posted: Wed Jun 19, 2019 9:43 am
by NMFSTeam
I have downloaded the plugin and gotten it to work, but one of the checks doesn't appear to be functioning properly.

The check "Docker - Containers Are Running" is showing a warning, even though at least 4 containers are in fact running on that particular server.

WARNING: 0 running, 22 not running, containers not running: ['/ol7_app1_test_service.1.pr8yh5i4x2n6jbqjqz7xz1hdu', '/ol7_app1_dev_service.1.xcqd74scxrcwdo9owvyv0xpus', '/ol7_app2_test_service.1.ip71a12mzkluamne7qtjs7qlv', '/ol7_app2_dev_service.1.p1tf4gpbd8

Re: Docker - check_docker not working since upgrade

Posted: Wed Jun 19, 2019 11:30 am
by scottwilkerson
Can you add the --debug flag to the command, then look for the line beginning with curl?

You should be able to execute that curl command yourself and get the JSON returned from Docker.
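
If curl is not handy, the same request can be sketched in Python. This is only an illustration under a few assumptions: the endpoint path matches the one printed by --debug, the helper names (containers_url, parse_containers, fetch_containers) are made up for this example, and http://hostname:4243/ is the placeholder host used in this thread.

```python
import json
from urllib.request import urlopen


def containers_url(base_url):
    """Build the 'list all containers' URL for a Docker API endpoint."""
    return base_url.rstrip("/") + "/containers/json?all=1"


def parse_containers(json_text):
    """Return (name, state) pairs from a /containers/json response body."""
    pairs = []
    for container in json.loads(json_text):
        names = container.get("Names") or ["<unnamed>"]
        # Use .get so a record without a "State" key does not raise.
        pairs.append((names[0], container.get("State")))
    return pairs


def fetch_containers(base_url, timeout=10):
    """Fetch and parse the container list (needs network access to the host)."""
    with urlopen(containers_url(base_url), timeout=timeout) as response:
        return parse_containers(response.read().decode("utf-8"))


# Example against a live endpoint (placeholder host, not run here):
# for name, state in fetch_containers("http://hostname:4243/"):
#     print(name, state)
```

Printing the raw state strings this way makes it easy to see exactly what the check has to compare against.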

Re: Docker - check_docker not working since upgrade

Posted: Tue Jun 25, 2019 5:09 pm
by mbarislug
Hello,

We have the exact same issue as mblower. After replacing the check_docker.py file the traceback error is gone, but the result of the check is wrong.

For example:
I have one docker container running at the moment:
CONTAINER ID   IMAGE                   COMMAND        CREATED          STATUS          PORTS      NAMES
69527ab31b70   jupyterhub/jupyterhub   "jupyterhub"   14 minutes ago   Up 14 minutes   8000/tcp   competent_brahmagupta

However, the updated check_docker.py script reports that it is not running:
WARNING: 0 running, 40 not running, containers not running: ['/competent_brahmagupta', '/brave_turing', '/angry_kapitsa', '/zen_shtern', '/hungry_shannon', '/nervous_sanderson', '/wizardly_dijkstra', '/admiring_agnesi', '/boring_montalcini', '/elastic_men

Here is the output with the --debug flag:
[root@nagiosxi libexec]# /usr/local/nagios/libexec/check_docker.py -H http://hostname:4243/ --check-type 'containers_running' --debug
all
{'total_usage': []}
End selection + type
hit threshold_string_to_tuple
hit threshold_string_to_tuple
hit get_all_container_IDs
hit get_all_containers
hit talk_to_docker
curl 'http://hostname:4243/containers/json?&all=1' -g -f
hit do_check
hit check_containers_running
hit talk_to_docker
curl 'http://hostname:4243/containers/json?&all=true' -g -f
hit process_value
hit process_counter
hit check_all_values_against_thresholds
hit check_against_thresholds
hit check_against_threshold
hit check_against_threshold
hit nagios_exit
OK: 0 running, 40 not running | total_usage=0;50;75

If I manually run curl 'http://hostname:4243/containers/json?&all=1' -g -f, I do get data back, and visiting http://hostname:4243/containers/json?all=true shows the container's state as running.

Hope this helps.


EDIT-----Found the culprit for check_containers_running -- line 681 in check_docker.py needs to be if container["State"] == "running": instead of if container["State"] == "Running":
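
The change works because the state string coming back here is lowercase ("running"), so a comparison against "Running" never matches and every container is counted as not running. As a minimal sketch (not the plugin's code), a case-insensitive comparison sidesteps the problem either way. split_by_state is a hypothetical helper, and the sample records are made up in the shape of a /containers/json response, with the state of /brave_turing assumed for illustration.

```python
def split_by_state(containers):
    """Split container records into (running_names, not_running_names)."""
    running, not_running = [], []
    for container in containers:
        name = (container.get("Names") or ["<unnamed>"])[0]
        # Compare case-insensitively so "running" and "Running" both match.
        if str(container.get("State", "")).lower() == "running":
            running.append(name)
        else:
            not_running.append(name)
    return running, not_running


# Made-up sample data; only the record layout mirrors the API response.
sample = [
    {"Names": ["/competent_brahmagupta"], "State": "running"},
    {"Names": ["/brave_turing"], "State": "exited"},
]

running, not_running = split_by_state(sample)
print("%d running, %d not running" % (len(running), len(not_running)))
```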

Re: Docker - check_docker not working since upgrade

Posted: Wed Jun 26, 2019 1:33 pm
by lmiltchev
mbarislug wrote: EDIT-----Found the culprit for check_containers_running -- line 681 in check_docker.py needs to be if container["State"] == "running": instead of if container["State"] == "Running":
I am glad you were able to resolve the issue. Our developers are aware of the issue, and will be fixing it soon. Thank you!

Re: Docker - check_docker not working since upgrade

Posted: Mon Jul 01, 2019 5:07 pm
by NMFSTeam
After applying the second fix mentioned in this post, the check is working as expected. Thank you very much everyone!

Re: Docker - check_docker not working since upgrade

Posted: Tue Jul 02, 2019 7:10 am
by scottwilkerson
mblower wrote: After applying the second fix mentioned in this post, the check is working as expected. Thank you very much everyone!
Great!

Locking