Docker - check_docker not working since upgrade

NMFSTeam
Posts: 88
Joined: Thu Nov 12, 2015 9:01 am

Post by NMFSTeam »

We upgraded from Nagios XI 5.5.x to 5.6.2 (and now to 5.6.3). Ever since the upgrade to 5.6.x, the Docker checks no longer work and fail with this error:

(No output on stdout) stderr: Traceback (most recent call last):

File "/usr/local/nagios/libexec/check_docker.py", line 988, in <module>
_ = main()
File "/usr/local/nagios/libexec/check_docker.py", line 959, in main
checks = choose_checks(options)
File "/usr/local/nagios/libexec/check_docker.py", line 226, in choose_checks
check_data = get_threshold_maps(options.warning, options.critical, selection)
File "/usr/local/nagios/libexec/check_docker.py", line 329, in get_threshold_maps
for triplet in zip(attrs.keys(), warning_list, critical_list):
AttributeError: 'list' object has no attribute 'keys'
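
The last frame of the traceback calls .keys() on a value that turned out to be a list. As a minimal sketch of that failure mode (not the plugin's actual code; the helper names attribute_names and threshold_triplets are hypothetical), the following accepts either a dict or a list of attribute names before zipping them with the thresholds:

```python
def attribute_names(attrs):
    """Return attribute names whether attrs is a dict or a plain list."""
    if isinstance(attrs, dict):
        return list(attrs.keys())
    # A list has no .keys(); its items are assumed to already be the names.
    return list(attrs)


def threshold_triplets(attrs, warning_list, critical_list):
    """Pair each attribute name with its warning and critical threshold."""
    return list(zip(attribute_names(attrs), warning_list, critical_list))
```

With this guard, a dict such as {'total_usage': []} and a list such as ['total_usage'] produce the same triplets instead of one of them raising AttributeError.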
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Docker - check_docker not working since upgrade

Post by mcapra »

npolovenko
Support Tech
Posts: 3457
Joined: Mon May 15, 2017 5:00 pm

Re: Docker - check_docker not working since upgrade

Post by npolovenko »

@mblower, please download the patched version of the plugin from another thread and let us know if it fixes the issue.
NMFSTeam
Posts: 88
Joined: Thu Nov 12, 2015 9:01 am

Re: Docker - check_docker not working since upgrade

Post by NMFSTeam »

I have downloaded the plugin and gotten it to work, but one of the checks doesn't appear to be functioning properly.

The check "Docker - Containers Are Running" is showing a warning, even though at least 4 containers are in fact running on that particular server.

WARNING: 0 running, 22 not running, containers not running: ['/ol7_app1_test_service.1.pr8yh5i4x2n6jbqjqz7xz1hdu', '/ol7_app1_dev_service.1.xcqd74scxrcwdo9owvyv0xpus', '/ol7_app2_test_service.1.ip71a12mzkluamne7qtjs7qlv', '/ol7_app2_dev_service.1.p1tf4gpbd8
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Docker - check_docker not working since upgrade

Post by scottwilkerson »

Can you add the --debug flag to the command, then look for the line beginning with curl?

You should be able to execute that curl command and get the JSON returned from Docker.
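
For anyone who prefers to inspect that JSON from Python rather than curl, here is a rough sketch. It assumes the Docker API is reachable over plain HTTP and that each entry returned by /containers/json carries "Names" and "State" fields (older API versions may omit "State"). The function names fetch_containers and summarize_states are made up for this example.

```python
import json
import urllib.request


def fetch_containers(base_url):
    """GET /containers/json?all=1 from the Docker API and return the parsed list."""
    url = base_url.rstrip("/") + "/containers/json?all=1"
    with urllib.request.urlopen(url, timeout=10) as response:
        return json.loads(response.read().decode("utf-8"))


def summarize_states(containers):
    """Return (name, state) pairs so the reported states can be read at a glance."""
    summary = []
    for container in containers:
        # "Names" is a list such as ["/competent_brahmagupta"]; take the first entry.
        names = container.get("Names") or ["<unnamed>"]
        summary.append((names[0], container.get("State", "<missing>")))
    return summary
```

Passing the result of fetch_containers("http://hostname:4243") to summarize_states shows exactly which state string Docker reports for each container, which can then be compared with what the plugin expects.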
mbarislug
Posts: 4
Joined: Thu Oct 04, 2018 4:00 pm

Re: Docker - check_docker not working since upgrade

Post by mbarislug »

Hello,

We have the exact same issue as mblower. After replacing the check_docker.py file, the traceback no longer appears, but the check result is wrong.

For example:
I have one Docker container running at the moment:
CONTAINER ID IMAGE COMMAND CREATED STATUS PORTS NAMES
69527ab31b70 jupyterhub/jupyterhub "jupyterhub" 14 minutes ago Up 14 minutes 8000/tcp competent_brahmagupta

However, the updated check_docker.py script reports that it is not running:
WARNING: 0 running, 40 not running, containers not running: ['/competent_brahmagupta', '/brave_turing', '/angry_kapitsa', '/zen_shtern', '/hungry_shannon', '/nervous_sanderson', '/wizardly_dijkstra', '/admiring_agnesi', '/boring_montalcini', '/elastic_men

Here is the output with the --debug flag:
[root@nagiosxi libexec]# /usr/local/nagios/libexec/check_docker.py -H http://hostname:4243/ --check-type 'containers_running' --debug
all
{'total_usage': []}
End selection + type
hit threshold_string_to_tuple
hit threshold_string_to_tuple
hit get_all_container_IDs
hit get_all_containers
hit talk_to_docker
curl 'http://hostname:4243/containers/json?&all=1' -g -f
hit do_check
hit check_containers_running
hit talk_to_docker
curl 'http://hostname:4243/containers/json?&all=true' -g -f
hit process_value
hit process_counter
hit check_all_values_against_thresholds
hit check_against_thresholds
hit check_against_threshold
hit check_against_threshold
hit nagios_exit
OK: 0 running, 40 not running | total_usage=0;50;75

If I manually run curl 'http://hostname:4243/containers/json?&all=1' -g -f, I do indeed get data back, and visiting http://hostname:4243/containers/json?all=true shows the container in the running state.

Hope this helps.


EDIT-----Found the culprit for check_containers_running -- line 681 in check_docker.py needs to be if container["State"] == "running": instead of if container["State"] == "Running":
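
To illustrate why the capitalisation matters: the API reports the state in lowercase, so an exact comparison against "Running" never matches and every container is counted as not running. Below is a small sketch of a case-insensitive version of the comparison. It is not the plugin's code, and split_by_running is a hypothetical helper name.

```python
def split_by_running(containers):
    """Split container names into (running, not_running) lists.

    The state is lowercased before comparing, so "running" and "Running"
    are treated the same.
    """
    running, not_running = [], []
    for container in containers:
        state = str(container.get("State", "")).lower()
        name = (container.get("Names") or ["<unnamed>"])[0]
        if state == "running":
            running.append(name)
        else:
            not_running.append(name)
    return running, not_running
```

Changing the literal on line 681 as described above is the smaller fix; lowercasing first is only a defensive variant in case the capitalisation ever differs.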
lmiltchev
Former Nagios Staff
Posts: 13587
Joined: Mon May 23, 2011 12:15 pm

Re: Docker - check_docker not working since upgrade

Post by lmiltchev »

mbarislug wrote: EDIT-----Found the culprit for check_containers_running -- line 681 in check_docker.py needs to be if container["State"] == "running": instead of if container["State"] == "Running":
I am glad you were able to resolve the issue. Our developers are aware of the issue, and will be fixing it soon. Thank you!
NMFSTeam
Posts: 88
Joined: Thu Nov 12, 2015 9:01 am

Re: Docker - check_docker not working since upgrade

Post by NMFSTeam »

After applying the second fix mentioned in this post, the check is working as expected. Thank you very much everyone!
scottwilkerson
DevOps Engineer
Posts: 19396
Joined: Tue Nov 15, 2011 3:11 pm
Location: Nagios Enterprises

Re: Docker - check_docker not working since upgrade

Post by scottwilkerson »

mblower wrote: After applying the second fix mentioned in this post, the check is working as expected. Thank you very much everyone!
Great!

Locking