Getting "Error: Could not read host and service status".
Hi, I am frequently getting the "Error: Could not read host and service status information!" message on refreshes. If I hit the refresh button a few times, it goes away, but this is happening often enough that people are beginning to talk of finding a more stable monitoring application.
Any ideas on how to resolve this?
Gavin
- Box293
- Too Basu
- Posts: 5126
- Joined: Sun Feb 07, 2010 10:55 pm
- Location: Deniliquin, Australia
Re: Getting "Error: Could not read host and service status".
Can you please provide a screenshot of this problem.
How many hosts / services are you monitoring?
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Re: Getting "Error: Could not read host and service status".
I can give a screenshot when it happens, but it's the standard error message that's provided by Nagios.
I am currently monitoring a farm of roughly 750 servers, with more being added in phases. We have an additional 50 coming shortly.
Here is an error from my logfile:
[Mon Oct 19 08:48:02.605278 2015] [cgi:warn] [pid 52159] [client 10.2.166.227:58106] AH01220: Timeout waiting for output from CGI script /usr/lib/cgi-bin/nagios3/cmd.cgi, referer: http://mobile-mon-02/cgi-bin/nagios3/cm ... orce_check
[Mon Oct 19 08:48:02.605365 2015] [cgi:error] [pid 52159] [client 10.2.166.227:58106] Script timed out before returning headers: cmd.cgi, referer: http://mobile-mon-02/cgi-bin/nagios3/cm ... orce_check
[Mon Oct 19 08:53:02.705436 2015] [cgi:warn] [pid 52159] [client 10.2.166.227:58106] AH01220: Timeout waiting for output from CGI script /usr/lib/cgi-bin/nagios3/cmd.cgi, referer: http://mobile-mon-02/cgi-bin/nagios3/cm ... orce_check
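For anyone wanting to see how often this happens and how far apart the timeouts are, the log entries quoted above can be tallied with a small script. This is only a rough sketch: the regular expression is written against the three quoted lines (Apache 2.4 error-log format) and is an assumption about what the rest of the log looks like, and the function name summarize_cgi_timeouts is made up for illustration. The referer URLs are left truncated exactly as they were posted.

```python
import re
from collections import Counter
from datetime import datetime

# Pattern written against the three log lines quoted in this thread.
# The leading "[" is optional because the first quoted line lost it in the paste.
LINE_RE = re.compile(
    r"^\[?(?P<ts>\w{3} \w{3} +\d+ \d{2}:\d{2}:\d{2})\.\d+ (?P<year>\d{4})\] "
    r"\[cgi:(?P<level>\w+)\].*?"
    r"(?:CGI script \S*/|before returning headers: )(?P<script>[\w.-]+\.cgi)"
)

# The quoted entries, with the referer URLs truncated as in the original post.
SAMPLE_LINES = [
    "[Mon Oct 19 08:48:02.605278 2015] [cgi:warn] [pid 52159] [client 10.2.166.227:58106] "
    "AH01220: Timeout waiting for output from CGI script /usr/lib/cgi-bin/nagios3/cmd.cgi, "
    "referer: http://mobile-mon-02/cgi-bin/nagios3/cm ... orce_check",
    "[Mon Oct 19 08:48:02.605365 2015] [cgi:error] [pid 52159] [client 10.2.166.227:58106] "
    "Script timed out before returning headers: cmd.cgi, "
    "referer: http://mobile-mon-02/cgi-bin/nagios3/cm ... orce_check",
    "[Mon Oct 19 08:53:02.705436 2015] [cgi:warn] [pid 52159] [client 10.2.166.227:58106] "
    "AH01220: Timeout waiting for output from CGI script /usr/lib/cgi-bin/nagios3/cmd.cgi, "
    "referer: http://mobile-mon-02/cgi-bin/nagios3/cm ... orce_check",
]


def summarize_cgi_timeouts(lines):
    """Count CGI timeout entries per script and list the gaps (in whole
    seconds) between distinct timestamps. Hypothetical helper, not part of Nagios."""
    counts = Counter()
    stamps = set()
    for line in lines:
        match = LINE_RE.match(line.strip())
        if not match:
            continue  # not a CGI timeout line in the assumed format
        counts[match.group("script")] += 1
        stamps.add(
            datetime.strptime(
                f"{match.group('ts')} {match.group('year')}",
                "%a %b %d %H:%M:%S %Y",
            )
        )
    ordered = sorted(stamps)
    gaps = [int((b - a).total_seconds()) for a, b in zip(ordered, ordered[1:])]
    return counts, gaps


if __name__ == "__main__":
    counts, gaps = summarize_cgi_timeouts(SAMPLE_LINES)
    print(dict(counts))  # entries per CGI script
    print(gaps)          # seconds between distinct timestamps
```

On the three quoted lines this reports three entries for cmd.cgi and a single gap of 300 seconds between the two distinct timestamps. That spacing may reflect the effective Apache timeout on this box, though three lines are not enough to be sure.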
- Box293
Re: Getting "Error: Could not read host and service status".
I have some ideas as to what is happening, but first let me get some more information.
What is the output of:
Code:
top -n 1
df -h
Re: Getting "Error: Could not read host and service status".
Code:
Tasks: 487 total, 1 running, 486 sleeping, 0 stopped, 0 zombie
%Cpu(s): 0.0 us, 0.2 sy, 0.2 ni, 99.5 id, 0.0 wa, 0.0 hi, 0.0 si, 0.0 st
KiB Mem: 98967472 total, 22276932 used, 76690544 free, 320460 buffers
KiB Swap: 10062643+total, 0 used, 10062643+free. 5436064 cached Mem
PID USER PR NI VIRT RES SHR S %CPU %MEM TIME+ COMMAND
56539 gavinh 20 0 25240 1920 1084 R 11.8 0.0 0:00.03 top
1 root 20 0 33636 2912 1472 S 0.0 0.0 2:04.45 init
2 root 20 0 0 0 0 S 0.0 0.0 0:00.04 kthreadd
3 root 20 0 0 0 0 S 0.0 0.0 3:33.60 ksoftirqd/0
4 root 20 0 0 0 0 S 0.0 0.0 16:28.13 kworker/0:0
5 root 0 -20 0 0 0 S 0.0 0.0 0:00.00 kworker/0:0H
6 root 20 0 0 0 0 S 0.0 0.0 0:00.00 kworker/u128:0
7 root 20 0 0 0 0 S 0.0 0.0 11:21.77 kworker/u129:0
8 root 20 0 0 0 0 S 0.0 0.0 116:23.18 rcu_sched
9 root 20 0 0 0 0 S 0.0 0.0 12:57.67 rcuos/0
10 root 20 0 0 0 0 S 0.0 0.0 16:42.86 rcuos/1
11 root 20 0 0 0 0 S 0.0 0.0 16:20.89 rcuos/2
12 root 20 0 0 0 0 S 0.0 0.0 13:16.49 rcuos/3
13 root 20 0 0 0 0 S 0.0 0.0 12:42.85 rcuos/4
14 root 20 0 0 0 0 S 0.0 0.0 17:42.23 rcuos/5
15 root 20 0 0 0 0 S 0.0 0.0 18:51.60 rcuos/6
16 root 20 0 0 0 0 S 0.0 0.0 25:04.29 rcuos/7
17 root 20 0 0 0 0 S 0.0 0.0 16:33.80 rcuos/8
18 root 20 0 0 0 0 S 0.0 0.0 13:29.25 rcuos/9
19 root 20 0 0 0 0 S 0.0 0.0 10:17.25 rcuos/10
20 root 20 0 0 0 0 S 0.0 0.0 8:19.09 rcuos/11
21 root 20 0 0 0 0 S 0.0 0.0 14:35.73 rcuos/12
22 root 20 0 0 0 0 S 0.0 0.0 12:20.56 rcuos/13
23 root 20 0 0 0 0 S 0.0 0.0 12:12.51 rcuos/14
24 root 20 0 0 0 0 S 0.0 0.0 12:00.18 rcuos/15
25 root 20 0 0 0 0 S 0.0 0.0 8:21.59 rcuos/16
26 root 20 0 0 0 0 S 0.0 0.0 7:52.82 rcuos/17
27 root 20 0 0 0 0 S 0.0 0.0 7:38.31 rcuos/18
28 root 20 0 0 0 0 S 0.0 0.0 7:08.39 rcuos/19
29 root 20 0 0 0 0 S 0.0 0.0 5:31.72 rcuos/20
30 root 20 0 0 0 0 S 0.0 0.0 15:38.09 rcuos/21
31 root 20 0 0 0 0 S 0.0 0.0 10:39.78 rcuos/22
32 root 20 0 0 0 0 S 0.0 0.0 10:13.09 rcuos/23
33 root 20 0 0 0 0 S 0.0 0.0 9:01.40 rcuos/24
34 root 20 0 0 0 0 S 0.0 0.0 8:04.82 rcuos/25
35 root 20 0 0 0 0 S 0.0 0.0 7:20.06 rcuos/26
36 root 20 0 0 0 0 S 0.0 0.0 6:23.97 rcuos/27
37 root 20 0 0 0 0 S 0.0 0.0 6:16.82 rcuos/28
38 root 20 0 0 0 0 S 0.0 0.0 8:37.39 rcuos/29
39 root 20 0 0 0 0 S 0.0 0.0 5:28.77 rcuos/30
40 root 20 0 0 0 0 S 0.0 0.0 15:06.59 rcuos/31
41 root 20 0 0 0 0 S 0.0 0.0 9:21.85 rcuos/32
42 root 20 0 0 0 0 S 0.0 0.0 8:41.83 rcuos/33
43 root 20 0 0 0 0 S 0.0 0.0 7:49.56 rcuos/34
44 root 20 0 0 0 0 S 0.0 0.0 7:00.99 rcuos/35
45 root 20 0 0 0 0 S 0.0 0.0 6:13.11 rcuos/36
46 root 20 0 0 0 0 S 0.0 0.0 5:27.05 rcuos/37
47 root 20 0 0 0 0 S 0.0 0.0 4:51.97 rcuos/38
48 root 20 0 0 0 0 S 0.0 0.0 7:07.31 rcuos/39
49 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuos/40
50 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuos/41
51 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuos/42
52 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuos/43
53 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuos/44
54 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuos/45
55 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuos/46
56 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuos/47
57 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuos/48
58 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuos/49
59 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuos/50
60 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuos/51
61 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuos/52
62 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuos/53
63 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuos/54
64 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuos/55
65 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuos/56
66 root 20 0 0 0 0 S 0.0 0.0 0:00.00 rcuos/57
Code:
/dev/mapper/mobile--mon--02--vg-root 5.4T 15G 5.1T 1% /
none 4.0K 0 4.0K 0% /sys/fs/cgroup
udev 48G 4.0K 48G 1% /dev
tmpfs 9.5G 1004K 9.5G 1% /run
none 5.0M 0 5.0M 0% /run/lock
none 48G 0 48G 0% /run/shm
none 100M 0 100M 0% /run/user
/dev/sda2 237M 39M 187M 18% /boot
Re: Getting "Error: Could not read host and service status".
What Core version are you running? It looks like the interface has been modified.
Former Nagios employee
Re: Getting "Error: Could not read host and service status".
I am running 3.5.1. Yes, I changed the colors somewhat; it was a straightforward change of the color hex codes.
- Skynet Drone
- Posts: 2620
- Joined: Wed Feb 11, 2015 1:56 pm
Re: Getting "Error: Could not read host and service status".
gavinh wrote: Script timed out before returning headers: cmd.cgi, referer: http://mobile-mon-02/cgi-bin/nagios3/cm ... orce_check
This is the kind of error we usually see on long-running CGIs, not cmd.cgi. Which page(s) specifically are you trying to load when you get the failure? Is the failure happening after 30 seconds (which I believe is the default httpd script execution timeout) or right away?
It may just be a matter of increasing the script execution timeout in httpd. Alternatively, you might be better served by moving status.dat onto an SSD or tmpfs.
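Before moving status.dat, it may be worth measuring how long the file actually takes to read and how stale it is. The sketch below is one way to do that, not an official Nagios tool. The helper name time_status_read is hypothetical, and the path /var/cache/nagios3/status.dat is an assumption based on a typical Debian/Ubuntu nagios3 install; check the status_file setting in your nagios.cfg for the real location.

```python
import os
import time

# ASSUMPTION: typical Debian/Ubuntu nagios3 location. Confirm against the
# status_file directive in your own nagios.cfg.
STATUS_PATH = "/var/cache/nagios3/status.dat"


def time_status_read(path, chunk_size=1 << 20):
    """Read the whole file once and report its size, how long the read took,
    and how long ago it was last modified. Hypothetical helper."""
    start = time.perf_counter()
    total = 0
    with open(path, "rb") as handle:
        while True:
            chunk = handle.read(chunk_size)
            if not chunk:
                break
            total += len(chunk)
    elapsed = time.perf_counter() - start
    age = time.time() - os.path.getmtime(path)
    return {"bytes": total, "read_seconds": elapsed, "age_seconds": age}


if __name__ == "__main__":
    if os.path.exists(STATUS_PATH):
        print(time_status_read(STATUS_PATH))
    else:
        print(f"{STATUS_PATH} not found; adjust STATUS_PATH for your install")
```

Running it a few times while the error is occurring would show whether reads are slow or the file is stale. A warm page cache can make repeated reads look faster than what the CGIs see, so treat the numbers as a rough indication only.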
Re: Getting "Error: Could not read host and service status".
I'll look into the httpd settings.