Page 1 of 1

Another "read() returned error 11" thread.

Posted: Thu Aug 06, 2020 9:29 am
by invade
Hi. I can see this issue being raised in various threads, but none appear to exactly match our situation and / or have an explanation / solution.

On our system (Nagios Core 4.4.5 on CentOS 8), the host_notification_commands and service_notification_commands settings run a Nagios command that calls the mailx OS command. Every time this command runs, Nagios logs a "read() returned error 11" message to the systemd journal.

Everything actually works as far as I can tell, I cannot reproduce the error from a bash shell, the debug logs shows nothing of note that I can tell and other OS command I've tested do not cause the problem.

Is this Nagios "reading" the output from the OS command being executed?

Is the "error 11" coming from the OS command, or is Nagios generating this as a result of the response from the OS command?

Is there anything else I can do to identify the cause of the error message?

Thanks in advance.

Extract from the journal:

Code: Select all

Aug 04 16:55:27 nagios.example.com nagios[14878]: job 2 (pid=216469): read() returned error 11
Aug 04 16:55:27 nagios.example.com postfix/pickup[214440]: E25E71811B45: uid=995 from=<[email protected]>
Aug 04 16:55:27 nagios.example.com postfix/cleanup[216474]: E25E71811B45: message-id=<5f2aa0b2.5Xd8B6Uju3bZTlWM%[email protected]>
Aug 04 16:55:27 nagios.example.com postfix/qmgr[11905]: E25E71811B45: from=<[email protected]>, size=753, nrcpt=1 (queue active)
Aug 04 16:55:28 nagios.example.com postfix/smtp[216476]: E25E71811B45: to=<[email protected]>, relay=aspmx.l.google.com[2a00:1450:400c:c09::1a]:25, delay=1, delays=0.08/0.04/0.3/0.58, dsn=2.0.0, status=sent (250 2.0.0 OK  1596629171 m84si1416098wmf.180 - gsmtp)
Aug 04 16:55:28 nagios.example.com postfix/qmgr[11905]: E25E71811B45: removed
Extract from the debug log:

Code: Select all

[Tue Aug  4 16:55:27 2020.224436] [256.1] [pid=14878] Read raw external command '[1596556527] SEND_CUSTOM_SVC_NOTIFICATION;test-host;Test;0;user;Testing 136878522.'
[Tue Aug  4 16:55:27 2020.224455] [001.0] [pid=14878] process_external_command1()
[Tue Aug  4 16:55:27 2020.224475] [064.1] [pid=14878] Making callbacks (type 2)...
[Tue Aug  4 16:55:27 2020.224479] [064.1] [pid=14878] Making callbacks (type 17)...
[Tue Aug  4 16:55:27 2020.224483] [001.0] [pid=14878] process_external_command2()
[Tue Aug  4 16:55:27 2020.224486] [128.1] [pid=14878] External Command Type: 160
[Tue Aug  4 16:55:27 2020.224489] [128.1] [pid=14878] Command Entry Time: 1596556527
[Tue Aug  4 16:55:27 2020.224492] [128.1] [pid=14878] Command Arguments: test-host;Test;0;user;Testing 136878522.
[Tue Aug  4 16:55:27 2020.224498] [001.0] [pid=14878] service_notification()
[Tue Aug  4 16:55:27 2020.224514] [032.0] [pid=14878] ** Service Notification Attempt ** Host: 'test-host', Service: 'Test', Type: CUSTOM, Options: 0, Current State: 2, Last Notification: Thu Jan  1 01:00:00 1970
[Tue Aug  4 16:55:27 2020.224519] [001.0] [pid=14878] check_service_notification_viability()
[Tue Aug  4 16:55:27 2020.224522] [001.0] [pid=14878] check_time_against_period()
[Tue Aug  4 16:55:27 2020.224529] [001.0] [pid=14878] _get_matching_timerange()
[Tue Aug  4 16:55:27 2020.224537] [032.0] [pid=14878] Notification viability test passed.
[Tue Aug  4 16:55:27 2020.224540] [032.1] [pid=14878] Current notification number: 4 (unchanged)
[Tue Aug  4 16:55:27 2020.224544] [064.1] [pid=14878] Making callbacks (type 5)...
[Tue Aug  4 16:55:27 2020.224547] [001.0] [pid=14878] create_notification_list_from_service()
[Tue Aug  4 16:55:27 2020.224550] [001.0] [pid=14878] should_service_notification_be_escalated()
[Tue Aug  4 16:55:27 2020.224553] [032.1] [pid=14878] Service notification will NOT be escalated.
[Tue Aug  4 16:55:27 2020.224557] [032.1] [pid=14878] Adding normal contacts for service to notification list.
[Tue Aug  4 16:55:27 2020.224560] [001.0] [pid=14878] check_contact_service_notification_viability()
[Tue Aug  4 16:55:27 2020.224564] [001.0] [pid=14878] check_time_against_period()
[Tue Aug  4 16:55:27 2020.224569] [001.0] [pid=14878] _get_matching_timerange()
[Tue Aug  4 16:55:27 2020.224575] [001.0] [pid=14878] add_notification() start
[Tue Aug  4 16:55:27 2020.224578] [001.0] [pid=14878] find_notification() start
[Tue Aug  4 16:55:27 2020.224586] [001.0] [pid=14878] notify_contact_of_service()
[Tue Aug  4 16:55:27 2020.224590] [064.1] [pid=14878] Making callbacks (type 20)...
[Tue Aug  4 16:55:27 2020.224600] [064.1] [pid=14878] Making callbacks (type 21)...
[Tue Aug  4 16:55:27 2020.224605] [001.0] [pid=14878] get_raw_command_line_r()
[Tue Aug  4 16:55:27 2020.224611] [001.0] [pid=14878] process_macros_r()
[Tue Aug  4 16:55:27 2020.224614] [2048.1] [pid=14878] **** BEGIN MACRO PROCESSING ***********
[Tue Aug  4 16:55:27 2020.224631] [2048.1] [pid=14878] Processing: '/usr/bin/printf "%b" "Date/Time: $LONGDATETIME$\nHost: $HOSTNAME$\nAddress: $HOSTADDRESS$\nService: $SERVICEDESC$\nState: $SERVICESTATE$\nInfo:\n$SERVICEOUTPUT$" | /usr/bin/mailx -s "Service $SERVICEDESC$ is $SERVICESTATE$ on $HOSTNAME$" -r [email protected] $CONTACTEMAIL$'
[Tue Aug  4 16:55:27 2020.224662] [2048.1] [pid=14878]   Done.  Final output: '/usr/bin/printf "%b" "Date/Time: Tue Aug 4 16:55:27 BST 2020\nHost: test-host\nAddress: gateway.example.com\nService: Test\nState: CRITICAL\nInfo:\nCRITICAL: test-service service is not running - 04-08-2020 @ 16:53:59 BST" | /usr/bin/mailx -s "Service Test is CRITICAL on test-host" -r [email protected] [email protected]'
[Tue Aug  4 16:55:27 2020.224666] [2048.1] [pid=14878] **** END MACRO PROCESSING *************
[Tue Aug  4 16:55:27 2020.224670] [001.0] [pid=14878] process_macros_r()
[Tue Aug  4 16:55:27 2020.224674] [2048.1] [pid=14878] **** BEGIN MACRO PROCESSING ***********
[Tue Aug  4 16:55:27 2020.224676] [2048.1] [pid=14878] Processing: 'SERVICE NOTIFICATION: support;test-host;Test;CUSTOM ($SERVICESTATE$);service_notification;$SERVICEOUTPUT$;$NOTIFICATIONAUTHOR$;$NOTIFICATIONCOMMENT$
'
[Tue Aug  4 16:55:27 2020.224685] [2048.1] [pid=14878]   Done.  Final output: 'SERVICE NOTIFICATION: support;test-host;Test;CUSTOM (CRITICAL);service_notification;CRITICAL: test-service service is not running - 04-08-2020 @ 16:53:59 BST;user;Testing.
'
[Tue Aug  4 16:55:27 2020.224688] [2048.1] [pid=14878] **** END MACRO PROCESSING *************
[Tue Aug  4 16:55:27 2020.224695] [064.1] [pid=14878] Making callbacks (type 2)...
[Tue Aug  4 16:55:27 2020.224700] [16384.0] [pid=14878] create_job()
[Tue Aug  4 16:55:27 2020.224703] [16384.0] [pid=14878] get_worker()
[Tue Aug  4 16:55:27 2020.224708] [16384.1] [pid=14878] wproc_run_job()
[Tue Aug  4 16:55:27 2020.224713] [001.0] [pid=14878] macros_to_kvv()
[Tue Aug  4 16:55:27 2020.224787] [064.1] [pid=14878] Making callbacks (type 21)...
[Tue Aug  4 16:55:27 2020.224796] [064.1] [pid=14878] Making callbacks (type 20)...
[Tue Aug  4 16:55:27 2020.224801] [032.0] [pid=14878] 1 contacts were notified.
[Tue Aug  4 16:55:27 2020.224804] [064.1] [pid=14878] Making callbacks (type 5)...
[Tue Aug  4 16:55:27 2020.224808] [064.1] [pid=14878] Making callbacks (type 13)...
[Tue Aug  4 16:55:27 2020.224812] [064.1] [pid=14878] Making callbacks (type 17)...
[Tue Aug  4 16:55:27 2020.224817] [12288.1] [pid=14878] ## Polling 840ms; sockets=6; events=370; iobs=0x55ca662904b0