Incorrect timestamp in duration for trap
Hi,
We are experiencing quite a strange issue on the trap receiver of certain nodes. The common factor is that the traps originating from these nodes are all from the same vendor. However, I could not see anything wrong when looking at the timestamps of the traps, but maybe I'm missing something.
When the trap hits the agent (Nagios) we end up seeing a very strange duration as below:
When we click on the service (i.e the trap service) we see that the duration is marked as "N/A".
But if we click on the service history we can see that all is fine for the date/time received.
What can we provide you in order to understand and correct this issue?
Rgds,
Matthew
Re: Incorrect timestamp in duration for trap
The duration shows how long the device has been in a particular state. I don't know that it is a valid field for an SNMP trap service check: an SNMP trap service check waits for any trap to be received from a specific device, and the device may never send an all-clear after a critical trap has been thrown. In the service history screenshot, you are seeing the date/time that a trap came in.
All that said, I don't think you should be seeing 18,000 days under the duration field. Can you click on the Service Status Detail for the SNMP trap service, and click on the + icon to show the advanced details. Let me know if both Active and Passive checks are enabled.
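For context on where a figure like 18,000 days could come from: a duration is roughly "now minus last state change", and a last-state-change timestamp stored as 0 (the Unix epoch, Jan 1 1970) works out to exactly that order of magnitude. A quick sketch of the arithmetic:

```shell
# Nagios computes duration as roughly (now - last_state_change).
# If last_state_change were stored as 0 (the Unix epoch, Jan 1 1970),
# the duration comes out to well over 18,000 days.
now=$(date +%s)
last_state_change=0
days=$(( (now - last_state_change) / 86400 ))
echo "duration: ${days} days"
```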
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Be sure to check out our Knowledgebase for helpful articles and solutions!
Re: Incorrect timestamp in duration for trap
Please PM me a copy of your profile, you can download it from Admin > System Profile > Download Profile button.
Additionally, please send the output of these commands (as root):
- NOTE: You may need to adjust the -h 127.0.0.1, the -uroot, and the -pnagiosxi in the first command if your DB is offloaded to another server and/or you've changed the root mysql password

Code: Select all
echo "SELECT table_name AS 'Table', round(((data_length + index_length) / 1024 / 1024), 2) 'Size in MB' FROM information_schema.TABLES WHERE table_schema IN ('nagios', 'nagiosql', 'nagiosxi');" | mysql -h 127.0.0.1 -uroot -pnagiosxi --table

Do you see any traps in /var/log/snmptt or in /var/log/messages for these? Can you send us an example of what your traps are sending in, so we can see why the timestamp isn't being honored?
What does it show in Admin > SNMP Trap Interface for received traps?
Please run this command and PM me the resulting /tmp/SNMPFILES.zip file:
Code: Select all
zip -r /tmp/SNMPFILES.zip /etc/snmp

Thank you
Re: Incorrect timestamp in duration for trap
Hi
I think that is the way it should be, since this is a passive check.
Profile sent as a PM. I have also sent the /etc/snmp/ content as a PM.
For the size of the tables, please find attached "tables.txt".
As for "Admin > SNMP Trap Interface", I do not have anything configured, nor is there anything in the "Received Traps" tab.

Can you click on the Service Status Detail for the SNMP trap service, and click on the + icon to show the advanced details. Let me know if both Active and Passive checks are enabled.

Active checks are disabled, while passive checks are, of course, enabled.
Traps in the log /var/log/snmptt are clearly visible with the correct timestamp:
Code: Select all
grep am1-hss-master01-p snmptt.log-20200802 | grep -v heartbeat | grep "Jul 30"
Thu Jul 30 12:37:24 2020 .1.3.6.1.4.1.17856.3.1.2.0.1 Critical "Status Events" am1-hss-master01-p - A titanAlarmNotificaton represents a potential 07 E4 07 1E 0A 25 17 09 2B 00 00 1 management 170 element bru-sha-hss-hss01 Remote element [bru-sha-hss-hss01] has one or more raised alarms Navigate to the remote element and see its logs for more details
Thu Jul 30 12:37:24 2020 .1.3.6.1.4.1.17856.3.1.2.0.1 Normal "Status Events" am1-hss-master01-p - A titanAlarmNotificaton represents a potential 07 E4 07 1E 0A 25 17 09 2B 00 00 1 management 170 element bru-sha-hss-hss01 Remote element [bru-sha-hss-hss01] has one or more raised alarms Navigate to the remote element and see its logs for more details
Thu Jul 30 12:37:24 2020 .1.3.6.1.4.1.17856.3.1.2.0.1 Critical "Status Events" am1-hss-master01-p - A titanAlarmNotificaton represents a potential 07 E4 07 1E 0A 25 17 09 2B 00 00 5 management 170 element bru-sha-hss-hss01 Remote element [bru-sha-hss-hss01] has one or more raised alarms Navigate to the remote element and see its logs for more details
Thu Jul 30 12:42:24 2020 .1.3.6.1.4.1.17856.3.1.2.0.1 Critical "Status Events" am1-hss-master01-p - A titanAlarmNotificaton represents a potential 07 E4 07 1E 0A 2A 17 09 2B 00 00 1 management 170 element bru-sha-hss-hss01 Remote element [bru-sha-hss-hss01] has one or more raised alarms Navigate to the remote element and see its logs for more details
Thu Jul 30 12:42:24 2020 .1.3.6.1.4.1.17856.3.1.2.0.1 Normal "Status Events" am1-hss-master01-p - A titanAlarmNotificaton represents a potential 07 E4 07 1E 0A 2A 17 09 2B 00 00 1 management 170 element bru-sha-hss-hss01 Remote element [bru-sha-hss-hss01] has one or more raised alarms Navigate to the remote element and see its logs for more details
Thu Jul 30 12:42:24 2020 .1.3.6.1.4.1.17856.3.1.2.0.1 Critical "Status Events" am1-hss-master01-p - A titanAlarmNotificaton represents a potential 07 E4 07 1E 0A 2A 17 09 2B 00 00 4 management 170 element bru-sha-hss-hss01 Remote element [bru-sha-hss-hss01] has one or more raised alarms Navigate to the remote element and see its logs for more details
Thu Jul 30 12:42:24 2020 .1.3.6.1.4.1.17856.3.1.2.0.1 Warning "Status Events" am1-hss-master01-p - A titanAlarmNotificaton represents a potential 07 E4 07 1E 0A 2A 17 09 2B 00 00 4 management 170 element bru-sha-hss-hss01 Remote element [bru-sha-hss-hss01] has one or more raised alarms Navigate to the remote element and see its logs for more details
Rgds,
Matthew
Re: Incorrect timestamp in duration for trap
Please resend the tables.txt, I do not see it.
Please go to Reports > State History:
- Adjust the Period to something like this month
- Select the host from the Limit To dropdown
- Select the service
- For Type, select Both
- For State Type, select Both
- Click Run
Please send me the report, you can either download it as a PDF or CSV.
I'm wondering if it's just been in WARNING the entire time. If you click on the service in Home > Service Detail and click the + (advanced) tab, please send a screenshot of that page so we can see what the values show.
Re: Incorrect timestamp in duration for trap
Hi,
Please find the tables, state history report, and service detail screenshot attached.
The warning severity is intentional: I configured snmptt.conf to set Warning depending on the SNMP variable matches. See below.
Code: Select all
EVENT titanAlarmNotification .1.3.6.1.4.1.17856.3.1.2.0.1 "Status Events" Warning
FORMAT A titanAlarmNotificaton represents a potential $*
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps Management" "$s" " " "RepairAction: $8 NOTE: Actual time of node is in UTC!" "SNMPTrap: WARNING: ProbableCause: $6"
MATCH $2: 2-4
MATCH $3: (management)
MATCH MODE=and
SDESC
A titanAlarmNotificaton represents a potential
or actual service affecting condition that is
detected by the application. This trap signifies
the occurrence of an action and/or condition that
is significant to administrative users of the
system. When the condition is resolved, an
identical trap with a severity of CLEAR is sent.
Variables:
1: titanAlarmTimestamp
2: titanAlarmSeverity
3: titanAlarmSubsystem
4: titanAlarmId
5: titanAlarmResource
6: titanAlarmProbableCause
7: titanAlarmAdditionalText
8: titanAlarmRepairAction
EDESC
#
EVENT titanAlarmNotification .1.3.6.1.4.1.17856.3.1.2.0.1 "Status Events" Normal
FORMAT A titanAlarmNotificaton represents a potential $*
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps Management" "$s" " " "RepairAction: $8 NOTE: Actual time of node is in UTC!" "SNMPTrap: CLEARED: ProbableCause: $6"
MATCH $2: 1
MATCH $3: (management)
MATCH MODE=and
SDESC
A titanAlarmNotificaton represents a potential
or actual service affecting condition that is
detected by the application. This trap signifies
the occurrence of an action and/or condition that
is significant to administrative users of the
system. When the condition is resolved, an
identical trap with a severity of CLEAR is sent.
Variables:
1: titanAlarmTimestamp
2: titanAlarmSeverity
3: titanAlarmSubsystem
4: titanAlarmId
5: titanAlarmResource
6: titanAlarmProbableCause
7: titanAlarmAdditionalText
8: titanAlarmRepairAction
EDESC
#
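A minimal sketch (hypothetical example values, not taken from the thread) of how the MATCH lines in the two stanzas above gate the EXEC: severity ($2) in the range 2-4 with subsystem ($3) "management" fires the Warning stanza, while severity 1 with the same subsystem fires the Normal (clear) stanza:

```shell
# Emulates the MATCH MODE=and logic of the two EVENT stanzas:
#   MATCH $2: 2-4  AND  MATCH $3: (management)  -> Warning
#   MATCH $2: 1    AND  MATCH $3: (management)  -> Normal (clear)
severity=4              # example titanAlarmSeverity ($2)
subsystem="management"  # example titanAlarmSubsystem ($3)
result="no stanza matched"
if [ "$subsystem" = "management" ]; then
  if [ "$severity" -ge 2 ] && [ "$severity" -le 4 ]; then
    result="Warning stanza fires"
  elif [ "$severity" -eq 1 ]; then
    result="Normal (clear) stanza fires"
  fi
fi
echo "$result"
```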
Rgds,
Matthew
benjaminsmith
- Posts: 5324
- Joined: Wed Aug 22, 2018 4:39 pm
- Location: saint paul
Re: Incorrect timestamp in duration for trap
Hi Matthew,
The database tables look ok and the state of the service did change so I want to confirm if this is a database/php issue or not. Let's log into the Nagios Core interface on this server and check the duration values to verify any discrepancy.
Log in to http://ipaddress/nagios, click Services on the left-hand side, click the trap service, and take a look at the Service State Information table. Does it match what you're seeing in the XI interface?
Also, can you send over a fresh system profile? I would like to view the current logs. Thanks, Benjamin
To send us your system profile:
Login to the Nagios XI GUI using a web browser.
Click the "Admin" > "System Profile" Menu
Click the "Download Profile" button
Save the profile.zip file and share in a private message or upload it to the post/ticket, and then reply to this post to bring it up in the queue.
Re: Incorrect timestamp in duration for trap
Hi Benjamin,
Indeed from core, it's the 1970 timestamp.
Attached is also a fresh profile (PM).
Rgds,
Re: Incorrect timestamp in duration for trap
Hi,
Thanks for verifying that information in Nagios Core; it looks to be a configuration issue. Let's edit /etc/snmp/snmptt.conf.
Change this line from:

Code: Select all
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps Management" "$s" " " "RepairAction: $8 NOTE: Actual time of node is in UTC!" "SNMPTrap: WARNING: ProbableCause: $6"

To:

Code: Select all
EXEC /usr/local/bin/snmptraphandling.py "$r" "SNMP Traps Management" "$s" "$@" "$-*" "RepairAction: $8 NOTE: Actual time of node is in UTC! SNMPTrap: WARNING: ProbableCause: $6"

Save the change and restart snmptt:

Code: Select all
systemctl restart snmptt

Let me know if the issue is resolved. You may have to let it run for a few minutes.
$@ - Number of seconds since the epoch of when the trap was spooled (daemon mode) or the current time (standalone mode)
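The snmptraphandling.py script itself isn't shown in this thread, but Nagios passive results are generally submitted as PROCESS_SERVICE_CHECK_RESULT external commands, where the leading [timestamp] is seconds since the epoch. A minimal sketch (hypothetical host name; not the actual script) of why the literal " " fourth argument produced the 1970 duration, while "$@" fixes it:

```shell
# With "$@" the handler receives a real epoch timestamp; with the original
# literal " " the timestamp field is effectively empty and ends up treated
# as 0, i.e. Jan 1 1970, which matches what Nagios Core showed.
now=$(date +%s)
host="example-host"               # hypothetical host name
service="SNMP Traps Management"

good="[$now] PROCESS_SERVICE_CHECK_RESULT;$host;$service;1;SNMPTrap: WARNING: ProbableCause: ..."
bad="[ ] PROCESS_SERVICE_CHECK_RESULT;$host;$service;1;SNMPTrap: WARNING: ProbableCause: ..."

echo "$good"
echo "$bad"
```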
Re: Incorrect timestamp in duration for trap
Hi Benjamin,
That was indeed the issue!
After the change, I noticed the correct timestamp entry.
This ticket can be closed.
Thanks!
Matthew