I just upgraded our test cluster from 2024R1.3.1 to 2024R1.3.2. Unfortunately, the job scheduling in the command subsystem has gone "crazy".
The 5 jobs keep resetting the "next run time" to a fixed date/time in the past, which causes the jobs to run again immediately.
Nagios 2024R1.3.2
2 server (VM) cluster running Red Hat 9.5
I've tried the obvious: stopping elasticsearch on both nodes and then rebooting both nodes. No change.
I've tried Reset All Jobs, but as soon as a job runs, it sets the next run time to approximately the actual start time. Or so it seems.
I can edit a job to run much later, which stops it from running for the moment, but as soon as it does run, it again sets the next run time to approximately the actual start time, which puts it in a run loop.
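To illustrate what I think I'm seeing, here is a minimal Python sketch. It is not the actual scheduler code: the function names, the once-per-second polling, the 5-minute interval and the start timestamp are all my own assumptions. It only shows why a next run time that is set to the start time (instead of start time plus the interval) makes a job due again on every check.

```python
from datetime import datetime, timedelta


def is_due(next_run, now):
    # A job is due as soon as its next run time is not in the future.
    return next_run <= now


def reschedule_observed(start_time, interval):
    # Hypothetical: what seems to happen after the upgrade.
    # The interval is ignored, so the next run time is already in the past
    # by the time the scheduler checks again.
    return start_time


def reschedule_expected(start_time, interval):
    # Hypothetical: what I would expect, start time plus the job's interval.
    return start_time + interval


def count_runs(reschedule, ticks, interval):
    # Simulate a scheduler that checks the job once per second.
    now = datetime(2025, 1, 1, 12, 0, 0)  # arbitrary start time
    next_run = now
    runs = 0
    for _ in range(ticks):
        if is_due(next_run, now):
            runs += 1
            next_run = reschedule(now, interval)
        now += timedelta(seconds=1)
    return runs


interval = timedelta(minutes=5)
observed_runs = count_runs(reschedule_observed, 10, interval)
expected_runs = count_runs(reschedule_expected, 10, interval)
print(observed_runs, expected_runs)
```

In this sketch the first variant runs on every one-second check, while the second runs once and then waits for the interval, which matches the difference between what I see now and what I saw before the upgrade.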
First question: how can I manually disable the job scheduling altogether for now? This is to stop wasting resources for the moment.
What I've noticed so far:
- the 3 jobs cleanup_cmdsubsys, run_all_alerts and run_index_usage will restart roughly every second.
- the 2 jobs backups and snapshots_maintenance will start a job every minute, regardless of whether there's already a process running.
- the run_update_check job I can't figure out yet. At first it ran a few times per minute, but now it seems stuck in a running state.
Here's a screenshot of my situation. The backups and snapshots_maintenance jobs are rescheduled for a time in the future, because they will eat up my system if they run every minute. The cleanup_cmdsubsys job I rescheduled to 04/10/2025 14:43:25; since then it has run every second. As you can see, the next run time is stuck in the past and is not updated correctly.
I hope you have a resolution. Obviously I won't update the production cluster for now.
Kind regards....Hans