Flap and Retain status issues with the service

dlukinski
Posts: 1130
Joined: Tue Oct 06, 2015 9:42 am

Re: Flap and Retain status issues with the service

Post by dlukinski »

mcapra wrote: I would need to see the full call trace from the RC server to determine whether it's a "false negative" or not. I know check_selenium by itself will throw a "CRITICAL" if it finds the word "ERROR" anywhere in the test case's output, though I believe the RC server will, in many cases, retry the test if it can't establish a session ID on the first attempt. So the first session on these tests might be failing, but a second or third may be succeeding. The logic in check_selenium I'm referring to:

Code: Select all

my $output = `perl $script 2>&1`;
if ( $output =~ m/(ERROR.+\n)/ ) {
    $message = $1;
    $rc = 2; #end it with an error
} elsif ( $output =~ m/OK/ )  {
    my @lines = split(/\n/, $output);
    foreach my $line (@lines) {
        if ( $line =~ m/OK:/ ) {
            ($message, $rc, $performance_msg, $items_ran) = split(/\|/, $line);
            $message = "$message | $performance_msg\n";
        }
    }
} else {
    $message = "UNKNOWN: $output";
    $rc = 3;
}
You might try altering this script to set $message equal to $output like so to get better debug information:

Code: Select all

$message = $output;
Though this could cause issues with the status output overflowing as @avandemore pointed out.
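
A rough sketch of one way to keep the extra debug information while limiting the overflow risk mentioned above: collapse the output to a single line and cap its length before assigning it to $message. The helper name trim_output and the 512-character default are assumptions for illustration; neither comes from check_selenium, and the right limit depends on your setup.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of check_selenium: flatten multi-line
# output and cap its length so the status text stays short.
sub trim_output {
    my ($output, $max_len) = @_;
    $max_len = 512 unless defined $max_len;   # assumed limit, tune as needed

    $output =~ s/\|/ /g;          # '|' separates status text from perfdata in plugin output
    $output =~ s/\s+/ /g;         # collapse newlines and runs of whitespace
    $output =~ s/^\s+|\s+$//g;    # strip leading and trailing whitespace

    if ( length($output) > $max_len ) {
        # keep room for the trailing ellipsis
        $output = substr( $output, 0, $max_len - 3 ) . '...';
    }
    return $output;
}

# Made-up sample text, for illustration only.
print trim_output( "ERROR one\nERROR two\n", 12 ), "\n";
```

In the script this would mean something like $message = trim_output($output); in place of the plain assignment. Note that it deliberately strips the pipe character, so it is only suitable for debug text, not for a line that carries performance data.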

The suggested output change makes ALL Selenium scripts fail with an "out of bounds" 255 error.

Upgrading to version 5.4.2 made no difference to the RC "ERROR Server Exception" shown in the attachment (the script does finish anyway and is definitely OK on the retry). Is there a way not to produce a FAIL for "ERROR Server Exception"? Even UNKNOWN would do just fine.

Thank you
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Flap and Retain status issues with the service

Post by mcapra »

You could try adding a separate regex match to check_selenium for Jetty (the web server component used by the RC server) server exceptions:

Code: Select all

my $output = `perl $script 2>&1`;
if ( $output =~ m/(ERROR Server Exception.+\n)/ ) {
    $message = $1;
    $rc = 3; #end it with an unknown
} elsif ( $output =~ m/(ERROR.+\n)/ ) {
    $message = $1;
    $rc = 2; #end it with an error
} elsif ( $output =~ m/OK/ )  {
    my @lines = split(/\n/, $output);
    foreach my $line (@lines) {
        if ( $line =~ m/OK:/ ) {
            ($message, $rc, $performance_msg, $items_ran) = split(/\|/, $line);
            $message = "$message | $performance_msg\n";
        }
    }
} else {
    $message = "UNKNOWN: $output";
    $rc = 3;
}
Though the call stack you sent a screenshot of is incomplete. I'd need to see the whole thing to be able to say why Jetty is crashing.
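
For anyone who wants to check the branch order above without a live RC server, here is a minimal sketch that wraps the same regex tests in a function and feeds it canned strings. The function name classify_output and the sample outputs are made up for illustration, and the OK branch is simplified (the real script parses the "OK:" line to get the message and return code).

```perl
use strict;
use warnings;

# Hypothetical wrapper around the branch logic quoted above.
# Returns (return code, message).
sub classify_output {
    my ($output) = @_;
    if ( $output =~ m/(ERROR Server Exception.+\n)/ ) {
        return ( 3, $1 );    # UNKNOWN: RC/Jetty server exception
    } elsif ( $output =~ m/(ERROR.+\n)/ ) {
        return ( 2, $1 );    # CRITICAL: any other ERROR line
    } elsif ( $output =~ m/OK/ ) {
        return ( 0, $output );    # simplified stand-in for the OK: parsing
    }
    return ( 3, "UNKNOWN: $output" );
}

# Made-up sample outputs, not taken from a real run.
my @samples = (
    "ERROR Server Exception: example\nOK: test passed\n",
    "ERROR: element not found\n",
    "OK: test passed\n",
    "timeout",
);

for my $sample (@samples) {
    my ( $rc, $message ) = classify_output($sample);
    chomp $message;
    print "rc=$rc message=$message\n";
}
```

The point of the ordering is that the more specific "ERROR Server Exception" test has to come before the general "ERROR" test; otherwise the general branch matches first and the result stays CRITICAL.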
Former Nagios employee
https://www.mcapra.com/
dlukinski
Posts: 1130
Joined: Tue Oct 06, 2015 9:42 am

Re: Flap and Retain status issues with the service

Post by dlukinski »

This worked.

Thank you, please close this thread.
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Flap and Retain status issues with the service

Post by cdienger »

Glad we were able to help resolve the problem. We'll close the thread.
Locked