NagiosXI help with custom monitoring services for Win10 VMs

Posted: Tue Apr 05, 2022 1:50 pm
by e5639064
  • Hello everyone, this is my first post ever, and I have only about two weeks of experience with Nagios. Nagios was recommended to me as a solution for a problem that is probably not so unusual for you guys. :)

    To get to the point: the idea is to monitor about ten Windows 10 virtual machines, checking as often as possible who is currently logged in to them and showing their exact usernames, the same way the "w" command does in a Linux terminal or the "query user" command does in cmd. Ideally the output would look something like: (status | number of users | names of users)
    where status maps to the Nagios exit codes ("OK" if there is 1 user, "WARNING" if there are 2 users and "CRITICAL" if there are 3 or more users).
  • My setup and progress so far: I have 3 virtual machines in Hyper-V. One is the Nagios XI server (CentOS 7) and two are standard Windows 10 virtual machines for testing (one for active checks and the other for passive checks). With a little research I found out that a check with this kind of output does not exist by default when you set up NCPA on the clients, so I had to find out how to add custom checks (services) manually. I wrote a bash script, probably not a very good one, in /usr/local/nagios/libexec/ to get what I need:

    Code:

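    #!/bin/bash
    
    # Take one snapshot of `w` without its header lines (-h), so the
    # count and the names come from the same data.
    mSessions=$(w -h)
    
    # One line per login session; count the non-empty lines.
    mUsers=$(echo "$mSessions" | grep -c .)
    
    # The first field of each session line is the user name (up to five).
    mUser1=$(echo "$mSessions" | awk 'NR==1 {print $1}')
    mUser2=$(echo "$mSessions" | awk 'NR==2 {print $1}')
    mUser3=$(echo "$mSessions" | awk 'NR==3 {print $1}')
    mUser4=$(echo "$mSessions" | awk 'NR==4 {print $1}')
    mUser5=$(echo "$mSessions" | awk 'NR==5 {print $1}')
    
    # Print "<status> | <count> | <names>".
    nagios_func(){
      echo "$1 | Number of users= $mUsers | Users: $mUser1 , $mUser2 , $mUser3 , $mUser4 , $mUser5";
    }
    
    # Nagios exit codes: 0 = OK, 1 = WARNING, 2 = CRITICAL.
    if [[ $mUsers -gt 2 ]]
    then
      nagios_func "CRITICAL"; exit 2;
    elif [[ $mUsers -gt 1 ]]
    then
      nagios_func "WARNING"; exit 1;
    else
      nagios_func "OK"; exit 0;
    fi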
    And the output:
    OK | Number of users= 1 | Users: root , , , ,
  • Next, on the Manage Plugins tab I can see my newly made plugin:
    File | Owner | Group | Permissions | Date
    check_myUsers | nagios | nagios | rwxrwxr-x | 2022-03-31 15:47:17
  • Now comes the part where I think I need your help to change something, if I'm on the right path. I suspect the problem is the command on the >_Commands tab in Core Config Manager: there I made a command named "check_myUsers" whose command line is "$USER1$/check_myUsers".
  • In the next step I made a service whose check command is "check_myUsers", but when I run the check command I get the same output as when I test the script on the server, instead of the active user on my Windows machine, which right now is "admin":
    [nagios@localhost.localdomain ~]$ /usr/local/nagios/libexec/check_myUsers
    OK | Number of users= 1 | Users: root , , , ,

    I guess the problem is in the command part, which is why I don't get results from my clients. I would be so glad to get some tips/hints from you guys on how to fix this issue. :) :) :)
    Thank you all, and please forgive me if this is not the right form for writing a ticket.
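
For reference, the status rule described at the top of this post (OK for one user, WARNING for two, CRITICAL for three or more) can be sketched as a small shell function. This is only an illustration under assumptions: `report_users` is a hypothetical name, the user names are assumed to arrive one per line on stdin from whatever collects them on the client, and duplicate names are merged. It keeps the `|` character for performance data only, because Nagios treats everything after the first `|` in plugin output as perfdata.

```bash
#!/bin/bash
# report_users (hypothetical name): reads one user name per line on stdin,
# prints a Nagios-style status line and returns the matching exit code
# (0 = OK, 1 = WARNING, 2 = CRITICAL).
report_users() {
    # Drop blank lines and duplicates (one user can hold several sessions).
    names=$(grep -v '^[[:space:]]*$' | sort -u)

    # Count the remaining names and join them with commas.
    count=$(printf '%s' "$names" | grep -c .)
    list=$(printf '%s\n' "$names" | paste -sd ',' -)

    if [ "$count" -gt 2 ]; then
        state=CRITICAL; code=2
    elif [ "$count" -gt 1 ]; then
        state=WARNING; code=1
    else
        state=OK; code=0
    fi

    # Text before "|" is the readable part; after it comes perfdata
    # in the form label=value;warn;crit.
    echo "$state - $count user(s) logged in: $list | users=$count;1;2"
    return "$code"
}
```

On a Linux box it could be tried with something like `who | awk '{print $1}' | report_users; echo $?`. The function does not solve the main problem by itself: it still has to run on (or be fed from) the Windows client rather than the Nagios server.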

Re: NagiosXI help with custom monitoring services for Win10

Posted: Wed Apr 06, 2022 6:37 pm
by gormank

Re: NagiosXI help with custom monitoring services for Win10

Posted: Thu Apr 07, 2022 7:57 am
by e5639064
Update #1

Hello guys, I've made some progress on this topic, so I think I should share it with you.
If we forget my custom script and focus on using check_ncpa.py, I found out that it has options to check the user count and the user list:

Code:

{
    "user": {
        "count": [
            2,
            "users"
        ],
        "list": [
            [
                "user1",
                "user2"
            ],
            "users"
        ]
    }
}
So I decided to try that, and I have 2 questions:
1) Is it possible to have only one check (service) that returns the values from both count and list?
2) Can someone help me edit my command for UsersCount so that the "warning and critical" part works?

So far I have 2 commands for 2 services that are working right now:
check_myUsersList
Command View:
  • $USER1$/check_ncpa.py -H $HOSTADDRESS$ -t 'windows1' -P 5693 $ARG1$
$ARG1$:
  • "-M user/list"
This command returns:
OK: List was ['Admin', 'user1']
check_myUsersCount
Command View:
  • "$USER1$/check_ncpa.py -H $HOSTADDRESS$ -t 'windows1' -P 5693 $ARG1$"
$ARG1$:
  • "-M user/count"
But when I run the check command for this one I get:
OK: Count was 2 users | 'count'=2;;;

The problem is that it has return code 0 and status "OK" while 2 users are logged in at the moment (it should be "WARNING").
I have tried adding $ARG2$ like this:
"$USER1$/check_ncpa.py -H $HOSTADDRESS$ -t 'windows1' -P 5693 $ARG1$ $ARG2$"
and setting different values for ARG2
  • -w 2 -c 3
  • -warning=2 -critical=3
but I keep getting this output:
OK: Count was 2 users | 'count'=2;2;3;
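
A possible explanation for that output: in the Nagios plugin guidelines a bare threshold N means "alert when the value is below 0 or above N", so with `-w 2 -c 3` a count of 2 is still inside the warning range. Assuming NCPA applies this standard rule to user/count, `-w 1 -c 2` would give WARNING at 2 users and CRITICAL at 3 or more. The sketch below reproduces the range rule for whole numbers; `in_alert` and `threshold_status` are hypothetical helper names, not part of check_ncpa.py.

```bash
#!/bin/bash
# in_alert VALUE RANGE: succeeds (returns 0) when VALUE should raise an alert.
# Supported forms: "N" (alert if < 0 or > N), "M:N", "M:" (no upper bound),
# "~:N" (no lower bound); a leading "@" inverts the test.
in_alert() {
    value=$1
    range=$2
    invert=0
    case $range in
        @*) invert=1; range=${range#@} ;;
    esac
    case $range in
        *:*) low=${range%%:*}; high=${range#*:} ;;
        *)   low=0; high=$range ;;
    esac
    [ -z "$low" ] && low=0

    outside=0
    if [ "$low" != "~" ] && [ "$value" -lt "$low" ]; then outside=1; fi
    if [ -n "$high" ] && [ "$value" -gt "$high" ]; then outside=1; fi

    # Normal ranges alert when outside; "@" ranges alert when inside.
    [ "$outside" -ne "$invert" ]
}

# threshold_status VALUE WARN CRIT: prints OK, WARNING or CRITICAL.
threshold_status() {
    if in_alert "$1" "$3"; then
        echo CRITICAL
    elif in_alert "$1" "$2"; then
        echo WARNING
    else
        echo OK
    fi
}
```

For example, `threshold_status 2 2 3` prints OK, which matches the result above, while `threshold_status 2 1 2` prints WARNING.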

If anyone has any ideas about this, I would be so grateful. Thank you all and cheers!
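
On question 1, one option might be a small wrapper plugin on the Nagios server that runs both checks and merges their output into a single service result. The sketch below covers only the merging step, assuming the two output formats quoted above; `merge_outputs` is a hypothetical name, and the exit code of the count check is passed through so that its thresholds decide the service state.

```bash
#!/bin/bash
# merge_outputs COUNT_OUTPUT LIST_OUTPUT COUNT_EXIT_CODE
# Prints one status line that carries both results and returns the exit
# code of the count check.
merge_outputs() {
    count_out=$1
    list_out=$2
    count_rc=$3

    # Split the count output into readable text and perfdata at " | ".
    text=${count_out%%" | "*}
    case $count_out in
        *" | "*) perf=${count_out#*" | "} ;;
        *)       perf= ;;
    esac

    # Drop the leading "OK: " style prefix from the list output.
    list_text=${list_out#*": "}

    if [ -n "$perf" ]; then
        echo "$text; $list_text | $perf"
    else
        echo "$text; $list_text"
    fi
    return "$count_rc"
}

# Possible wiring inside a wrapper script (host and token are placeholders):
#   count_out=$(/usr/local/nagios/libexec/check_ncpa.py -H "$host" -t "$token" -P 5693 -M user/count -w 1 -c 2); rc=$?
#   list_out=$(/usr/local/nagios/libexec/check_ncpa.py -H "$host" -t "$token" -P 5693 -M user/list)
#   merge_outputs "$count_out" "$list_out" "$rc"; exit $?
```

With the two outputs shown above as input, the merged line would read `OK: Count was 2 users; List was ['Admin', 'user1'] | 'count'=2;2;3;`.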