Mysql replication Error

This support forum board is for support questions relating to Nagios XI, our flagship commercial network monitoring solution.
Johnsmit
Posts: 95
Joined: Thu Apr 19, 2018 2:03 pm

Mysql replication Error

Post by Johnsmit »

Hi,

Any MySQL help would be appreciated. I am trying to replicate data to another Nagios XI instance, but on the slave I am getting this error.

show slave status \G;
*************************** 1. row ***************************
Slave_IO_State: Waiting to reconnect after a failed master event read
Master_Host: server _name1
Master_User: maost
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000001
Read_Master_Log_Pos: 9800872
Relay_Log_File: mariadb-relay-bin.000001
Relay_Log_Pos: 4
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Connecting
Slave_SQL_Running: Yes
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 0
Last_Error:
Skip_Counter: 0
Exec_Master_Log_Pos: 9800872
Relay_Log_Space: 245
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 0
Last_SQL_Error:
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Mysql replication Error

Post by cdienger »

The bit of research I did seems to indicate a network issue:

https://forums.mysql.com/read.php?26,373524,373524
https://forums.mysql.com/read.php?26,617664,617664

Are there any firewalls between the machines? Are there any iptables rules on either machine that would prevent the connection (check with 'iptables -L')? Does a telnet test from the slave to the master work? Did you create and configure a user for replication per https://dev.mysql.com/doc/refman/8.0/en ... puser.html ?
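
The telnet test can also be scripted. Below is a minimal Python sketch that only checks whether a TCP connection to the master's MySQL port can be opened from the slave. The function name can_connect is made up for this illustration, and the default port 3306 is taken from the Master_Port value posted above. It says nothing about whether the replication user can actually log in.

```python
import socket


def can_connect(host: str, port: int = 3306, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened."""
    try:
        # create_connection resolves the name and performs the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name-resolution failures.
        return False
```

If this returns False when run on the slave with the master's hostname, that points towards the network path (firewall, routing, or the address the master listens on) rather than towards replication itself.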
As of May 25th, 2018, all communications with Nagios Enterprises and its employees are covered under our new Privacy Policy.
Johnsmit
Posts: 95
Joined: Thu Apr 19, 2018 2:03 pm

Re: Mysql replication Error

Post by Johnsmit »

Thanks for the response. Replication succeeded after I followed the links you provided.
I then made changes on the master server, and after a few minutes the slave ran into duplicate entry errors.

Slave_IO_State: Waiting for master to send event
Master_Host: Master_server
Master_User: maost
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mysql-bin.000003
Read_Master_Log_Pos: 1679588
Relay_Log_File: mariadb-relay-bin.000003
Relay_Log_Pos: 3152825
Relay_Master_Log_File: mysql-bin.000001
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB:
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1062
Last_Error: Error 'Duplicate entry '277' for key 'PRIMARY'' on query. Default database: 'nagiosql'. Query: 'INSERT INTO `tbl_logbook` SET `user`='nagiosadmin',`time`=NOW(), `ipadress`='local_desktop', `domain`='localhost', `entry`='Service modified: remote_server''
Skip_Counter: 0
Exec_Master_Log_Pos: 3528993
Relay_Log_Space: 27065793
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1062
Last_SQL_Error: Error 'Duplicate entry '277' for key 'PRIMARY'' on query. Default database: 'nagiosql'. Query: 'INSERT INTO `tbl_logbook` SET `user`='nagiosadmin',`time`=NOW(), `ipadress`='local_desktop', `domain`='localhost', `entry`='Service modified: remote_server''
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
1 row in set (0.00 sec)
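
When comparing several of these dumps, the two fields to read first are Slave_IO_Running and Slave_SQL_Running. Below is a minimal Python sketch, assuming the output was captured in the vertical \G layout shown above. The function names parse_slave_status and diagnose are made up for this illustration and are not part of MySQL, MariaDB, or Nagios XI.

```python
def parse_slave_status(text: str) -> dict:
    """Turn vertical-format SHOW SLAVE STATUS output into a dict."""
    status = {}
    for line in text.splitlines():
        # Split on the first colon only; error messages contain more colons.
        key, sep, value = line.partition(":")
        key = key.strip()
        # Skip row separators, the "1 row in set" footer, and blank lines.
        if not sep or not key or " " in key:
            continue
        status[key] = value.strip()
    return status


def diagnose(status: dict) -> str:
    """Report which replication thread, if any, is not running."""
    io_running = status.get("Slave_IO_Running", "")
    sql_running = status.get("Slave_SQL_Running", "")
    if io_running != "Yes":
        return "I/O thread not running (state: %s)" % io_running
    if sql_running != "Yes":
        return "SQL thread stopped, errno %s" % status.get("Last_SQL_Errno", "?")
    return "both threads running"


# Abbreviated sample copied from the status output above.
SAMPLE = """\
Slave_IO_Running: Yes
Slave_SQL_Running: No
Last_SQL_Errno: 1062
Last_SQL_Error: Error 'Duplicate entry '277' for key 'PRIMARY'' on query. Default database: 'nagiosql'.
"""

print(diagnose(parse_slave_status(SAMPLE)))
```

For the first status dump in this thread the same check would report the I/O thread (state "Connecting"), while here it reports the SQL thread, which is why the two problems need different fixes.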


Thanks,
rkennedy
Posts: 6579
Joined: Mon Oct 05, 2015 11:45 am

Re: Mysql replication Error

Post by rkennedy »

Would you mind elaborating on your setup? I'm interested to hear how you've pieced this together for redundancy.
Former Nagios Employee
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Mysql replication Error

Post by cdienger »

@johnsmit try running the repair script on the master first and then replicate. https://assets.nagios.com/downloads/nag ... tabase.pdf covers repairing the db.
Johnsmit
Posts: 95
Joined: Thu Apr 19, 2018 2:03 pm

Re: Mysql replication Error

Post by Johnsmit »

rkennedy wrote: Would you mind elaborating on your setup? I'm interested to hear how you've pieced this together for redundancy.
Hello,
I followed these URLs to set up master-slave replication on the Nagios XI servers.
https://forums.mysql.com/read.php?26,171776,205870
https://tunnelix.com/simple-master-mast ... n-mariadb/

on master:
cat /etc/my.cnf
server-id=1
[mysqld]
log-bin
binlog-do-db= mysql
binlog-do-db= information_schema
binlog-do-db= nagios
binlog-do-db= nagiosql
binlog-do-db= nagiosxi
binlog-do-db= performance_schema
binlog-do-db= test
auto_increment_increment = 5
auto_increment_offset = 1


on slave:
cat /etc/my.cnf
[mysqld]
server-id= 20
replicate-do-db=mysql
replicate-do-db=nagios
replicate-do-db=nagiosql
replicate-do-db=nagiosxi
replicate-do-db=test
replicate-do-db=information_schema
replicate-do-db=performance_schema
auto_increment_increment = 5
auto_increment_offset = 1
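
A note on the two auto_increment settings above: they are normally used when more than one server accepts writes, and in that case each server is given a different offset. Both files here use offset 1. The Python sketch below only illustrates the arithmetic for an empty table; the helper name auto_increment_ids and the alternative offset of 2 are assumptions made for the example, not values from this setup.

```python
def auto_increment_ids(offset, increment, count):
    """First `count` ids generated for an empty table.

    Assumes offset <= increment, as in the configs above.
    """
    return [offset + i * increment for i in range(count)]


# Values from the two my.cnf files above: both servers use offset 1.
master_ids = auto_increment_ids(offset=1, increment=5, count=4)
slave_ids = auto_increment_ids(offset=1, increment=5, count=4)
print(master_ids, slave_ids)

# Hypothetical alternative: the second server gets its own offset.
slave_ids_alt = auto_increment_ids(offset=2, increment=5, count=4)
print(sorted(set(master_ids) & set(slave_ids_alt)))
```

With identical offsets, any row written locally on the slave draws from the same id series as the master. Distinct offsets avoid that overlap, but they do not help if the slave is meant to be a read-only copy and something is still writing to it.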


show slave status\G;
*************************** 1. row ***************************
Slave_IO_State: Waiting for master to send event
Master_Host: master_server
Master_User: maost
Master_Port: 3306
Connect_Retry: 60
Master_Log_File: mariadb-bin.000003
Read_Master_Log_Pos: 287945400
Relay_Log_File: mariadb-relay-bin.000002
Relay_Log_Pos: 554345
Relay_Master_Log_File: mariadb-bin.000003
Slave_IO_Running: Yes
Slave_SQL_Running: No
Replicate_Do_DB: mysql,nagios,nagiosql,nagiosxi,test,information_schema,performance_schema
Replicate_Ignore_DB:
Replicate_Do_Table:
Replicate_Ignore_Table:
Replicate_Wild_Do_Table:
Replicate_Wild_Ignore_Table:
Last_Errno: 1062
Last_Error: Error 'Duplicate entry '1576' for key 'PRIMARY'' on query. Default database: 'nagiosxi'. Query: 'INSERT INTO xi_auditlog (log_time,source,user,type,message,ip_address,details) VALUES ('2018-05-15 16:34:19','Nagios XI','NULL',32,'cmdsubsys: User [nagiosadmin] applied a new configuration to Nagios Core','localhost','')'
Skip_Counter: 0
Exec_Master_Log_Pos: 1311710
Relay_Log_Space: 287188331
Until_Condition: None
Until_Log_File:
Until_Log_Pos: 0
Master_SSL_Allowed: No
Master_SSL_CA_File:
Master_SSL_CA_Path:
Master_SSL_Cert:
Master_SSL_Cipher:
Master_SSL_Key:
Seconds_Behind_Master: NULL
Master_SSL_Verify_Server_Cert: No
Last_IO_Errno: 0
Last_IO_Error:
Last_SQL_Errno: 1062
Last_SQL_Error: Error 'Duplicate entry '1576' for key 'PRIMARY'' on query. Default database: 'nagiosxi'. Query: 'INSERT INTO xi_auditlog (log_time,source,user,type,message,ip_address,details) VALUES ('2018-05-15 16:34:19','Nagios XI','NULL',32,'cmdsubsys: User [nagiosadmin] applied a new configuration to Nagios Core','localhost','')'
Replicate_Ignore_Server_Ids:
Master_Server_Id: 1
1 row in set (0.00 sec)

ERROR: No query specified


The master is not able to overwrite the slave values in nagiosql and nagiosxi.
When I tried with only the default mysql database, it replicated without any errors. When I tried with all the databases on the master server, I ran into a lot of duplicate errors.

On Slave
cat /var/log/messages
May 15 13:12:28 server_ip ndo2db: Error: mysql_query() failed for 'INSERT INTO nagios_servicestatus SET instance_id='1', service_object_id='195', status_update_time=FROM_UNIXTIME(1526404348), output='● npcd\.service - SYSV: Visit the Website at http://sourceforge\.net/projects/pnp4nagios/', long_output=' Loaded: loaded \(/etc/rc\.d/init\.d/npcd; bad; vendor preset: disabled\)\\n Active: active \(running\) since Tue 2018-05-15 13:04:15 EDT; 8min ago\\n Docs: man:systemd-sysv-generator\(8\)\\n Process: 22787 ExecStop=/etc/rc\.d/init\.d/npcd stop \(code=exited, status=0/SUCCESS\)\\n Process: 23104 ExecStart=/etc/rc\.d/init\.d/npcd start \(code=exited, status=0/SUCCESS\)\\n Main PID: 23110 \(npcd\)\\n CGroup: /system\.slice/npcd\.service\\n └─23110 /usr/local/nagios/bin/npcd -d -f /usr/local/nagios/etc/pnp/npcd\.cfg\\n\\nMay 15 13:04:15 server_ip\.ba\.ssa\.gov systemd\[1\]: Starting SYSV: Visit the Website at http://sourceforge\.net/projects/pnp4nagios/\.\.\.\\nMay 15 13:04:15 server_ip\.ba\.ssa\.gov npcd\[23104\]: NPCD started\.\\nMay 15 13:04:15 server_ip\.ba\.ssa\.gov systemd\[1\]: Failed to read PID from file /usr/local/nagiosxi/var/subsys/npcd\.pid: Invalid argument\\nMay 15 13:04:15 server_ip\.ba\.ssa\.gov systemd\[1\]: Started SYSV: Visit the Website at http://sourceforge\.net/projects/pnp4nagios/\.', perfdata='', current_state='0', has_been_checked='1', should_be_scheduled='1', current_check_attempt='1', max_check_attempts='4', last_check=FROM_UNIXTIME(1526404348), next_check=FROM_UNIXTIME(1526404648), check_type='0', last_state_change=FROM_UNIXTIME(1525707890), last_hard_state_change=FROM_UNIXTIME(1525707890), last_hard_state='0', last_time_ok=FROM_UNIXTIME(1526404348), last_time_warning=FROM_UNIXTIME(0), last_time_unknown=FROM_UNIXTIME(0), last_time_critical=FROM_UNIXTIME(0), state_type='1', last_notification=FROM_UNIXTIME(0), next_notification=FROM_UNIXTIME(0), no_more_notifications='0', notifications_enabled='1', problem_has_been_acknowledged='0', 
acknowledgement_type='0', current_notification_number='0', passive_checks_enabled='1', active_checks_enabled='1', event_handler_enabled='1', flap_detection_enabled='1', is_flapping='0', percent_state_change='0.000000', latency='0.000000', execution_time='0.024730', scheduled_downtime_depth='0', failure_prediction_enabled='0', process_performance_data='1', obsess_over_service='1', modified_service_attributes='0', event_handler='', check_command='check_xi_service_status!npcd!!!!!!!', normal_check_interval='5.000000', retry_check_interval='1.000000', check_timeperiod_object_id='125' ON DUPLICATE KEY UPDATE instance_id='1', service_object_id='195', status_update_time=FROM_UNIXTIME(1526404348), output='● npcd\.service - SYSV: Visit the Website at http://sourceforge\.net/projects/pnp4nagios/', long_output=' Loaded: loaded \(/etc/rc\.d/init\.d/npcd; bad; vendor preset: disabled\)\\n Active: active \(running\) since Tue 2018-05-15 13:04:15 EDT; 8min ago\\n Docs: man:systemd-sysv-generator\(8\)\\n Process: 22787 ExecStop=/etc/rc\.d/init\.d/npcd stop \(code=exited, status=0/SUCCESS\)\\n Process: 23104 ExecStart=/etc/rc\.d/init\.d/npcd start \(code=exited, status=0/SUCCESS\)\\n Main PID: 23110 \(npcd\)\\n CGroup: /system\.slice/npcd\.service\\n └─23110 /usr/local/nagios/bin/npcd -d -f /usr/local/nagios/etc/pnp/npcd\.cfg\\n\\nMay 15 13:04:15 server_ip\.ba\.ssa\.gov systemd\[1\]: Starting SYSV: Visit the Website at http://sourceforge\.net/projects/pnp4nagios/\.\.\.\\nMay 15 13:04:15 server_ip\.ba\.ssa\.gov npcd\[23104\]: NPCD started\.\\nMay 15 13:04:15 server_ip\.ba\.ssa\.gov systemd\[1\]: Failed to read PID from file /usr/local/nagiosxi/var/subsys/npcd\.pid: Invalid argument\\nMay 15 13:04:15 server_ip\.ba\.ssa\.gov systemd\[1\]: Started SYSV: Visit the Website at http://sourceforge\.net/projects/pnp4nagios/\.', perfdata='', current_state='0', has_been_checked='1', should_be_scheduled='1', current_check_attempt='1', max_check_attempts='4', 
last_check=FROM_UNIXTIME(1526404348), next_check=FROM_UNIXTIME(1526404648), check_type='0', last_state_change=FROM_UNIXTIME(1525707890), last_hard_state_change=FROM_UNIXTIME(1525707890), last_hard_state='0', last_time_ok=FROM_UNIXTIME(1526404348), last_time_warning=FROM_UNIXTIME(0), last_time_unknown=FROM_UNIXTIME(0), last_time_critical=FROM_UNIXTIME(0), state_type='1', last_notification=FROM_UNIXTIME(0), next_notification=FROM_UNIXTIME(0), no_more_notifications='0', notifications_enabled='1', problem_has_been_acknowledged='0', acknowledgement_type='0', current_notification_number='0', passive_checks_enabled='1', active_checks_enabled='1', event_handler_enabled='1', flap_detection_enabled='1', is_flapping='0', percent_state_change='0.000000', latency='0.000000', execution_time='0.024730', scheduled_downtime_depth='0', failure_prediction_enabled='0', process_performance_data='1', obsess_over_service='1', modified_service_attributes='0', event_handler='', check_command='check_xi_service_status!npcd!!!!!!!', normal_check_interval='5.000000', retry_check_interval='1.000000', check_timeperiod_object_id='125''
May 15 13:12:28 server_ip ndo2db: mysql_error: 'MySQL server has gone away'
May 15 13:12:28 server_ip ndo2db: Error: Connection to MySQL database has been lost!
May 15 13:12:28 server_ip ndo2db: Successfully disconnected from MySQL database
May 15 13:12:28 server_ip ndo2db: Successfully connected to MySQL database



On Master:
Please see the attached document for errors in master.


Thanks,
Johnsmit
Posts: 95
Joined: Thu Apr 19, 2018 2:03 pm

Re: Mysql replication Error

Post by Johnsmit »

cdienger wrote: @johnsmit try running the repair script on the master first and then replicate. https://assets.nagios.com/downloads/nag ... tabase.pdf covers repairing the db.
Hi,

I tried the repair database script, and it repaired the databases without any errors. I then tried replication again and ran into a lot of duplicate errors on both the master and slave machines.

Can you point me to any source or document that would help?

Thanks,
mcapra
Posts: 3739
Joined: Thu May 05, 2016 3:54 pm

Re: Mysql replication Error

Post by mcapra »

This is well beyond the realm of Nagios XI related problems and has only drifted farther towards the specifics of MySQL replication.

In this particular master/slave setup, is there a reason you're using STATEMENT-based replication? I don't see a particular benefit of the added overhead, and you appear to have consensus issues due to PK restrictions on both databases. I would suggest using a different replication strategy.
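
To illustrate the primary-key conflict, here is a toy Python model. It is not MySQL itself, and every name in it is hypothetical. It assumes that the replicated INSERT arrives with the id that was generated on the master, and that the slave's own Nagios XI has already written a row locally.

```python
class DuplicateKeyError(Exception):
    """Stand-in for MySQL error 1062."""


class Table:
    """Toy table with an auto-increment primary key (hypothetical model)."""

    def __init__(self):
        self.rows = {}
        self.next_id = 1

    def insert(self, entry, row_id=None):
        # A local write lets the table pick the next id; a replicated
        # write arrives with the id that the master already generated.
        if row_id is None:
            row_id = self.next_id
        if row_id in self.rows:
            raise DuplicateKeyError(
                "Duplicate entry '%d' for key 'PRIMARY'" % row_id
            )
        self.rows[row_id] = entry
        self.next_id = max(self.next_id, row_id + 1)
        return row_id


master = Table()
slave = Table()

# The slave's own Nagios XI writes a log row before replication catches up.
slave.insert("local entry written on the slave")

# The master writes its own row and is given the same id.
master_id = master.insert("Service modified: remote_server")

# Replaying the master's INSERT on the slave now collides.
try:
    slave.insert("Service modified: remote_server", row_id=master_id)
    replay_error = None
except DuplicateKeyError as exc:
    replay_error = str(exc)

print(replay_error)
```

The replayed insert fails with the same kind of message as the Last_SQL_Error posted earlier, because the id was already taken by a local write. Changing the replication format alone may not remove this conflict if something on the slave keeps writing to the replicated tables.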

Are you certain you correctly emptied the slave prior to running the database repair script on the master? Remember, there are internal processes within Nagios XI that will continuously write to its database, and those need to be disabled to make sure that the slave stays clean until replication is complete.

Additionally, when using STATEMENT-based replication, I'm fairly certain the subsystems within Nagios XI are going to cause consensus issues within your setup. You should at a minimum have these subsystems disabled on the current "slave" node. Here's the full architectural overview of Nagios XI, which should be useful in informing the replication strategy you choose:
https://support.nagios.com/kb/category.php?id=47

And as a reminder, this is absolutely not documented anywhere at all with simple "step by step" instructions. All of this is based on vague knowledge I have of how both Nagios XI and MySQL work.
Former Nagios employee
https://www.mcapra.com/
cdienger
Support Tech
Posts: 5045
Joined: Tue Feb 07, 2017 11:26 am

Re: Mysql replication Error

Post by cdienger »

Thanks for the assist, @mcapra!