MHA high availability configuration and failover

Problems in traditional MySQL master-slave architecture

Single point failure

  • If the master server goes down, data can no longer be written, and the service stays unavailable until a new master is promoted manually

MHA

MHA overview

  • MHA (Master High Availability) is a mature software suite for automating failover and slave-to-master promotion in MySQL high availability environments

Composition of MHA

  • MHA Manager (management node)
  • MHA Node (data node, deployed on every MySQL server)

MHA characteristics

  • During automatic failover, MHA tries to save the binary logs from the downed master server, so that data loss is minimized
  • Combined with the semi-synchronous replication introduced in MySQL 5.5, the risk of data loss can be greatly reduced

MHA working process

  • Save the binary log events from the crashed MySQL master server
  • Identify the slave server with the latest updates
  • Apply the differential relay logs to the other slave servers
  • Apply the binary log events saved from the master server
  • Promote the designated candidate slave to the new master server
  • Point the remaining slave servers at the new master and resume replication
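The "identify the slave server with the latest updates" step can be illustrated with a small sketch (a hypothetical helper, not MHA's actual code): MHA compares each slave's replicated binlog coordinates and picks the most advanced one. Binary log file names sort lexicographically (master-bin.000001 < master-bin.000002), so (file, position) tuples compare correctly.

```python
# Simplified sketch of choosing the most up-to-date slave; real MHA compares
# Master_Log_File / Read_Master_Log_Pos taken from SHOW SLAVE STATUS.
def pick_latest_slave(slaves):
    """slaves: list of dicts with 'host', 'log_file', 'log_pos'.
    Returns the host whose replication position is most advanced."""
    # (file, pos) tuples compare correctly because binlog names sort
    # lexicographically in sequence order.
    return max(slaves, key=lambda s: (s["log_file"], s["log_pos"]))["host"]

slaves = [
    {"host": "slave1", "log_file": "master-bin.000002", "log_pos": 12215},
    {"host": "slave2", "log_file": "master-bin.000002", "log_pos": 11800},
]
print(pick_latest_slave(slaves))  # slave1
```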

Case implementation

1. Experimental environment

Host name                    IP address
MHA-manager                  192.168.150.181
master (master server)       192.168.150.240
slave1 (slave server)        192.168.150.158
slave2 (slave server)        192.168.150.244
client (test)                192.168.150.243

2. Build three MySQL servers (MySQL 5.6 is used in this experiment)

  • After the build, create two symbolic links on each server so the mysql and mysqlbinlog commands are on the PATH (repeat on every node)
[root@manager ~]# ln -s /usr/local/mysql/bin/mysql /usr/sbin/
[root@manager ~]# ln -s /usr/local/mysql/bin/mysqlbinlog /usr/sbin/

3. Build mysql master-slave replication environment
(1) Modify the master server's main configuration file

[root@master ~]# vim /etc/my.cnf
#Add the following directly
server-id = 1
log_bin = master-bin
log-slave-updates = true

#Restart service
[root@master ~]# service mysqld restart
Shutting down MySQL.. SUCCESS! 
Starting MySQL. SUCCESS!

(2) Modify the main configuration file of the slave servers

  • Change slave1
[root@slave1 ~]# vim /etc/my.cnf
#Add the following directly
server-id = 2
log_bin = master-bin
relay-log = relay-log-bin
relay-log-index = slave-relay-bin.index

#Restart service
[root@slave1 ~]# service mysqld restart
Shutting down MySQL.. SUCCESS! 
Starting MySQL. SUCCESS!
  • Change slave2
[root@slave2 ~]# vim /etc/my.cnf
#Add the following directly
server-id = 3
log_bin = master-bin
relay-log = relay-log-bin
relay-log-index = slave-relay-bin.index

#Restart service
[root@slave2 ~]# service mysqld restart
Shutting down MySQL.. SUCCESS! 
Starting MySQL. SUCCESS!

(3) Create the master-slave synchronization user myslave (on the master; the grant replicates to the slaves)

mysql> grant replication slave on *.* to 'myslave'@'192.168.150.%' identified by '123';
Query OK, 0 rows affected (0.00 sec)

(4) Additional adjustments

  • Authorize the mha monitoring user to operate on the databases of all database servers
mysql> grant all privileges on *.* to 'mha'@'192.168.150.%' identified by 'manager';
Query OK, 0 rows affected (0.00 sec)

mysql> grant all privileges on *.* to 'mha'@'master' identified by 'manager';
Query OK, 0 rows affected (0.00 sec)

mysql> grant all privileges on *.* to 'mha'@'slave1' identified by 'manager';
Query OK, 0 rows affected (0.00 sec)

mysql> grant all privileges on *.* to 'mha'@'slave2' identified by 'manager';
Query OK, 0 rows affected (0.00 sec)
#Reload the grant tables
mysql> flush privileges;
Query OK, 0 rows affected (0.00 sec)

(5) View the binary log file and synchronization position on the MySQL master server (run show master status on the master and note the File and Position values; they are used in the next step)
(6) Configure synchronization on the slave servers

mysql> change master to master_host='192.168.150.240',master_user='myslave',master_password='123',master_log_file='master-bin.000002',master_log_pos=12215;
Query OK, 0 rows affected, 2 warnings (0.00 sec)

mysql> start slave;
Query OK, 0 rows affected (0.01 sec)

mysql> show slave status\G
···Omit part of the content
             Slave_IO_Running: Yes		//If this shows Connecting, check whether the firewall on the master server has been stopped
            Slave_SQL_Running: Yes
···Omit part of the content

(7) Set the two slave servers to read-only mode (note that read_only does not restrict users with the SUPER privilege, such as the replication account)

mysql> set global read_only=1;
Query OK, 0 rows affected (0.00 sec)

4. Install Node components (all servers, version 0.57)
(1) Install the base environment

[root@master ~]# yum install epel-release --nogpgcheck -y
[root@master ~]# yum install -y perl-DBD-MySQL \
> perl-Config-Tiny \
> perl-Log-Dispatch \
> perl-Parallel-ForkManager \
> perl-ExtUtils-CBuilder \
> perl-ExtUtils-MakeMaker \
> perl-CPAN

(2) Install Node components

[root@master ~]# tar zxvf /abc/mha/mha4mysql-node-0.57.tar.gz
[root@master ~]# cd mha4mysql-node-0.57/
[root@master mha4mysql-node-0.57]# perl Makefile.PL
[root@master mha4mysql-node-0.57]# make
[root@master mha4mysql-node-0.57]# make install

After the Node component is installed, several script tools are generated under /usr/local/bin (these are normally invoked by the MHA Manager scripts and need no manual operation):

  • save_binary_logs: save and copy the master's binary logs
  • apply_diff_relay_logs: identify differential relay log events and apply them to the other slaves
  • filter_mysqlbinlog: remove unnecessary ROLLBACK events (MHA no longer uses this tool)
  • purge_relay_logs: purge relay logs (without blocking the SQL thread)

5. Install the manager component (on the manager node, version 0.57)

[root@manager ~]# tar zxvf /abc/mha/mha4mysql-manager-0.57.tar.gz
[root@manager ~]# cd mha4mysql-manager-0.57/
[root@manager mha4mysql-manager-0.57]# perl Makefile.PL
[root@manager mha4mysql-manager-0.57]# make
[root@manager mha4mysql-manager-0.57]# make install

After the Manager component is installed, several script tools are generated under /usr/local/bin:

  • masterha_check_ssh: check MHA's SSH configuration
  • masterha_check_repl: check the MySQL replication status
  • masterha_manager: start the manager
  • masterha_check_status: check the current MHA running status
  • masterha_master_monitor: check whether the master is down
  • masterha_master_switch: control failover (automatic or manual)
  • masterha_conf_host: add or remove configured server information
  • masterha_stop: stop the manager

6. Configure passwordless SSH authentication

  • Tools: ssh-keygen, ssh-copy-id

(1) Configure passwordless authentication from the manager to all data nodes

[root@manager ~]# ssh-keygen -t rsa
#Press Enter at every prompt
[root@manager ~]# ssh-copy-id 192.168.150.240
[root@manager ~]# ssh-copy-id 192.168.150.158
[root@manager ~]# ssh-copy-id 192.168.150.244

(2) Configure passwordless authentication from the master to the database nodes slave1 and slave2

[root@master ~]# ssh-keygen -t rsa
#Press Enter at every prompt
[root@master ~]# ssh-copy-id 192.168.150.158
[root@master ~]# ssh-copy-id 192.168.150.244

(3) Configure passwordless authentication from slave1 to the database nodes master and slave2

[root@slave1 ~]# ssh-keygen -t rsa
#Press Enter at every prompt
[root@slave1 ~]# ssh-copy-id 192.168.150.240
[root@slave1 ~]# ssh-copy-id 192.168.150.244

(4) Configure passwordless authentication from slave2 to the database nodes master and slave1

[root@slave2 ~]# ssh-keygen -t rsa
#Press Enter at every prompt
[root@slave2 ~]# ssh-copy-id 192.168.150.240
[root@slave2 ~]# ssh-copy-id 192.168.150.158
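Once the mesh is built, every host should reach the others without a password prompt. The masterha_check_ssh tool (run later) verifies this automatically; as a hypothetical sketch, the kind of command it issues per host can be built like this (run the commands with subprocess against live hosts):

```python
# Hypothetical helper: build non-interactive ssh probe commands for each node.
# BatchMode=yes makes ssh fail immediately instead of prompting for a password,
# which is exactly what a passwordless-auth check needs.
def ssh_check_cmds(hosts):
    return [["ssh", "-o", "BatchMode=yes", h, "true"] for h in hosts]

for cmd in ssh_check_cmds(["192.168.150.240", "192.168.150.158", "192.168.150.244"]):
    print(" ".join(cmd))
```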

7. Configure MHA (on the manager node)
(1) Copy the related scripts to the /usr/local/bin directory

[root@manager ~]# cp -ra /root/mha4mysql-manager-0.57/samples/scripts /usr/local/bin

Four executable scripts are available after copying:

  • master_ip_failover: VIP management script used during automatic failover
  • master_ip_online_change: VIP management during an online (manual) switch
  • power_manager: script to shut down the failed host after failover
  • send_report: script to send an alert after failover

(2) Copy the VIP management script used during automatic failover, master_ip_failover, to /usr/local/bin; MHA will use it to manage the VIP

[root@manager ~]# cp /usr/local/bin/scripts/master_ip_failover /usr/local/bin

(3) Modify the master_ip_failover script (delete the original content and rewrite it)

[root@manager ~]# vim /usr/local/bin/master_ip_failover
#!/usr/bin/env perl
use strict;
use warnings FATAL => 'all';

use Getopt::Long;

my (
    $command,          $ssh_user,        $orig_master_host, $orig_master_ip,
    $orig_master_port, $new_master_host, $new_master_ip,    $new_master_port
);
#############################Add content section#########################################
#Set the virtual (floating) IP
my $vip = '192.168.150.200';
my $brdc = '192.168.150.255';
my $ifdev = 'ens33';
my $key = '1';
my $ssh_start_vip = "/sbin/ifconfig ens33:$key $vip";
my $ssh_stop_vip = "/sbin/ifconfig ens33:$key down";
my $exit_code = 0;
#my $ssh_start_vip = "/usr/sbin/ip addr add $vip/24 brd $brdc dev $ifdev label $ifdev:$key;/usr/sbin/arping -q -A -c 1 -I $ifdev $vip;iptables -F;";
#my $ssh_stop_vip = "/usr/sbin/ip addr del $vip/24 dev $ifdev label $ifdev:$key";
##################################################################################
GetOptions(
    'command=s'          => \$command,
    'ssh_user=s'         => \$ssh_user,
    'orig_master_host=s' => \$orig_master_host,
    'orig_master_ip=s'   => \$orig_master_ip,
    'orig_master_port=i' => \$orig_master_port,
    'new_master_host=s'  => \$new_master_host,
    'new_master_ip=s'    => \$new_master_ip,
    'new_master_port=i'  => \$new_master_port,
);

exit &main();

sub main {
    print "\n\nIN SCRIPT TEST====$ssh_stop_vip==$ssh_start_vip===\n\n";
    if ( $command eq "stop" || $command eq "stopssh" ) {
        my $exit_code = 1;
        eval {
            print "Disabling the VIP on old master: $orig_master_host \n";
            &stop_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn "Got Error: $@\n";
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "start" ) {
        my $exit_code = 10;
        eval {
            print "Enabling the VIP - $vip on the new master - $new_master_host \n";
            &start_vip();
            $exit_code = 0;
        };
        if ($@) {
            warn $@;
            exit $exit_code;
        }
        exit $exit_code;
    }
    elsif ( $command eq "status" ) {
        print "Checking the Status of the script.. OK \n";
        exit 0;
    }
    else {
        &usage();
        exit 1;
    }
}

# A simple system call that enables the VIP on the new master
sub start_vip() {
    `ssh $ssh_user\@$new_master_host \" $ssh_start_vip \"`;
}

# A simple system call that disables the VIP on the old master
sub stop_vip() {
    `ssh $ssh_user\@$orig_master_host \" $ssh_stop_vip \"`;
}

sub usage {
    print
      "Usage: master_ip_failover --command=start|stop|stopssh|status --orig_master_host=host --orig_master_ip=ip --orig_master_port=port --new_master_host=host --new_master_ip=ip --new_master_port=port\n";
}

Note: for the first configuration, you need to go to the master and bring up the virtual IP manually

[root@master ~]# /sbin/ifconfig ens33:1 192.168.150.200/24

(4) Create MHA software directory and copy configuration files

[root@manager ~]# mkdir /etc/masterha
[root@manager ~]# cp /root/mha4mysql-manager-0.57/samples/conf/app1.cnf /etc/masterha/
[root@manager ~]# vim /etc/masterha/app1.cnf 
[server default]
#manager log file
manager_log=/var/log/masterha/app1/manager.log     
#manager working directory
manager_workdir=/var/log/masterha/app1
#Location where the master stores its binlogs; this path must match the binlog path configured on the master
master_binlog_dir=/home/mysql
#Set the switch script for automatic failover. That's the script above
master_ip_failover_script=/usr/local/bin/master_ip_failover
#Set the switch script for manual switch
master_ip_online_change_script=/usr/local/bin/master_ip_online_change
#This password is the one used to create the monitoring user
password=manager
ping_interval=1
remote_workdir=/tmp
#Password of the replication user
repl_password=123
#Replication user
repl_user=myslave
#Secondary check script: probe the master from other hosts to confirm it is really down
secondary_check_script=/usr/local/bin/masterha_secondary_check -s 192.168.150.158 -s 192.168.150.244
#Script used to power off the failed host (left empty here)
shutdown_script=""
#Set the login user name of ssh
ssh_user=root
#Set monitoring user
user=mha

[server1]
hostname=192.168.150.240
port=3306

[server2]
#Set as candidate master: if this parameter is set, this slave will be promoted to master after a master-slave switch
candidate_master=1
#By default, MHA will not choose a slave as the new master if it lags behind the master by more than 100MB of relay logs; setting check_repl_delay=0 disables this check
check_repl_delay=0
hostname=192.168.150.158
port=3306

[server3]
hostname=192.168.150.244
port=3306
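app1.cnf uses an INI-style layout, so it can be read by ordinary tooling. As a sketch (hypothetical, with a shortened inline copy of the file instead of reading /etc/masterha/app1.cnf), Python's configparser can pull out the [serverN] host list:

```python
# Sketch: extract the monitored hosts from an MHA-style app1.cnf.
import configparser

cfg = configparser.ConfigParser()
cfg.read_string("""
[server default]
user=mha
password=manager

[server1]
hostname=192.168.150.240
port=3306

[server2]
candidate_master=1
hostname=192.168.150.158
port=3306
""")

# Skip the [server default] section; every other [serverN] section is a node.
hosts = [cfg[s]["hostname"] for s in cfg.sections() if s != "server default"]
print(hosts)  # ['192.168.150.240', '192.168.150.158']
```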

8. Test

  • SSH interactive login free (manager node)
[root@manager ~]# masterha_check_ssh -conf=/etc/masterha/app1.cnf
···Omit part of the content
#If it is normal, it will output successfully
Sun Jan 12 19:19:11 2020 - [info] All SSH connection tests passed successfully.
  • Test mysql master-slave connection (health status)
[root@manager ~]# masterha_check_repl -conf=/etc/masterha/app1.cnf
···Omit part of the content
#Prompt OK, the configuration is normal
MySQL Replication Health is OK.

9. Start MHA and check MHA status

#Start MHA and run it in the background
[root@manager ~]# nohup masterha_manager --conf=/etc/masterha/app1.cnf --remove_dead_master_conf --ignore_last_failover < /dev/null > /var/log/masterha/app1/manager.log 2>&1 &
[1] 13525
#Check the MHA status; the current master is the original master node
[root@manager ~]# masterha_check_status --conf=/etc/masterha/app1.cnf
app1 (pid:13525) is running(0:PING_OK), master:192.168.150.240
  • --remove_dead_master_conf: when a master-slave switch occurs, the old master's entry is removed from the configuration file
  • --ignore_last_failover: by default, MHA refuses to fail over again if less than 8 hours have passed since the previous outage; this limit exists to avoid a ping-pong effect. After each switch, MHA writes an app1.failover.complete file into the log directory, and as long as that file exists the next switch will not be triggered unless the file is deleted first. This parameter tells MHA to ignore the file left by the last failover.
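The 8-hour guard that --ignore_last_failover bypasses can be sketched as a simple time check (illustrative only; real MHA tracks the last failover via the app1.failover.complete file's presence and timestamp):

```python
# Sketch of the anti-ping-pong failover throttle.
def failover_allowed(last_failover_ts, now_ts, ignore_last_failover=False,
                     min_interval=8 * 3600):
    """Return True if a new failover may be triggered."""
    if ignore_last_failover or last_failover_ts is None:
        return True
    # Refuse a second failover within min_interval seconds of the previous one.
    return (now_ts - last_failover_ts) >= min_interval

print(failover_allowed(last_failover_ts=1000, now_ts=1000 + 3600))      # False
print(failover_allowed(last_failover_ts=1000, now_ts=1000 + 9 * 3600))  # True
print(failover_allowed(1000, 2000, ignore_last_failover=True))          # True
```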
10. Validation
(1) Start monitoring on the manager and observe the log

(2) Simulate a failure
  • Log in remotely from the client via the virtual IP
  • Stop the MySQL service on the master database
[root@master ~]# pkill -9 mysqld
  • At this point, the manager detects that the master server is down and switches the standby master to the new master
  • View IP address on slave1
[root@slave1 ~]# ifconfig
ens33: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.150.158  netmask 255.255.255.0  broadcast 192.168.150.255
···
#The virtual IP has moved to the standby master
ens33:1: flags=4163<UP,BROADCAST,RUNNING,MULTICAST>  mtu 1500
        inet 192.168.150.200  netmask 255.255.255.0  broadcast 192.168.150.255
···
  • Log in from the client via the virtual IP again; the login still succeeds

Tags: MySQL ssh Database vim

Posted on Sun, 12 Jan 2020 04:32:18 -0800 by TimUSA