How does Redis master-slave replication work? Do you know how to keep high performance while synchronizing data?
- Reference: https://redis.io/topics/replication. Note that as of Redis 5, the term slave and the related configuration items have been officially renamed to replica. Both words refer to the same kind of node.
Basic process of master-slave replication
# Master-Replica replication. Use replicaof to make a Redis instance a copy of
# another Redis server. A few things to understand ASAP about Redis replication.
#
#   +------------------+      +---------------+
#   |      Master      | ---> |    Replica    |
#   | (receive writes) |      |  (exact copy) |
#   +------------------+      +---------------+
#
# 1) Redis replication is asynchronous, but you can configure a master to
#    stop accepting writes if it appears to be not connected with at least
#    a given number of replicas.
# 2) Redis replicas are able to perform a partial resynchronization with the
#    master if the replication link is lost for a relatively small amount of
#    time. You may want to configure the replication backlog size (see the next
#    sections of this file) with a sensible value depending on your needs.
# 3) Replication is automatic and does not need user intervention. After a
#    network partition replicas automatically try to reconnect to masters
#    and resynchronize with them.
#
# replicaof <masterip> <masterport>
Replication process between master and replica
- While the connection between master and replica is stable, the master continuously streams its writes to the replica (incremental synchronization). The replica applies the data as it arrives and reports the offset it has processed to the master with REPLCONF ACK once per second.
- If the replica is disconnected from the master and then reconnects, it sends a PSYNC command to the master. If the conditions are met (the replica refers to a replication history the master knows, and the backlog still holds the data the replica missed), a partial resync is triggered. Otherwise the master triggers a full resync to the replica.
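The reconnect decision described above can be sketched in a few lines of Python. This is a simplified model, not the server's actual code: the `MasterState` class, the `psync_decision` function and the `replid-A` / `replid-B` values are hypothetical names chosen for illustration, and the sketch ignores the secondary replication ID discussed later.

```python
from dataclasses import dataclass


@dataclass
class MasterState:
    replid: str         # replication ID of the master's current history
    backlog_first: int  # offset of the oldest byte still in the backlog
    backlog_len: int    # number of bytes currently held in the backlog


def psync_decision(master: MasterState, replica_replid: str, psync_offset: int) -> str:
    """Return "partial" or "full" for a reconnecting replica (simplified)."""
    if replica_replid != master.replid:
        return "full"  # unknown history: the replica must start over
    backlog_end = master.backlog_first + master.backlog_len
    if psync_offset < master.backlog_first or psync_offset > backlog_end:
        return "full"  # the requested offset is no longer (or not yet) in the backlog
    return "partial"


# Hypothetical master whose backlog holds offsets 1..437.
master = MasterState(replid="replid-A", backlog_first=1, backlog_len=437)
print(psync_decision(master, "replid-A", 438))   # partial
print(psync_decision(master, "replid-A", 5000))  # full
print(psync_decision(master, "replid-B", 438))   # full
```

Both conditions have to hold for a partial resync: matching history alone is not enough if the missed data has already been overwritten in the backlog.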
From the process above we can see that a network problem can force a full resync, which seriously delays the replica in catching up with the master. How can this be mitigated? There are two aspects: how master and replica time their heartbeats and timeouts, and how much write history the master accumulates for reconnecting replicas.
Master-slave response time policy
- 1. The master sends a PING to its replicas every repl-ping-replica-period seconds (10 by default), which lets each side notice a dead link.
- 2. The replication timeout between replica (slave) and master, repl-timeout, is 60s by default. It covers three cases:
- a) From the replica's point of view: no RDB data arrives from the master during the bulk transfer of a full synchronization (SYNC).
- b) From the replica's point of view: no data and no PING arrives from the master.
- c) From the master's point of view: no REPLCONF ACK arrives from the replica.
When Redis detects that repl-timeout (60s by default) has elapsed, it closes the connection between master and replica, and the replica then tries to re-establish it. repl-timeout should be set larger than repl-ping-replica-period; otherwise a timeout is detected whenever traffic between master and replica is low.
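As a rough illustration of the timeout rule, the following Python sketch checks whether a link should be considered dead and whether a pair of settings is sensible. The function names `link_timed_out` and `timing_is_sane` are hypothetical; the defaults mirror the values quoted above (60 seconds for repl-timeout, 10 seconds for repl-ping-replica-period).

```python
def link_timed_out(last_interaction: float, now: float, repl_timeout: float = 60.0) -> bool:
    """True if nothing was received from the peer for longer than repl_timeout seconds."""
    return now - last_interaction > repl_timeout


def timing_is_sane(repl_ping_replica_period: float = 10.0, repl_timeout: float = 60.0) -> bool:
    """The timeout must exceed the ping period, or an idle link looks dead."""
    return repl_timeout > repl_ping_replica_period


print(link_timed_out(last_interaction=0.0, now=45.0))  # False
print(link_timed_out(last_interaction=0.0, now=61.0))  # True
print(timing_is_sane(10.0, 60.0))                      # True
print(timing_is_sane(10.0, 5.0))                       # False
```

The same comparison applies in all three cases listed above; only the meaning of "last interaction" changes (last RDB chunk, last data or PING from the master, last REPLCONF ACK from the replica).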
Master-slave space accumulation strategy
After the master receives a write, it puts the command into the replication buffer (the buffer used to transmit data to replicas) and also appends it to the replication backlog. When a replica reconnects after a disconnection, it sends PSYNC with its replication ID and the offset it has processed so far. If that position can still be found in the replication backlog, a partial resync is triggered; otherwise the master performs a full resync to the replica.
# Set the replication backlog size. The backlog is a buffer that accumulates
# replica data when replicas are disconnected for some time, so that when a replica
# wants to reconnect again, often a full resync is not needed, but a partial
# resync is enough, just passing the portion of data the replica missed while
# disconnected.
#
# The bigger the replication backlog, the longer the time the replica can be
# disconnected and later be able to perform a partial resynchronization.
#
# The backlog is only allocated once there is at least a replica connected.
#
# repl-backlog-size 1mb
Parameters related to the replication backlog:
# Incremental synchronization window
repl-backlog-size 1mb
repl-backlog-ttl 3600
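A common rule of thumb for choosing repl-backlog-size is to multiply the master's write throughput by the longest disconnection you want to survive without a full resync. The Python sketch below does that arithmetic; the function names and the 50 KB/s write rate are assumptions made up for the example, not measurements.

```python
def required_backlog_bytes(write_bytes_per_sec: int, disconnect_seconds: int) -> int:
    """Backlog size needed to cover a disconnection of the given length."""
    return write_bytes_per_sec * disconnect_seconds


def max_disconnect_seconds(backlog_bytes: int, write_bytes_per_sec: int) -> float:
    """How long a replica may stay away before the backlog wraps around."""
    return backlog_bytes / write_bytes_per_sec


ONE_MB = 1024 * 1024          # the default repl-backlog-size of 1mb
WRITE_RATE = 50 * 1024        # assumed write traffic: 50 KB per second

print(max_disconnect_seconds(ONE_MB, WRITE_RATE))  # about 20.5 seconds
print(required_backlog_bytes(WRITE_RATE, 60))      # 3072000 bytes for one minute
```

Under these assumed numbers the default 1mb backlog covers only about 20 seconds of disconnection, which is why a busy master usually needs a larger value.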
Full resync (full synchronization) workflow
A full synchronization proceeds as follows:
- The replica sends PSYNC (assume the conditions for a full synchronization are met).
- The master handles the full synchronization with a child process: it forks a child through BGSAVE, and the child writes a snapshot to dump.rdb. At the same time, the master starts buffering all new write commands received from clients in the replication buffer.
- Once the snapshot is ready, the master transmits the RDB data to the replica over the network.
- The replica saves the RDB data to disk and then loads it into memory (it deletes its old data and blocks while loading the new data). Incremental synchronization follows.
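The steps above can be modelled with plain dictionaries to show why the buffered writes matter. This is a toy simulation under simplifying assumptions: the `full_resync` function is hypothetical, a dict copy stands in for the RDB snapshot, and a list of (key, value) pairs stands in for the replication buffer.

```python
def full_resync(master_data: dict, writes_during_transfer: list) -> dict:
    """Simulate a full resync and return the replica's resulting dataset."""
    snapshot = dict(master_data)              # stands in for the RDB snapshot
    replication_buffer = []
    for key, value in writes_during_transfer:
        master_data[key] = value              # the master keeps accepting writes
        replication_buffer.append((key, value))

    replica_data = {}                         # the replica drops its old data
    replica_data.update(snapshot)             # and loads the snapshot
    for key, value in replication_buffer:     # then applies the buffered writes
        replica_data[key] = value
    return replica_data


master = {"a": "1"}
replica = full_resync(master, [("b", "2"), ("a", "3")])
print(replica == master)  # True
print(replica)            # {'a': '3', 'b': '2'}
```

Without the buffered writes the replica would end up with the snapshot only, that is `{'a': '1'}`, and would already be behind the master when incremental synchronization starts.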
If the master's disk is slow and the network bandwidth is good, diskless mode can be used (note that this is experimental):
# Turn on diskless mode (change no to yes)
repl-diskless-sync yes
repl-diskless-sync-delay 5
By default, a replica keeps serving client requests during a full synchronization or while it is disconnected from the master.
However, the replica blocks client connections during the window in which it loads the data into memory.
Allow writes only with N attached replicas
By default the master replicates asynchronously, which means it confirms a client's write on its own without waiting for the replicas. The master can instead be configured to accept a write only when at least N replicas are connected with a lag of at most M seconds; otherwise it returns an error to the client.
# Not enabled by default
min-replicas-to-write <number of replicas>
min-replicas-max-lag <number of seconds>
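The acceptance rule behind these two settings can be sketched as follows. The `accepts_writes` function is a hypothetical helper written for illustration; it assumes the master knows, for each connected replica, how many seconds ago its last acknowledgement arrived.

```python
def accepts_writes(replica_lags, min_replicas_to_write: int, min_replicas_max_lag: int) -> bool:
    """Decide whether the master would accept a write under the min-replicas rule."""
    if min_replicas_to_write == 0 or min_replicas_max_lag == 0:
        return True  # setting either option to 0 disables the check
    good = sum(1 for lag in replica_lags if lag <= min_replicas_max_lag)
    return good >= min_replicas_to_write


# Three replicas whose last acknowledgement arrived 1, 4 and 30 seconds ago.
lags = [1, 4, 30]
print(accepts_writes(lags, min_replicas_to_write=2, min_replicas_max_lag=10))  # True
print(accepts_writes(lags, min_replicas_to_write=3, min_replicas_max_lag=10))  # False
print(accepts_writes([], min_replicas_to_write=0, min_replicas_max_lag=10))    # True
```

This check limits how many writes can be lost after a failure; it does not turn replication into a synchronous protocol.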
In addition, a client can use the WAIT command, which works much like an ACK mechanism, to make sure that a write has been acknowledged by a specified number of replicas.
127.0.0.1:9001> set a x
OK
127.0.0.1:9001> wait 1 1000
1
The replication ID identifies the history of the dataset that comes from the current master. There are two replication IDs: master_replid and master_replid2.
127.0.0.1:9001> info replication
# Replication
role:master
connected_slaves:1
slave0:ip=127.0.0.1,port=9011,state=online,offset=437,lag=1
master_replid:9ab608f7590f0e5898c4574299187a52ad0db7ec
master_replid2:0000000000000000000000000000000000000000
master_repl_offset:437
second_repl_offset:-1
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:437
When the master goes down and one of the replicas is promoted to master, the promoted node starts a new history and generates a new replication ID, master_replid. At the same time, the old master_replid is stored as master_replid2.
# Replication
role:master
connected_slaves:2
slave0:ip=127.0.0.1,port=9021,state=online,offset=34874,lag=0
slave1:ip=127.0.0.1,port=9001,state=online,offset=34741,lag=0
master_replid:dfa343264a79179c1061f8fb81d49077db8e4e5f
master_replid2:9ab608f7590f0e5898c4574299187a52ad0db7ec
master_repl_offset:34874
second_repl_offset:6703
repl_backlog_active:1
repl_backlog_size:1048576
repl_backlog_first_byte_offset:1
repl_backlog_histlen:34874
This way, the other replicas that connect to the new master do not need another full synchronization. They can keep the data they already have and continue synchronizing along the new history.
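To make the role of the two IDs concrete, the Python sketch below parses a shortened copy of the second INFO output shown above and checks whether a reconnecting replica shares history with the new master. The helper names `parse_info_replication` and `shares_history` are hypothetical, and the check is simplified: it only looks at the IDs and second_repl_offset, and leaves out the backlog window.

```python
INFO_TEXT = """
# Replication
role:master
master_replid:dfa343264a79179c1061f8fb81d49077db8e4e5f
master_replid2:9ab608f7590f0e5898c4574299187a52ad0db7ec
master_repl_offset:34874
second_repl_offset:6703
"""


def parse_info_replication(text: str) -> dict:
    """Turn "key:value" lines into a dict, skipping blank and "#" lines."""
    info = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        key, _, value = line.partition(":")
        info[key] = value
    return info


def shares_history(info: dict, replica_replid: str, psync_offset: int) -> bool:
    """True if the replica's ID and offset belong to a history this master knows."""
    if replica_replid == info["master_replid"]:
        return True
    # The previous ID is only valid up to the offset where the histories diverged.
    return (replica_replid == info["master_replid2"]
            and psync_offset <= int(info["second_repl_offset"]))


info = parse_info_replication(INFO_TEXT)
old_id = "9ab608f7590f0e5898c4574299187a52ad0db7ec"
print(shares_history(info, old_id, 6703))  # True
print(shares_history(info, old_id, 6704))  # False
```

A replica that followed the old master up to offset 6703 or earlier is recognised through master_replid2, while one that claims data beyond the switch point is not.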
How does replica handle expired keys?
- A replica does not expire keys on its own. It deletes a key only when the master, after expiring the key (for example when the key is accessed) or evicting it under a memory policy such as LRU, synthesizes a DEL command and sends it to the replica.
- This leaves a time gap during which the key is expired but still stored on the replica. To cover it, the replica uses its own logical clock: when a client tries to read a key that has already expired, the replica reports that it does not exist.
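The behaviour described in the two points above can be modelled with a small class. This is an illustrative sketch only: `ReplicaStore` and its methods are hypothetical names, time is passed in explicitly as a plain number, and expiry is simplified to a single timestamp per key.

```python
class ReplicaStore:
    """Toy replica: hides expired keys on read, deletes only on DEL from the master."""

    def __init__(self):
        self._data = {}  # key -> (value, expire_at or None)

    def apply_set(self, key, value, expire_at=None):
        self._data[key] = (value, expire_at)

    def apply_del(self, key):
        # Called when the master's synthesized DEL arrives.
        self._data.pop(key, None)

    def get(self, key, now):
        entry = self._data.get(key)
        if entry is None:
            return None
        value, expire_at = entry
        if expire_at is not None and now > expire_at:
            return None  # logically expired: hidden from readers, not deleted
        return value

    def holds(self, key):
        return key in self._data


store = ReplicaStore()
store.apply_set("session", "x", expire_at=100)
print(store.get("session", now=50))   # x
print(store.get("session", now=101))  # None
print(store.holds("session"))         # True, until the master's DEL arrives
store.apply_del("session")
print(store.holds("session"))         # False
```

Reads stay consistent with the master's view of expiry, while the memory is reclaimed only once the DEL has been replicated.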