Implementing automatic master-replica failover with Redis Sentinel

Updated: 2023-04-08 12:06:27

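Redis Sentinel is a distributed system: you can run multiple Sentinel processes in one deployment. These processes use gossip protocols to exchange information about whether a master is down, and agreement protocols to decide whether to perform an automatic failover and which replica to promote to the new master.

Although Redis Sentinel ships as a separate executable, redis-sentinel, it is really just a Redis server running in a special mode. You can also start Redis Sentinel by passing the --sentinel option when starting an ordinary Redis server.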

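A Sentinel system manages multiple Redis servers (instances) and performs the following three tasks:

1. Monitoring: Sentinel continuously checks whether your masters and replicas are working properly.

2. Notification: when a monitored Redis server has a problem, Sentinel can notify the administrator or other applications through an API.

3. Automatic failover: when a master stops working properly, Sentinel starts an automatic failover. It promotes one of the failed master's replicas to be the new master and reconfigures the remaining replicas to replicate from it. When clients try to connect to the failed master, the cluster returns the new master's address, so the new master can take the place of the failed one.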
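
The third task means that clients should ask Sentinel for the current master rather than rely on a fixed address. As a minimal sketch, assuming a Sentinel reachable on port 6000 and a master named mymaster (the values used in the configuration later in this article), the helper below turns the two-line reply of SENTINEL get-master-addr-by-name into host:port. The function name format_master_addr is hypothetical and not part of Redis.

```shell
#!/bin/sh
# format_master_addr: hypothetical helper, not part of Redis.
# Reads the reply of "SENTINEL get-master-addr-by-name <name>" from stdin
# (redis-cli prints the host on the first line and the port on the second
# when its output is piped) and prints it as host:port.
format_master_addr() {
    read -r host || return 1
    read -r port || [ -n "$port" ] || return 1
    printf '%s:%s\n' "$host" "$port"
}

# Usage, assuming a Sentinel on port 6000 that monitors "mymaster":
#   redis-cli -p 6000 SENTINEL get-master-addr-by-name mymaster | format_master_addr
```

Running the query again after a failover should return the promoted replica's address, which is how an application can follow the master without being reconfigured.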

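Configuration

When the master goes down, the replica takes over as the new master; when the failed master comes back up, it automatically becomes a replica. In effect the two nodes swap roles, much like MySQL's dual-master setup in which each node can act as master or replica for the other. Redis Sentinel requires the redis-sentinel program and a sentinel.conf configuration file.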

mkdir -p /usr/local/redis
mkdir -p /usr/local/redis/6379
mkdir -p /usr/local/redis/6380
mkdir -p /usr/local/redis/redis_cluster

Master configuration

vim redis_6379.conf

daemonize yes
pidfile /usr/local/redis/6379/redis_6379.pid
port 6379
tcp-backlog 128
timeout 0
tcp-keepalive 0
loglevel notice
logfile ""
databases 16
### save
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
### dbfile
dbfilename dump.rdb
dir "/usr/local/redis/6379"
masterauth "123456"
requirepass "123456"
slave-serve-stale-data yes
slave-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
slave-priority 100
appendonly yes
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes

vim sentinel_1.conf

Sentinel configuration file

port 6000
dir "/usr/local/redis/sentinel"
# Run as a daemon
daemonize yes
protected-mode no
logfile "/usr/local/sentinel/sentinel.log"

Replica configuration

vim redis_6380.conf

daemonize yes
pidfile "/usr/local/redis/6380/redis_6380.pid"
port 6380
tcp-backlog 128
timeout 0
tcp-keepalive 0
loglevel notice
logfile ""
databases 16
save 900 1
save 300 10
save 60 10000
stop-writes-on-bgsave-error yes
rdbcompression yes
rdbchecksum yes
dbfilename "dump.rdb"
dir "/usr/local/redis/6380"
masterauth "123456"
requirepass "123456"
slave-serve-stale-data yes
slave-read-only yes
repl-diskless-sync no
repl-diskless-sync-delay 5
repl-disable-tcp-nodelay no
slave-priority 100
appendonly yes
appendfilename "appendonly.aof"
appendfsync everysec
no-appendfsync-on-rewrite no
auto-aof-rewrite-percentage 100
auto-aof-rewrite-min-size 64mb
aof-load-truncated yes
lua-time-limit 5000
slowlog-log-slower-than 10000
slowlog-max-len 128
latency-monitor-threshold 0
notify-keyspace-events ""
hash-max-ziplist-entries 512
hash-max-ziplist-value 64
list-max-ziplist-entries 512
list-max-ziplist-value 64
set-max-intset-entries 512
zset-max-ziplist-entries 128
zset-max-ziplist-value 64
hll-sparse-max-bytes 3000
activerehashing yes
client-output-buffer-limit normal 0 0 0
client-output-buffer-limit slave 256mb 64mb 60
client-output-buffer-limit pubsub 32mb 8mb 60
hz 10
aof-rewrite-incremental-fsync yes

vim sentinel_2.conf

# Sentinel port
port 6000
# Working directory; make sure it does not clash with the master's path
dir "/usr/local/sentinel"
# Run as a daemon
daemonize yes
protected-mode no
# Log file name
logfile "/usr/local/sentinel/sentinel.log"

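Notes:

1. Applications connect to the Sentinel port and select a specific master by specifying its master name.

2. The Sentinel configuration file only needs the IP and port of the master in each replication group. When a failover happens, Sentinel automatically rewrites the master IP in its configuration file to the new master's IP.

3. A single Sentinel configuration file can monitor several master-replica groups at once.

4. A single Sentinel is enough to monitor a master-replica pair, but if there is only one Sentinel process and it fails, or the network is congested, the Redis cluster cannot fail over (a single point of failure). The <quorum> value, 2 in the sentinel monitor command below, is the number of votes: a failover is triggered only once 2 Sentinels agree that the master is unavailable, and only then is the master truly considered down. (The Sentinels in a cluster also communicate with each other through a gossip protocol.) A sensible setup therefore runs several Sentinel processes at the same time, preferably on different servers.

5. The master name (mymaster here) must be unique across the whole network environment. As long as they can reach each other over the network, Sentinels automatically associate with one another through the master name.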
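
Note 4 can be made concrete with a little arithmetic. In Redis Sentinel the quorum only decides when the master is flagged as down; carrying out the failover additionally requires that a majority of all known Sentinels authorize one of them as leader. The sketch below models that rule. The function names majority and can_failover are hypothetical, and the model deliberately ignores timing and network partitions.

```shell
#!/bin/sh
# majority TOTAL: smallest number of Sentinels that forms a majority of TOTAL.
majority() {
    echo $(( $1 / 2 + 1 ))
}

# can_failover ALIVE TOTAL QUORUM
# Succeeds (exit status 0) when ALIVE reachable Sentinels are enough both to
# reach QUORUM (agree the master is down) and to form a majority of TOTAL
# (authorize one Sentinel to carry out the failover).
can_failover() {
    alive=$1
    total=$2
    quorum=$3
    [ "$alive" -ge "$quorum" ] && [ "$alive" -ge "$(majority "$total")" ]
}

# Three Sentinels, quorum 2, one Sentinel lost: failover is still possible.
can_failover 2 3 2 && echo "3 sentinels, 1 lost: failover possible"
# Two Sentinels, quorum 1, one Sentinel lost: no majority, so no failover.
can_failover 1 2 1 || echo "2 sentinels, 1 lost: failover blocked"
```

This is why an odd number of Sentinels spread over at least three machines is the usual recommendation: a two-Sentinel setup cannot fail over when the machine hosting the master and one Sentinel goes down together.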

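Starting Redis

1. Start both the master and the replica

src/redis-server redis.conf

2. Log in to the instance on port 6380 and set up replication (because requirepass is set in the configuration above, authenticate first, for example with auth 123456)

redis-cli -p 6380
slaveof 192.168.137.40 6379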
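
After issuing slaveof it is worth confirming that replication is actually established. The following is a minimal sketch, assuming the replica from this article (port 6380, password 123456). It extracts a single field from the output of INFO replication; the helper name replication_field is hypothetical.

```shell
#!/bin/sh
# replication_field FIELD: hypothetical helper that reads "INFO replication"
# output from stdin and prints the value of FIELD (for example role or
# master_link_status). INFO lines end in CRLF, so the CR is stripped first.
replication_field() {
    tr -d '\r' | awk -F: -v key="$1" '$1 == key { print $2 }'
}

# Usage, assuming the replica from this article (port 6380, password 123456):
#   redis-cli -p 6380 -a 123456 info replication | replication_field role
#   redis-cli -p 6380 -a 123456 info replication | replication_field master_link_status
```

On a healthy replica the first command should report slave and the second up; a link status of down usually points at a wrong masterauth or an unreachable master.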

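Configuring Sentinel

Start the Sentinels for both the master and the replica. Sentinel can also be started through redis-server, for example "redis-server sentinel.conf --sentinel".

1. Start Sentinel

src/redis-sentinel sentinel.conf

2. Log in to Sentinel (this must be done on both Sentinels) and add the monitoring information

redis-cli -p 6000

sentinel monitor mymaster 192.168.137.40 6379 2
sentinel set mymaster down-after-milliseconds 5000
sentinel set mymaster failover-timeout 15000
sentinel set mymaster auth-pass 123456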

Handling startup errors

Error 1:

WARNING overcommit_memory is set to 0! Background save may fail under low memory condition. To fix this issue add 'vm.overcommit_memory = 1' to /etc/sysctl.conf and then reboot or run the command 'sysctl vm.overcommit_memory=1' for this to take effect.

Two ways to fix it (overcommit_memory):

1. Run echo "vm.overcommit_memory=1" >> /etc/sysctl.conf (or edit the file with vi /etc/sysctl.conf), then reboot the machine. Use >> rather than >, which would overwrite the existing file.

2. Run echo 1 > /proc/sys/vm/overcommit_memory, which takes effect immediately without a reboot.

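About the overcommit_memory parameter:

It sets the memory allocation policy (optional; set it according to the server's actual situation).

/proc/sys/vm/overcommit_memory

Possible values: 0, 1, 2.

0: the kernel checks whether enough memory is available for the requesting process. If so, the allocation is allowed; otherwise it fails and an error is returned to the process.

1: the kernel allows allocating all physical memory, regardless of the current memory state.

2: the kernel refuses allocations that would push committed memory beyond swap plus a configurable percentage of physical RAM (see vm.overcommit_ratio).

Note: when Redis dumps data it forks a child process. In theory the child occupies as much memory as the parent: if the parent uses 8 GB, another 8 GB must be allocatable for the child. If the system cannot afford that, the Redis server often goes down, or I/O load climbs and performance drops. The better allocation policy here is therefore 1 (the kernel allows allocating all physical memory regardless of the current memory state).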

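This brings in overcommit and OOM.

What are overcommit and OOM

On Unix, when a user process requests memory with malloc() and the return value is NULL, the process knows that no memory is currently available and handles it accordingly. Many processes print an error message and exit.

Linux takes a different approach: it answers "yes" to most memory requests so that more and larger programs can run, because memory is usually not used immediately after it is requested. This technique is called overcommit.

When memory actually runs out, the OOM killer (OOM = out-of-memory) steps in. It picks some processes to kill (user-space processes, not kernel threads) in order to free memory.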

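Overcommit policies

Linux has three overcommit policies (Documentation/vm/overcommit-accounting):

0. Heuristic policy. Reasonable overcommits are accepted; unreasonable ones are refused.

1. Every overcommit is accepted.

2. Commits are refused once the memory allocated by the system exceeds swap + N% * physical RAM (N% is determined by vm.overcommit_ratio).

The overcommit policy is set through vm.overcommit_memory.

The overcommit percentage is set through vm.overcommit_ratio.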

# echo 2 > /proc/sys/vm/overcommit_memory

# echo 80 > /proc/sys/vm/overcommit_ratio
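
To see what policy 2 means in numbers, the limit can be worked out by hand. This is a rough sketch of the swap + N% * RAM rule stated above; the helper name commit_limit_kb and the sample sizes are illustrative assumptions, and the kernel's real calculation also accounts for huge pages, which is ignored here.

```shell
#!/bin/sh
# commit_limit_kb SWAP_KB RAM_KB RATIO
# Approximate commit limit under vm.overcommit_memory=2:
#   swap + RATIO% of physical RAM (integer arithmetic, in kB).
# The kernel's real calculation also accounts for huge pages; ignored here.
commit_limit_kb() {
    echo $(( $1 + $2 * $3 / 100 ))
}

# Illustrative numbers (not from the article): 2,000,000 kB swap,
# 8,000,000 kB RAM, ratio 80.
commit_limit_kb 2000000 8000000 80   # prints 8400000
```

On a running system the value the kernel actually enforces can be read from the CommitLimit line of /proc/meminfo, next to Committed_AS, which shows how much is currently committed.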

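Which processes Linux kills when the OOM killer runs

Processes are selected by the oom_badness function (in mm/oom_kill.c), which computes a score (0 to 1000) for each process.

The higher the score, the more likely the process is to be killed.

Each process's score depends on oom_score_adj, which can be set (-1000 is the lowest, 1000 the highest).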
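
Because oom_score_adj is writable, a Redis process can be made a less likely victim by lowering its value. The sketch below only validates a candidate value against the documented range before it is written; valid_oom_score_adj is a hypothetical helper, and the value -500 in the usage comment is an arbitrary example, not a recommendation from this article.

```shell
#!/bin/sh
# valid_oom_score_adj VALUE: succeeds when VALUE is an integer in the range
# -1000..1000 accepted by /proc/<pid>/oom_score_adj.
valid_oom_score_adj() {
    case $1 in
        ''|-|*[!0-9-]*|?*-*) return 1 ;;   # empty, bare "-", non-digits, misplaced "-"
    esac
    [ "$1" -ge -1000 ] && [ "$1" -le 1000 ]
}

# Usage (as root), assuming a single redis-server process:
#   valid_oom_score_adj -500 && echo -500 > "/proc/$(pidof redis-server)/oom_score_adj"
```

The current score and adjustment of a process can be inspected by reading /proc/<pid>/oom_score and /proc/<pid>/oom_score_adj.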

Error 2:

WARNING: The TCP backlog setting of 511 cannot be enforced because /proc/sys/net/core/somaxconn is set to the lower value of 128.

echo 511 > /proc/sys/net/core/somaxconn
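
The warning exists because the kernel caps the backlog passed to listen() at net.core.somaxconn. A tiny sketch of that rule follows; effective_backlog is a hypothetical name used only for illustration.

```shell
#!/bin/sh
# effective_backlog REQUESTED SOMAXCONN
# The kernel silently caps a listen() backlog at net.core.somaxconn, so the
# effective value is the smaller of the two.
effective_backlog() {
    if [ "$1" -le "$2" ]; then echo "$1"; else echo "$2"; fi
}

effective_backlog 511 128   # prints 128, the situation the warning describes
effective_backlog 511 511   # prints 511 once somaxconn has been raised
```

Writing to /proc only lasts until the next reboot. To make the change permanent, add net.core.somaxconn = 511 to /etc/sysctl.conf as well.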

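Error 3:

16433:X 12 Jun 14:52:37.734 * Increased maximum number of open files to 10032 (it was originally set to 1024).

A freshly installed Linux defaults to only 1024 open files; under heavier load you will often see "error: too many open files".

ulimit -a shows all of the current limits.

vim /etc/security/limits.conf

Append the following to the end of the file:

* soft nofile 65535
* hard nofile 65535

Run su, or log out and back in, and then run ulimit -a to see the new values.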
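
The figure 10032 in the log line is not arbitrary: Redis asks for its maxclients setting plus 32 descriptors that it reserves for internal use. The sketch below reproduces that arithmetic, assuming the default maxclients of 10000 (the configuration in this article does not set it). The function names are hypothetical.

```shell
#!/bin/sh
# required_nofile MAXCLIENTS
# Redis asks for maxclients plus 32 descriptors reserved for its own use
# (log files, persistence, replication links and so on).
required_nofile() {
    echo $(( $1 + 32 ))
}

# nofile_sufficient LIMIT MAXCLIENTS: succeeds when LIMIT (the value reported
# by "ulimit -n") covers what Redis will ask for. "unlimited" always passes.
nofile_sufficient() {
    [ "$1" = "unlimited" ] || [ "$1" -ge "$(required_nofile "$2")" ]
}

required_nofile 10000   # prints 10032, the figure in the log line above
nofile_sufficient 1024 10000 || echo "limit 1024 is too low for maxclients 10000"
```

In practice the check would be run as nofile_sufficient "$(ulimit -n)" 10000 in the shell that starts redis-server, since the limit is inherited from that shell.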

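Failover mechanism

1. After the cluster starts, the cluster program by default adds the configuration for connecting to the master to the replica's Redis configuration file

# Generated by CONFIG REWRITE
slaveof 192.168.137.40 6379

2. After the cluster starts, the cluster program by default adds cluster information to the sentinel.conf files of the master and the replica

Master:

port 26379
dir "/usr/local/redis-6379"
# Run as a daemon
daemonize yes
# Log file name
logfile "./sentinel.log"
sentinel monitor mymaster 192.168.137.40 6379 1
sentinel down-after-milliseconds mymaster 5000
sentinel failover-timeout mymaster 18000
sentinel auth-pass mymaster 123456
# Generated by CONFIG REWRITE
sentinel config-epoch mymaster 0
sentinel leader-epoch mymaster 1
sentinel known-slave mymaster 192.168.137.40 6380
sentinel known-sentinel mymaster 192.168.137.40 26380 c77c5f64aaad0137a228875e531c7127ceeb5c3f
sentinel current-epoch 1

Replica:

# Sentinel port
port 26380
# Working directory
dir "/usr/local/redis-6380"
# Run as a daemon
daemonize yes
# Log file name
logfile "./sentinel.log"
# Master monitored by Sentinel; identical on master and replica. After a failover, 6379 is replaced by the current master's port.
sentinel monitor mymaster 192.168.137.40 6379 1
# How long (default 30 seconds) a master or slave must be unreachable before it is marked as S_DOWN.
sentinel down-after-milliseconds mymaster 5000
# If Sentinel cannot complete the failover (the automatic master/slave switch) within this time, the failover is considered failed.
sentinel failover-timeout mymaster 18000
# Password for authenticating with the master and slaves
sentinel auth-pass mymaster 123456
# Section added automatically by Sentinel
# Generated by CONFIG REWRITE
sentinel config-epoch mymaster 0
sentinel leader-epoch mymaster 1
### IP and port of the cluster's current replica; this value changes on failover
sentinel known-slave mymaster 192.168.137.40 6380
### The other Sentinels monitoring this master besides the current one
sentinel known-sentinel mymaster 192.168.137.40 26379 7a88891a6147e202a53601ca16a3d438e9d55c9d
sentinel current-epoch 1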

Simulating a master failure

[root@monitor redis-6380]# ps -ef|grep redis
root       4171      1  0 14:20 ?        00:00:15 /usr/local/redis-6379/src/redis-server *:6379
root       4175      1  0 14:20 ?        00:00:15 /usr/local/redis-6380/src/redis-server *:6380
root       4305      1  0 15:28 ?        00:00:05 /usr/local/redis-6379/src/redis-sentinel *:26379 [sentinel]
root       4306      1  0 15:28 ?        00:00:05 /usr/local/redis-6380/src/redis-sentinel *:26380 [sentinel]
root       4337   4144  0 15:56 pts/1    00:00:00 grep redis
[root@monitor redis-6380]# kill -9 4171
[root@monitor redis-6380]# ps -ef|grep redis
root       4175      1  0 14:20 ?        00:00:15 /usr/local/redis-6380/src/redis-server *:6380
root       4305      1  0 15:28 ?        00:00:05 /usr/local/redis-6379/src/redis-sentinel *:26379 [sentinel]
root       4306      1  0 15:28 ?        00:00:05 /usr/local/redis-6380/src/redis-sentinel *:26380 [sentinel]
root       4339   4144  0 15:56 pts/1    00:00:00 grep redis
[root@monitor redis-6380]#

The Sentinel configuration file now shows that the current master has changed.
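
One way to check this without opening the file by hand is to pull the address out of the sentinel monitor line. The helper below is a sketch under the assumption that the Sentinel configuration is readable at the path passed in; monitored_master is a hypothetical name.

```shell
#!/bin/sh
# monitored_master CONF NAME: hypothetical helper that prints host:port from
# the "sentinel monitor NAME host port quorum" line of a sentinel.conf file.
monitored_master() {
    awk -v name="$2" \
        '$1 == "sentinel" && $2 == "monitor" && $3 == name { print $4 ":" $5 }' "$1"
}

# Usage, assuming the Sentinel configuration lives at ./sentinel.conf:
#   monitored_master ./sentinel.conf mymaster
```

Run before and after the kill -9, the command should report the old master address first and the promoted instance afterwards, since Sentinel rewrites this line during failover.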

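Summary

The Sentinel ports 26379 and 26380 cannot be reached with client software, but programs can connect to them; client software can only connect directly to ports 6379 and 6380. With Sentinel monitoring, the replica is automatically promoted when the master fails, and the old master becomes a replica once it is restarted. Some people configure only a single Sentinel on 26379; that setup cannot guarantee high availability of the Sentinel program itself.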


Published: 2023-04-08 12:06:26

Article link: https://www.wtabcd.cn/fanwen/zuowen/d5bc6d28c2039f2e22f0f5a879beb8b5.html
