Estimating the Redis master-slave replication parameter repl-backlog-size

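I. Background

Normal business traffic is not affected, but Redis master-slave synchronization keeps hitting timeouts and partial resynchronization fails. We need to estimate a suitable repl-backlog-size so that replication stops failing.

II. Steps

1. Collect the data

Using redis-cli and info replication, record master_repl_offset once a minute so that consecutive samples can be compared. The output is reshaped into pipe-delimited rows (the original pipeline used sed and awk) to make the later analysis easier.

The script is as follows: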

#!/bin/bash
# @date: 2020-07-02
# @author: ninesun
# @parm : null
# @desc: save master_repl_offset by mins

out=/home/scripts/redis/redisParm.csv

echo "start!"
for ((i = 6379; i < 6387; i++)); do
    # Row format: port|YYYY-mm-dd HH:MM:SS|master_repl_offset|<offset>
    # INFO lines end in CRLF, so strip the carriage return before parsing.
    redis-cli -p "$i" info replication \
        | tr -d '\r' \
        | grep '^master_repl_offset:' \
        | awk -F: -v port="$i" -v ts="$(date '+%Y-%m-%d %H:%M:%S')" \
            '{ print port "|" ts "|" $1 "|" $2 }' >> "$out"
    echo "port $i save successful!"
done
echo "end!"

2. Result

3. Analyze the data and estimate the size

Load the formatted data into the database using Greenplum's gpfdist.

3.1 Create the tables

create table sor.redisparam_info
(
    port          character varying(50),
    evt_timestamp timestamp(0) without time zone,
    param         character varying(200),
    param_value   character varying(200),
    CONSTRAINT redisparam_info_pkey PRIMARY KEY (evt_timestamp, port)
)
DISTRIBUTED by (evt_timestamp)
-- END is exclusive by default, so each partition ends on the first day of the
-- next month; ending on the 30th/31st would leave the last day uncovered.
partition by range (evt_timestamp)
(
    partition p202004 start ('2020-04-01'::date) end ('2020-05-01'::date),
    partition p202005 start ('2020-05-01'::date) end ('2020-06-01'::date),
    partition p202006 start ('2020-06-01'::date) end ('2020-07-01'::date),
    partition p202007 start ('2020-07-01'::date) end ('2020-08-01'::date),
    partition p202008 start ('2020-08-01'::date) end ('2020-09-01'::date),
    partition p202009 start ('2020-09-01'::date) end ('2020-10-01'::date),
    partition p202010 start ('2020-10-01'::date) end ('2020-11-01'::date),
    partition p202011 start ('2020-11-01'::date) end ('2020-12-01'::date),
    partition p202012 start ('2020-12-01'::date) end ('2021-01-01'::date)
);

drop external table if exists ext_redisparam_info;

CREATE EXTERNAL TABLE ext_redisparam_info (like sor.redisparam_info)
LOCATION ('gpfdist://******:8100/redisParm.csv')
FORMAT 'text' (delimiter E'|' null E'\\N' escape E'\\')
SEGMENT REJECT LIMIT 1000 ROWS;

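3.2 Analyze using the table data

Once gpfdist is running, the analysis can be run directly against the external table if the data does not need to be stored in a heap table.

The analysis SQL is below. It sorts the samples by time and computes the difference between each sample and the one before it, using the lead() window function.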

select *,
       round(t.diff / 1024.0, 2)          as kb,
       round(t.diff / 1024.0 / 1024.0, 2) as mb
from (
    select *,
           param_value::bigint
             - lead(param_value::bigint) over (order by evt_timestamp desc) as diff
    from sor.redisparam_info
    where port = '6384'
) t;
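
As a quick cross-check that does not need the database, the same adjacent-sample difference can be sketched in shell with awk, run directly against the pipe-delimited file. The function name offset_deltas is hypothetical, and the sketch assumes each port's rows appear in chronological order, which should hold when the collection script only appends to the file.

```shell
#!/bin/bash
# Print "timestamp|delta_bytes" for one port.
# Input rows are expected to look like: port|timestamp|param|value
# Assumption: rows for a given port are already in chronological order.
offset_deltas() {
    local port="$1" file="$2"
    awk -F'|' -v port="$port" '
        $1 == port {
            # From the second sample on, emit current offset minus previous offset.
            if (seen) printf "%s|%.0f\n", $2, $4 - prev
            prev = $4
            seen = 1
        }
    ' "$file"
}

# Example call (path taken from the collection script):
#   offset_deltas 6384 /home/scripts/redis/redisParm.csv
```

Each output line is the number of bytes the master wrote to its replication stream during that interval, which corresponds to the diff column in the SQL.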

Result

3.3 Final analysis result

Take the average across the 8 nodes to arrive at a final estimate.
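
One way to turn the per-minute differences into a backlog size is to convert their average into bytes per second and multiply by the longest disconnect the slave should be able to survive. The sketch below does that. The function name estimate_backlog and the 120-second window used in the example call are assumptions for illustration, not values taken from the measurements above.

```shell
#!/bin/bash
# Read per-interval deltas (bytes, one per line) on stdin and print a
# suggested backlog size in bytes.
#   $1 = sampling interval in seconds (60 for the per-minute samples here)
#   $2 = disconnect window to survive, in seconds (an assumed, tunable value)
estimate_backlog() {
    local interval="$1" window="$2"
    awk -v interval="$interval" -v window="$window" '
        # Skip blank lines and negative deltas (e.g. an offset reset after a restart).
        NF && $1 >= 0 { sum += $1; n++ }
        END {
            if (n == 0) { print 0; exit }
            rate = sum / n / interval      # average bytes per second
            printf "%.0f\n", rate * window
        }
    '
}

# Example call: deltas of 600 and 300 bytes per minute, 120-second window
#   printf '600\n300\n' | estimate_backlog 60 120
```

Sizing from the average may be too small during write bursts, so it may be worth repeating the calculation with the largest observed difference and comparing the two numbers.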

III. About repl-backlog-size

The official explanation of this parameter:

# Set the replication backlog size. The backlog is a buffer that accumulates
# slave data when slaves are disconnected for some time, so that when a slave
# wants to reconnect again, often a full resync is not needed, but a partial
# resync is enough, just passing the portion of data the slave missed while
# disconnected.
#
# The bigger the replication backlog, the longer the time the slave can be
# disconnected and later be able to perform a partial resynchronization.
#
# The backlog is only allocated once there is at least a slave connected.
#
# repl-backlog-size 1mb
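
To try an estimate on a running instance, the byte count can be rounded up to whole megabytes and passed to CONFIG SET. The following is a minimal sketch: the helper names to_redis_mb and backlog_command are hypothetical, and the second helper only prints the command so it can be reviewed before anything is changed.

```shell
#!/bin/bash
# Round a byte count up to whole megabytes and format it the way redis.conf
# writes sizes, where 1mb = 1024*1024 bytes (for example "3mb").
to_redis_mb() {
    local bytes="$1"
    echo "$(( (bytes + 1048575) / 1048576 ))mb"
}

# Print (do not run) the command that would apply the size on one port.
backlog_command() {
    local port="$1" bytes="$2"
    echo "redis-cli -p $port config set repl-backlog-size $(to_redis_mb "$bytes")"
}

# Example call:
#   backlog_command 6379 5000000
```

A value applied with CONFIG SET lasts only until the instance restarts; to keep it, the same repl-backlog-size line also has to be set in redis.conf.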

My question

https://github.com/redis-io/redis/issues/1400

 

