How PostgreSQL streaming replication works, and how streaming replication differs from logical replication

Updated: 2023-07-12 19:04:58

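How streaming replication works:

Physical replication is also called streaming replication. The primary sends its WAL to the standby, and the standby replays the WAL after receiving it.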

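How logical replication works:

Logical replication is also based on the WAL. In logical replication the primary is called the source database and the standby is called the target database. The source database decodes the WAL according to predefined decoding rules and turns DML operations into logical change information. For built-in logical replication these are row-level changes rather than the original SQL text; some decoding plugins render them as standard SQL statements. The source database sends these changes to the target database, which applies them on receipt, and that is how the data is kept in sync.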

Differences between streaming replication and logical replication:

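By default, a transaction commit on the primary does not wait for the standby to confirm receipt of the WAL, and this holds for both streaming and logical replication. In either case commits wait only when the receiving side is configured as a synchronous standby through synchronous_standby_names.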

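Streaming replication requires the primary and the standby to run the same major version. Logical replication can synchronize data across major versions and can also be used to synchronize data between heterogeneous databases.

In streaming replication the primary is read-write and the standby is read-only. In logical replication the target database must be read-write.

Streaming replication works at the instance level (the whole PostgreSQL database cluster). Logical replication selectively replicates some tables, so it works at the table level.

Streaming replication carries both the DDL and the DML of the primary. Logical replication carries only DML.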

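Addendum: a brief look at the principles and code of PostgreSQL synchronous streaming replication

Background

How durability in ACID is implemented

The D in ACID stands for durability. It means that once a transaction is committed from the user's point of view, its data is safe: even if the database crashes, the data can be recovered as long as the hardware is intact.

How does PostgreSQL achieve this? The original post illustrated the process with a diagram; the steps are described below.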

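Suppose a transaction performs some operations on the database and produces dirty data. This dirty data first lives in the database's shared buffers.

As the dirty data is produced, the corresponding redo records are produced too. Each redo record has an LSN (think of it as a unique offset in the virtual address space of the redo stream), and this LSN is also recorded in the corresponding dirty page in shared buffers.

walwriter is the process responsible for flushing the WAL buffer to durable storage. It also updates a global variable that records the highest LSN flushed so far. bgwriter is the process responsible for writing dirty pages from shared buffers to durable storage. When it writes a page, besides following the LRU algorithm, it compares the page's LSN with that global variable to make sure the redo records for the dirty page have already reached durable storage. If the corresponding redo has not been persisted yet, a flush of the WAL buffer is triggered first. (This guarantees that the log reaches disk before the dirty data.)

When the user commits a transaction, a commit redo record is produced as well, and it also carries an LSN. The backend process likewise has to wait until that LSN has been flushed to disk before it reports a successful commit to the user. (The log reaches disk first, then the user gets the answer.)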
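The "log before data" rule described above can be sketched as a small self-contained C model. This is only an illustration under stated assumptions, not PostgreSQL's actual code: the types `WalState` and `Page` and the functions `wal_flush`, `modify_page`, `write_dirty_page` and `commit_transaction` are hypothetical names invented for this sketch, and a plain integer stands in for the real LSN type.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Lsn;           /* hypothetical stand-in for an LSN */

typedef struct {
    Lsn inserted_upto;          /* end of the redo generated so far (WAL buffer) */
    Lsn flushed_upto;           /* highest LSN known to be on durable storage */
} WalState;

typedef struct {
    Lsn page_lsn;               /* LSN of the last redo record that touched the page */
    int dirty;
} Page;

/* Flush WAL up to `upto` unless it is already durable. Returns 1 if a flush happened. */
int wal_flush(WalState *wal, Lsn upto)
{
    assert(upto <= wal->inserted_upto);  /* cannot flush redo that does not exist */
    if (upto <= wal->flushed_upto)
        return 0;
    wal->flushed_upto = upto;            /* a real system would write and fsync here */
    return 1;
}

/* Modify a page: generate redo first, then stamp the page with the record's LSN. */
Lsn modify_page(WalState *wal, Page *page, Lsn record_len)
{
    wal->inserted_upto += record_len;
    page->page_lsn = wal->inserted_upto;
    page->dirty = 1;
    return page->page_lsn;
}

/* Write a dirty page out. The redo covering the page is made durable first.
 * Returns 1 if a WAL flush had to be forced. */
int write_dirty_page(WalState *wal, Page *page)
{
    int forced = wal_flush(wal, page->page_lsn);
    page->dirty = 0;                     /* the page itself would be written here */
    return forced;
}

/* Commit: append a commit record and flush up to it before reporting success. */
Lsn commit_transaction(WalState *wal, Lsn commit_record_len)
{
    wal->inserted_upto += commit_record_len;
    wal_flush(wal, wal->inserted_upto);
    return wal->inserted_upto;
}
```

In this model a page written right after `modify_page` forces a WAL flush, while a second write of the same page does not, because its redo is already durable.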

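A brief look at how synchronous replication works

Synchronous streaming replication guarantees that the log reaches disk on both the standby node and the local node.

PostgreSQL uses another set of global variables to record the XLOG LSN that the synchronous standby has received and the XLOG LSN that it has persisted.

After the user issues a commit, the backend process has to check not only whether the local WAL has been persisted, but also whether the synchronous standby has received or persisted the XLOG (which of the two is controlled by the synchronous_commit parameter).

If the synchronous standby has not yet received or persisted the XLOG, the backend process enters a wait state.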
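The decision described above can be sketched in a few lines of C. This is a simplified model, not the server's code: `SyncState`, `commit_must_wait` and `standby_reply` are hypothetical names, and the two tracked positions correspond to the "received (written)" and "persisted (flushed)" LSNs mentioned in the text.

```c
#include <assert.h>
#include <stdint.h>

typedef uint64_t Lsn;            /* hypothetical stand-in for an LSN */

enum { WAIT_WRITE = 0, WAIT_FLUSH = 1, NUM_WAIT_MODES = 2 };
#define NO_WAIT (-1)             /* the commit does not wait for any standby */

typedef struct {
    int sync_standbys_defined;       /* is a synchronous standby configured? */
    Lsn confirmed[NUM_WAIT_MODES];   /* LSN the standby has confirmed, per mode */
} SyncState;

/* Returns 1 if a commit ending at commit_lsn still has to wait for the standby. */
int commit_must_wait(const SyncState *s, int mode, Lsn commit_lsn)
{
    assert(mode == NO_WAIT || (mode >= 0 && mode < NUM_WAIT_MODES));
    if (mode == NO_WAIT || !s->sync_standbys_defined)
        return 0;                            /* fast exit: nothing to wait for */
    return commit_lsn > s->confirmed[mode];  /* wait only if not yet confirmed */
}

/* Record a standby reply. Confirmed positions only ever move forward. */
void standby_reply(SyncState *s, Lsn write_lsn, Lsn flush_lsn)
{
    if (s->confirmed[WAIT_WRITE] < write_lsn)
        s->confirmed[WAIT_WRITE] = write_lsn;
    if (s->confirmed[WAIT_FLUSH] < flush_lsn)
        s->confirmed[WAIT_FLUSH] = flush_lsn;
}
```

For example, after a reply reporting write position 500 and flush position 300, a commit at LSN 500 no longer waits in `WAIT_WRITE` mode but still waits in `WAIT_FLUSH` mode.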

A brief look at the synchronous replication code

The relevant code and explanations follow:

CommitTransaction @ src/backend/access/transam/xact.c

RecordTransactionCommit @ src/backend/access/transam/xact.c

    /*
     * If we didn't create XLOG entries, we're done here; otherwise we
     * should trigger flushing those entries the same as a commit record
     * would. This will primarily happen for HOT pruning and the like; we
     * want these to be flushed to disk in due time.
     */
    if (!wrote_xlog)    // a transaction that produced no redo returns right away
        goto cleanup;

    if (wrote_xlog && markXidCommitted)    // redo was produced: wait for synchronous replication
        SyncRepWaitForLSN(XactLastRecEnd);

SyncRepWaitForLSN @ src/backend/replication/syncrep.c

    /*
     * Wait for synchronous replication, if requested by user.
     *
     * Initially backends start in state SYNC_REP_NOT_WAITING and then
     * change that state to SYNC_REP_WAITING before adding ourselves
     * to the wait queue. During SyncRepWakeQueue() a WALSender changes
     * the state to SYNC_REP_WAIT_COMPLETE once replication is confirmed.
     * This backend then resets its state to SYNC_REP_NOT_WAITING.
     */
    void
    SyncRepWaitForLSN(XLogRecPtr XactCommitLSN)
    {
        ...
        /*
         * Fast exit if user has not requested sync replication, or there are no
         * sync replication standby names defined. Note that those standbys don't
         * need to be connected.
         */
        // not a synchronous transaction, or no synchronous standby defined: return right away
        if (!SyncRepRequested() || !SyncStandbysDefined())
            return;
        ...
        /*
         * We don't wait for sync rep if WalSndCtl->sync_standbys_defined is not
         * set. See SyncRepUpdateSyncStandbysDefined.
         *
         * Also check that the standby hasn't already replied. Unlikely race
         * condition but we'll be fetching that cache line anyway so it's likely
         * to be a low cost check.
         */
        // no synchronous standby defined, or the commit LSN is not beyond the LSN the standby
        // has already confirmed (so this XLOG is already replicated): return right away
        if (!WalSndCtl->sync_standbys_defined ||
            XactCommitLSN <= WalSndCtl->lsn[mode])
        {
            LWLockRelease(SyncRepLock);
            return;
        }

...

        // Entering the wait loop means the local xlog is already flushed;
        // the backend is only waiting for the redo to reach the synchronous standby.
        /*
         * Wait for specified LSN to be confirmed.
         *
         * Each proc has its own wait latch, so we perform a normal latch
         * check/wait loop here.
         */
        for (;;)    // wait; on every wakeup check whether the wait can end (the wal sender sets the latch as redo gets replicated)
        {
            int syncRepState;

            /* Must reset the latch before testing state. */
            ResetLatch(&MyProc->procLatch);

            syncRepState = MyProc->syncRepState;
            if (syncRepState == SYNC_REP_WAITING)
            {
                LWLockAcquire(SyncRepLock, LW_SHARED);
                syncRepState = MyProc->syncRepState;
                LWLockRelease(SyncRepLock);
            }
            if (syncRepState == SYNC_REP_WAIT_COMPLETE)    // XLOG replication confirmed: stop waiting
                break;

            // If this backend is being terminated, the message says that the transaction is
            // persisted locally but may not have been persisted on the remote side.
            if (ProcDiePending)
            {
                ereport(WARNING,
                        (errcode(ERRCODE_ADMIN_SHUTDOWN),
                         errmsg("canceling the wait for synchronous replication and terminating connection due to administrator command"),
                         errdetail("The transaction has already committed locally, but might not have been replicated to the standby.")));
                whereToSendOutput = DestNone;
                SyncRepCancelWait();
                break;
            }

            // If the user cancels the query, the message says that the transaction is
            // persisted locally but may not have been persisted on the remote side.
            if (QueryCancelPending)
            {
                QueryCancelPending = false;
                ereport(WARNING,
                        (errmsg("canceling wait for synchronous replication due to user request"),
                         errdetail("The transaction has already committed locally, but might not have been replicated to the standby.")));
                SyncRepCancelWait();
                break;
            }

            // If the postmaster has died, go down the exit path.
            if (!PostmasterIsAlive())
            {
                ProcDiePending = true;
                whereToSendOutput = DestNone;
                SyncRepCancelWait();
                break;
            }

            // wait for the wal sender to set our latch
            /*
             * Wait on latch. Any condition that should wake us up will set the
             * latch, so no need for timeout.
             */
            WaitLatch(&MyProc->procLatch, WL_LATCH_SET | WL_POSTMASTER_DEATH, -1);
        }
        ...
    }

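Note that once a session has entered this wait state, it can leave the unbounded wait only if the user cancels, the backend is killed (terminated), or the postmaster dies. How to downgrade from synchronous to asynchronous is covered later.

As mentioned above, the client backend has to wait for its latch to be set.

So who sets it? The wal sender process does. The source and explanation follow: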
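The exits from the wait loop can be summarized as a small decision function. The following is a minimal sketch that mirrors the order of checks in the excerpt above; `BackendFlags`, `WaitOutcome` and `check_wait` are hypothetical names for illustration and do not exist in PostgreSQL.

```c
#include <assert.h>

typedef enum { REP_NOT_WAITING, REP_WAITING, REP_WAIT_COMPLETE } RepState;

typedef enum {
    KEEP_WAITING,          /* none of the exit conditions holds: sleep on the latch again */
    EXIT_CONFIRMED,        /* standby confirmed the commit LSN */
    EXIT_TERMINATED,       /* backend is being terminated */
    EXIT_CANCELED,         /* user canceled the query */
    EXIT_POSTMASTER_DEAD   /* postmaster died */
} WaitOutcome;

typedef struct {
    RepState state;
    int die_pending;
    int cancel_pending;
    int postmaster_alive;
} BackendFlags;

/* One pass of the loop body: decide whether the wait ends, and why. */
WaitOutcome check_wait(BackendFlags *b)
{
    assert(b != 0);
    if (b->state == REP_WAIT_COMPLETE)
        return EXIT_CONFIRMED;
    if (b->die_pending)
        return EXIT_TERMINATED;       /* the real code also emits a WARNING here */
    if (b->cancel_pending) {
        b->cancel_pending = 0;        /* the cancel request is consumed */
        return EXIT_CANCELED;         /* the real code also emits a WARNING here */
    }
    if (!b->postmaster_alive)
        return EXIT_POSTMASTER_DEAD;
    return KEEP_WAITING;
}
```

Only `EXIT_CONFIRMED` means the standby actually has the data. The other three exits leave the transaction committed locally with unknown replication status.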

src/backend/replication/walsender.c

StartReplication
  -> WalSndLoop
  -> ProcessRepliesIfAny
  -> ProcessStandbyMessage
  -> ProcessStandbyReplyMessage

    // If this is not a cascading walsender, it calls SyncRepReleaseWaiters to set
    // the latches of the backend processes in the wait queue.
    if (!am_cascading_walsender)
        SyncRepReleaseWaiters();

SyncRepReleaseWaiters @ src/backend/replication/syncrep.c

    /*
     * Update the LSNs on each queue based upon our latest state. This
     * implements a simple policy of first-valid-standby-releases-waiter.
     *
     * Other policies are possible, which would change what we do here and what
     * perhaps also which information we store as well.
     */
    void
    SyncRepReleaseWaiters(void)
    {
        ...
        // release the waiters in each queue that satisfy the condition
        /*
         * Set the lsn first so that when we wake backends they will release up to
         * this location.
         */
        if (walsndctl->lsn[SYNC_REP_WAIT_WRITE] < MyWalSnd->write)
        {
            walsndctl->lsn[SYNC_REP_WAIT_WRITE] = MyWalSnd->write;
            numwrite = SyncRepWakeQueue(false, SYNC_REP_WAIT_WRITE);
        }
        if (walsndctl->lsn[SYNC_REP_WAIT_FLUSH] < MyWalSnd->flush)
        {
            walsndctl->lsn[SYNC_REP_WAIT_FLUSH] = MyWalSnd->flush;
            numflush = SyncRepWakeQueue(false, SYNC_REP_WAIT_FLUSH);
        }
        ...

SyncRepWakeQueue @ src/backend/replication/syncrep.c

    /*
     * Walk the specified queue from head. Set the state of any backends that
     * need to be woken, remove them from the queue, and then wake them.
     * Pass all = true to wake whole queue; otherwise, just wake up to
     * the walsender's LSN.
     *
     * Must hold SyncRepLock.
     */
    static int
    SyncRepWakeQueue(bool all, int mode)
    {
        ...
        while (proc)    // update the state and latch of each backend process that can be woken
        {
            /*
             * Assume the queue is ordered by LSN
             */
            if (!all && walsndctl->lsn[mode] < proc->waitLSN)
                return numprocs;

            /*
             * Move to next proc, so we can delete thisproc from the queue.
             * thisproc is valid, proc may be NULL after this.
             */
            thisproc = proc;
            proc = (PGPROC *) SHMQueueNext(&(WalSndCtl->SyncRepQueue[mode]),
                                           &(proc->syncRepLinks),
                                           offsetof(PGPROC, syncRepLinks));

            /*
             * Set state to complete; see SyncRepWaitForLSN() for discussion of
             * the various states.
             */
            thisproc->syncRepState = SYNC_REP_WAIT_COMPLETE;    // condition met: switch to SYNC_REP_WAIT_COMPLETE
            ....
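The queue walk shown in the excerpt above can be modeled on an ordinary singly linked list. This is a simplified sketch under stated assumptions: `Waiter` and `wake_queue` are hypothetical names, the shared-memory queue is replaced by plain pointers, locking is left out, and "waking" is reduced to a state change.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

typedef uint64_t Lsn;            /* hypothetical stand-in for an LSN */
typedef enum { REP_NOT_WAITING, REP_WAITING, REP_WAIT_COMPLETE } RepState;

typedef struct Waiter {
    Lsn wait_lsn;                /* commit LSN this backend is waiting for */
    RepState state;
    struct Waiter *next;         /* queue is assumed to be ordered by wait_lsn */
} Waiter;

/* Wake waiters from the head of the queue. With all == 0, stop at the first
 * waiter whose LSN lies beyond `confirmed`. Returns the number woken and
 * leaves *head pointing at the first waiter that is still waiting. */
int wake_queue(Waiter **head, Lsn confirmed, int all)
{
    int woken = 0;

    assert(head != NULL);
    while (*head != NULL) {
        Waiter *w = *head;

        if (!all && confirmed < w->wait_lsn)
            break;               /* ordered queue: nobody further back can be woken */

        *head = w->next;         /* unlink first, then mark complete */
        w->next = NULL;
        w->state = REP_WAIT_COMPLETE;
        woken++;                 /* the real code would set the backend's latch here */
    }
    return woken;
}
```

Because the queue is ordered by LSN, the walk can stop at the first waiter that is not yet covered, so each standby reply costs time proportional to the number of backends it releases.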

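How to set the transaction durability level

PostgreSQL lets you set the durability level of transactions within a session.

off: commit does not wait for the WAL to be persisted.

local: commit waits only for the local database's WAL to be persisted.

remote_write: commit waits for the local database's WAL to be persisted and also for the sync standby to finish writing the WAL (written, but not necessarily persisted).

on: commit waits for the local database's WAL to be persisted and also for the sync standby to persist the WAL.

One reminder: no setting of synchronous_commit changes the rule that the WAL must be persisted before the dirty data in shared buffers. So whichever setting you choose, data consistency is not affected. With off, a crash can lose the most recently acknowledged commits, but the database remains consistent.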

synchronous_commit = off # synchronization level;

# off, local, remote_write, or on
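The four levels listed above boil down to two questions: does the commit wait for the local WAL flush, and what does it wait for on the sync standby? A minimal sketch of that mapping follows; `SyncCommit`, `RemoteWait`, `waits_for_local_flush` and `remote_wait_for` are hypothetical names chosen for illustration, not the server's identifiers.

```c
#include <assert.h>

typedef enum { SC_OFF, SC_LOCAL, SC_REMOTE_WRITE, SC_ON } SyncCommit;
typedef enum { REMOTE_NONE, REMOTE_WRITE, REMOTE_FLUSH } RemoteWait;

/* Every level except off waits for the local WAL to be persisted. */
int waits_for_local_flush(SyncCommit level)
{
    assert(level <= SC_ON);
    return level != SC_OFF;
}

/* What the commit waits for on the sync standby at each level. */
RemoteWait remote_wait_for(SyncCommit level)
{
    assert(level <= SC_ON);
    switch (level) {
    case SC_REMOTE_WRITE:
        return REMOTE_WRITE;     /* standby has written the WAL, not yet persisted */
    case SC_ON:
        return REMOTE_FLUSH;     /* standby has persisted the WAL */
    default:
        return REMOTE_NONE;      /* off and local never wait for a standby */
    }
}
```

Read this way, local is the strongest level that can never block on a standby, which is why the downgrade discussed in the text targets local or off.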

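How to downgrade synchronous replication

As the code walkthrough above shows, once a backend process has entered the wait loop, only a few signals can make it leave the wait. When it leaves this way it emits a warning saying that the local WAL has been persisted but that it is unknown whether the sync standby has persisted the WAL.

If you have configured only one standby and made it the synchronous standby, then any network jitter or failure of that standby puts synchronous transactions into the wait state.

So how do you downgrade?

Method 1.

Edit the configuration file and reload it.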

$ f

synchronous_commit = local

$ pg_ctl reload

Then cancel all queries.

postgres=# select pg_cancel_backend(pid) from pg_stat_activity where pid<>pg_backend_pid();

A message like the following means that the transaction committed successfully, and also that it is unknown whether the WAL reached the sync standby.

WARNING: canceling wait for synchronous replication due to user request

DETAIL: The transaction has already committed locally, but might not have been replicated to the standby.

COMMIT

postgres=# show synchronous_commit ;

synchronous_commit

--------------------

off

(1 row)

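At the same time the session picks up the global synchronous_commit setting, which is now local.

That completes the downgrade.

Method 2.

The downgrade in method 1 requires canceling every pid that is currently waiting for WAL sync, which is not very friendly.

It can be made friendlier by changing the code.

Inside the for loop of SyncRepWaitForLSN, add one more check: if the global synchronous_commit setting has changed to local or off, emit a warning and leave the loop. Then nobody has to cancel the queries by hand.

WARNING: canceling wait for synchronous replication due to user request

DETAIL: The transaction has already committed locally, but might not have been replicated to the standby.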
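The idea behind method 2 can be sketched as a simulation of the wait loop with one extra check. This is only a model of the proposal, not a patch against PostgreSQL: `wait_was_downgraded` and `run_wait_loop` are hypothetical names, and how a waiting backend would actually notice the reloaded setting is assumed away by passing in the value visible at each iteration.

```c
#include <assert.h>

typedef enum { SC_OFF, SC_LOCAL, SC_REMOTE_WRITE, SC_ON } SyncCommit;

/* The extra check: has the setting dropped to a level that needs no standby? */
int wait_was_downgraded(SyncCommit current)
{
    return current == SC_OFF || current == SC_LOCAL;
}

/* Simulated wait loop. levels[i] is the synchronous_commit value visible at
 * iteration i; confirmed_at is the iteration at which the standby confirms,
 * or -1 if it never does. Returns the iteration at which the loop exits, or
 * -1 if it is still waiting after n iterations. *warned is set to 1 when the
 * exit was caused by a downgrade, which is where the WARNING would be emitted. */
int run_wait_loop(const SyncCommit *levels, int n, int confirmed_at, int *warned)
{
    int i;

    assert(levels != 0 && n > 0 && warned != 0);
    *warned = 0;
    for (i = 0; i < n; i++) {
        if (i == confirmed_at)
            return i;                        /* normal exit: standby confirmed */
        if (wait_was_downgraded(levels[i])) {
            *warned = 1;                     /* committed locally, standby unknown */
            return i;
        }
    }
    return -1;
}
```

With the settings on, on, local, local and a standby that never replies, the loop exits at the third iteration with a warning. If the standby confirms first, the loop exits normally and no warning is raised.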

The above is based on personal experience and is offered as a reference. If anything is wrong or incomplete, corrections are welcome.


Article link: https://www.wtabcd.cn/fanwen/fan/90/175360.html
