PostgreSQL streaming replication error: WAL segment has already been removed

dan*_*ish 5 postgresql replication postgresql-9.3

I am trying to set up PostgreSQL streaming replication, but I get the following error:

FATAL:  could not receive data from WAL stream: 
ERROR:  requested WAL segment 00000001000000000000006A has already been removed.

Master IP: 192.168.0.30

Slave IP: 192.168.0.36

On the master:

I created a user, rep, that is used only for replication.

The relevant files inside the Postgres config directory (/opt/Postgres/9.3/data):

pg_hba.conf:

host    replication     rep     192.168.0.36/32   trust

postgresql.conf:

listen_addresses = 'localhost,192.168.0.30'
wal_level = 'hot_standby'
archive_mode = on
archive_command = 'cd .'
max_wal_senders = 1
hot_standby = on

I restarted the postgres service.

On the slave:

I stopped the postgres service, then applied the changes to the two files:

pg_hba.conf:

host    replication     rep     192.168.0.30/32  trust

postgresql.conf:

listen_addresses = 'localhost,192.168.0.36'
wal_level = 'hot_standby'
archive_mode = on
archive_command = 'cd .'
max_wal_senders = 1
hot_standby = on

To replicate the initial database, I did the following:

On the master

The internal postgres backup start command, to create a backup label:

psql -c "select pg_start_backup('initial_backup');"

...to transfer the database data to the slave:

rsync -cva --inplace --exclude=*pg_xlog* /opt/Postgresql/9.3/data/ 192.168.0.36:/opt/Postgresql/9.3/data/

...the internal backup stop command, to clean up:

psql -c "select pg_stop_backup();"

On the slave

I created the following recovery.conf:

standby_mode = 'on'
primary_conninfo = 'host=192.168.0.30 port=5432 user=rep password=yourpassword'
trigger_file = '/tmp/postgresql.trigger.5432'

The postgres service on the slave starts without any errors, but it is still waiting:

ps -ef | grep -i postgres

postgres 12959     1  0 13:39 ?        00:00:00 /opt/PostgreSQL/9.3/bin/postgres -D /opt/PostgreSQL/9.3/data
postgres 12969 12959  0 13:39 ?        00:00:00 postgres: logger process                                    
postgres 12970 12959  0 13:39 ?        00:00:00 postgres: startup process   waiting 00000001000000000000006A

Meanwhile, on the master:

ps -ef | grep -i postgres

postgres  5930     1  0 13:39 ?        00:00:01 /opt/PostgreSQL/9.3/bin/postgres -D /opt/PostgreSQL/9.3/data
postgres  5931  5930  0 13:39 ?        00:00:00 postgres: logger process                                    
postgres  5933  5930  0 13:39 ?        00:00:00 postgres: checkpointer process                              
postgres  5934  5930  0 13:39 ?        00:00:00 postgres: writer process                                    
postgres  5935  5930  0 13:39 ?        00:00:00 postgres: wal writer process                                
postgres  5936  5930  0 13:39 ?        00:00:00 postgres: autovacuum launcher process                       
postgres  5937  5930  0 13:39 ?        00:00:00 postgres: archiver process                                  
postgres  5938  5930  0 13:39 ?        00:00:00 postgres: stats collector process      

The psql command on the slave gives:

psql.bin: FATAL:  the database system is starting up

The log in pg_log gives the reason for the wait:

FATAL:  could not receive data from WAL stream: ERROR:  requested WAL segment  has already been removed

The 00000001000000000000006A segment is not in the master's pg_xlog, but it is in the slave's pg_xlog.

How do I resolve this error?

小智 4

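From the Streaming Replication section of the PostgreSQL documentation:

If you use streaming replication without file-based continuous archiving, the server might recycle old WAL segments before the standby has received them. If this occurs, the standby will need to be reinitialized from a new base backup. You can avoid this by setting wal_keep_segments to a value large enough to ensure that WAL segments are not recycled too early, or by configuring a replication slot for the standby. If you set up a WAL archive that's accessible from the standby, these solutions are not required, since the standby can always use the archive to catch up provided it retains enough segments.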
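
As a rough sketch of the wal_keep_segments option, the master's postgresql.conf could be told to keep more WAL around. The value 256 below is an assumption chosen purely for illustration, not a recommendation; size it to how far the standby can realistically fall behind. Replication slots only exist from PostgreSQL 9.4 onward, so they are not available on the 9.3 setup in the question. Because the segment the standby is asking for has already been removed, the standby would still need to be rebuilt from a new base backup after this change.

```conf
# postgresql.conf on the master (192.168.0.30)

# Illustrative value only (an assumption, not taken from the question).
# Keeps at least this many past WAL segments in pg_xlog so that a lagging
# standby can still fetch them. With the default 16 MB segment size,
# 256 segments is roughly 4 GB of disk.
wal_keep_segments = 256
```

Note that archive_command = 'cd .' in the question's configuration always succeeds without copying anything, so it does not provide a WAL archive that the standby could use to catch up.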