带有 `jobs` 标志的 `pg_restore` 导致 `pg_restore: 错误:一个工作进程意外死亡`

Ric*_*mas 9 postgresql pg-restore

我有一个运行以下内容的脚本:

pg_restore tmp/latest.backup --verbose --clean --no-acl --no-owner --dbname hub_development --jobs=12
Run Code Online (Sandbox Code Playgroud)

这经常失败并出现以下错误:

error: could not find block ID 4584 in archive -- possibly due to out-of-order restore request, which cannot be handled due to lack of data offsets in archive

pg_restore: error: a worker process died unexpectedly
Run Code Online (Sandbox Code Playgroud)

反过来,这个错误意味着应该有索引、主键等的表最终没有它们。例如,在没有多核的情况下运行时,我们的users表如下所示,正如预期的那样:


                                           Table "public.users"
       Column       |            Type             | Collation | Nullable |              Default              
--------------------+-----------------------------+-----------+----------+-----------------------------------
 id                 | integer                     |           | not null | nextval('users_id_seq'::regclass)
 created_at         | timestamp without time zone |           | not null | 
 updated_at         | timestamp without time zone |           | not null | 
 email              | character varying           |           | not null | 
 confirmation_token | character varying(128)      |           |          | 
 name               | character varying           |           | not null | ''::character varying
 user_type          | character varying           |           |          | 
 encrypted_password | character varying(128)      |           |          | 
 remember_token     | character varying(128)      |           |          | 
Indexes:
    "users_pkey" PRIMARY KEY, btree (id)
    "index_users_on_email" btree (email)
    "index_users_on_remember_token" btree (remember_token)
Referenced by:
    TABLE "project_feedback_users" CONSTRAINT "fk_rails_08af49ba47" FOREIGN KEY (user_id) REFERENCES users(id)
    TABLE "client_reviews" CONSTRAINT "fk_rails_8fc606dbea" FOREIGN KEY (user_id) REFERENCES users(id)
Run Code Online (Sandbox Code Playgroud)

使用多个内核运行,该表如下所示:


hub_development=# \d users
                                            Table "public.users"
       Column       |            Type             | Collation | Nullable |              Default              
--------------------+-----------------------------+-----------+----------+-----------------------------------
 id                 | integer                     |           | not null | nextval('users_id_seq'::regclass)
 created_at         | timestamp without time zone |           | not null | 
 updated_at         | timestamp without time zone |           | not null | 
 email              | character varying           |           | not null | 
 confirmation_token | character varying(128)      |           |          | 
 name               | character varying           |           | not null | ''::character varying
 user_type          | character varying           |           |          | 
 encrypted_password | character varying(128)      |           |          | 
 remember_token     | character varying(128)      |           |          | 
Run Code Online (Sandbox Code Playgroud)

基于此,我得出的结论是,创建表本身的同一个工作人员不负责添加索引和外键,第二个工作人员试图在第一个工作人员之前运行,这导致了我观察到的错误.

当我删除--jobs=12标志时,脚本工作正常,最坏的解决方法是我只是这样做。

但是,对于我自己的教育,我很想知道是否有一种解决方案可以保留我们使用多核并行化数据库恢复的能力,同时避免无序恢复请求。

该错误提到由于存档中缺少数据偏移量而无法处理乱序还原请求。添加这些数据偏移量会以我描述的方式解决问题吗?如果是这样,我将如何处理,这样做有什么缺点吗?

我不是数据库管理员,我的知识有限,所以如果我没有提供足够的信息来回答这个问题,请告诉我。

我的 Postgres 本地版本是 12.1,数据来自 Heroku 上托管的 Rails 应用程序。这是结果heroku pg:info

=== HEROKU_POSTGRESQL_BRONZE_URL, DATABASE_URL
Plan:                  Standard 0
Status:                Available
Data Size:             3.38 GB
Tables:                44
PG Version:            11.5
Connections:           22/120
Connection Pooling:    Available
Credentials:           1
Fork/Follow:           Available
Rollback:              earliest from 2020-01-10 18:17 UTC
Created:               2019-10-29 18:20 UTC
Region:                us
Data Encryption:       In Use
Continuous Protection: On
Maintenance:           not required
Maintenance window:    Wednesdays 18:00 to 22:00 UTC
Add-on:                postgresql-metric-02684
Run Code Online (Sandbox Code Playgroud)

ldr*_*drg 5

更新:“工作进程意外死亡”问题已在 PostgreSQL 12.4 上游修复!升级以获得修复。

为了获得最佳兼容性和性能,pg_restore请将pg_dump转储文件写入磁盘上的本地文件,而不是不可搜索的文件描述符。