Postgres segfaults and returns SQLSTATE 08006

Dan*_*iel 5 postgresql

I'm seeing the following error from Postgres while running some automated tests:

2020-03-06 23:32:57,051 WARN  main c.z.h.p.ProxyConnection - HikariPool-2 - Connection org.postgresql.jdbc.PgConnection@42e3ede4 marked as broken because of SQLSTATE(08006), ErrorCode(0) {}
 org.postgresql.util.PSQLException: An I/O error occurred while sending to the backend.
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:335)
    at org.postgresql.jdbc.PgStatement.executeInternal(PgStatement.java:441)
    at org.postgresql.jdbc.PgStatement.execute(PgStatement.java:365)
    at org.postgresql.jdbc.PgPreparedStatement.executeWithFlags(PgPreparedStatement.java:143)
    at org.postgresql.jdbc.PgPreparedStatement.executeQuery(PgPreparedStatement.java:106)
    at com.zaxxer.hikari.pool.ProxyPreparedStatement.executeQuery(ProxyPreparedStatement.java:52)
    at com.zaxxer.hikari.pool.HikariProxyPreparedStatement.executeQuery(HikariProxyPreparedStatement.java)
    at org.jooq.tools.jdbc.DefaultPreparedStatement.executeQuery(DefaultPreparedStatement.java:94)
    at org.jooq.impl.AbstractDMLQuery.execute(AbstractDMLQuery.java:738)
    at org.jooq.impl.AbstractQuery.execute(AbstractQuery.java:350)
    at org.jooq.impl.InsertImpl.fetchOne(InsertImpl.java:1061)
...
 Caused by: java.io.EOFException
    at org.postgresql.core.PGStream.receiveChar(PGStream.java:308)
    at org.postgresql.core.v3.QueryExecutorImpl.processResults(QueryExecutorImpl.java:1952)
    at org.postgresql.core.v3.QueryExecutorImpl.execute(QueryExecutorImpl.java:308)
    ... 53 more
2020-03-06 23:32:57,067 DEBUG main i.p.d.HikariPostgresDataSourceFactory - Connecting to jdbc:postgresql://localhost:5432/REDACTED as REDACTED {}
2020-03-06 23:32:57,093 ERROR main c.z.h.p.HikariPool - HikariPool-14 - Exception during pool initialization. {}
 org.postgresql.util.PSQLException: FATAL: the database system is in recovery mode
    at org.postgresql.Driver$ConnectThread.getResult(Driver.java:405)
    at org.postgresql.Driver.connect(Driver.java:263)

Looking at dmesg, I can see that a segfault is occurring:

[1383242.997083] postgres[7998]: segfault at 100000048 ip 000055c587913e4b sp 00007fffa492e6f0 error 4 in postgres[55c587424000+72d000]

Here is the backtrace I got from gdb:

Core was generated by `postgres: REDACTED REDACTED 127.0.0.1(49990) INSERT                          '.
Program terminated with signal SIGSEGV, Segmentation fault.
#0  0x000055c587913e4b in pfree ()
(gdb) bt
#0  0x000055c587913e4b in pfree ()
#1  0x000055c587687475 in ExecSetSlotDescriptor ()
#2  0x000055c58767eb61 in ExecConstraints ()
#3  0x000055c5876a0efc in ?? ()
#4  0x000055c5876a2085 in ?? ()
#5  0x000055c58767cd1b in standard_ExecutorRun ()
#6  0x000055c5877d22e5 in ?? ()
#7  0x000055c5877d2538 in ?? ()
#8  0x000055c5877d2855 in ?? ()
#9  0x000055c5877d3427 in PortalRun ()
#10 0x000055c5877cfeec in PostgresMain ()
#11 0x000055c5874ddd37 in ?? ()
#12 0x000055c58775a882 in PostmasterMain ()
#13 0x000055c5874df0e5 in main ()

Here is my Postgres version:

postgres=# select version();
                                                                   version                                                                   
---------------------------------------------------------------------------------------------------------------------------------------------
 PostgreSQL 11.7 (Ubuntu 11.7-1.pgdg16.04+1) on x86_64-pc-linux-gnu, compiled by gcc (Ubuntu 5.4.0-6ubuntu1~16.04.12) 5.4.0 20160609, 64-bit

Does anyone know whether this is a bug, or whether there is a workaround?

Dan*_*iel 1

I enabled query logging and managed to find the offending INSERT:

insert into "myschema"."mytable" ("custcode", "custcar", "custdob", "closed") values ('a33113f2-930c-47de-95a6-b9e07650468a', 'hellow world', '2020-02-02 01:00:00+00:00', 'f')
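
For reference, one way to turn on statement logging is sketched below. It assumes superuser access and uses the standard `log_statement` setting; the post does not say which logging setting was actually used, so treat that choice as an assumption.

```sql
-- Assumption: the original post does not say how logging was enabled.
-- log_statement = 'all' logs every statement sent to the server.
ALTER SYSTEM SET log_statement = 'all';

-- Reload the configuration so the change takes effect without a restart.
SELECT pg_reload_conf();

-- Confirm the active value.
SHOW log_statement;
```

Once the offending statement has been captured, `ALTER SYSTEM RESET log_statement;` followed by another `SELECT pg_reload_conf();` removes the override again.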

This is a table partitioned on the "custdob" column, with the following partitions:

\d+ mytable
                                                           Table "myschema.mytable"
   Column   |           Type           | Collation | Nullable |                Default                 | Storage  | Stats target | Description 
------------+--------------------------+-----------+----------+----------------------------------------+----------+--------------+-------------
 id         | bigint                   |           | not null | nextval('mytable_id_seq'::regclass)    | plain    |              | 
 custcode   | uuid                     |           | not null |                                        | plain    |              | 
 custcar    | character varying        |           | not null |                                        | extended |              | 
 custdob    | timestamp with time zone |           | not null |                                        | plain    |              | 
 closed     | boolean                  |           | not null | false                                  | plain    |              | 
Partition key: RANGE (custdob)
Partitions: mytable_201902_partition FOR VALUES FROM ('2019-02-01 00:00:00+00') TO ('2019-03-01 00:00:00+00'),
            mytable_201903_partition FOR VALUES FROM ('2019-03-01 00:00:00+00') TO ('2019-04-01 00:00:00+00'),
            mytable_201908_partition FOR VALUES FROM ('2019-08-02 00:00:00+00') TO ('2019-09-01 00:00:00+00'),
            mytable_202003_partition FOR VALUES FROM ('2020-03-01 00:00:00+00') TO ('2020-04-01 00:00:00+00'),
            mytable_202004_partition FOR VALUES FROM ('2020-04-01 00:00:00+00') TO ('2020-05-01 00:00:00+00'),
            mytable_000000_partition DEFAULT

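Note that the INSERT targets a February partition, but that partition is missing on my CI server, so the row should be inserted into the DEFAULT partition instead. The problem is that the DEFAULT partition has the following constraint: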

"mytable_partition_check" CHECK (custdob < '2019-08-02 00:00:00+00'::timestamp with time zone)
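
The mismatch between the row and this constraint can be checked directly. The sketch below assumes the constraint is defined on `myschema.mytable_000000_partition` (the DEFAULT partition from the listing above) and reuses the timestamp from the logged INSERT.

```sql
-- List the CHECK constraints defined on the default partition.
SELECT conname, pg_get_constraintdef(oid) AS definition
FROM pg_constraint
WHERE conrelid = 'myschema.mytable_000000_partition'::regclass
  AND contype = 'c';

-- Evaluate the constraint expression against the value from the failing INSERT.
-- A result of false means the row violates the CHECK.
SELECT '2020-02-02 01:00:00+00:00'::timestamptz
     < '2019-08-02 00:00:00+00'::timestamptz AS satisfies_check;
```

Since February 2020 is later than August 2019, the comparison yields false: the row is routed to the DEFAULT partition but cannot satisfy its CHECK constraint.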

So Postgres appears to hit a bug here: while that constraint exists, it cannot insert the February record. If I drop the constraint and re-issue the offending INSERT, it works.
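
The workaround might look like the following. This is a sketch, assuming the constraint lives on `myschema.mytable_000000_partition`; dropping it also removes whatever guarantee it was meant to enforce, so check that first.

```sql
-- Assumption: mytable_partition_check is defined on the default partition.
ALTER TABLE myschema.mytable_000000_partition
    DROP CONSTRAINT mytable_partition_check;

-- Re-issue the INSERT captured in the query log.
INSERT INTO "myschema"."mytable" ("custcode", "custcar", "custdob", "closed")
VALUES ('a33113f2-930c-47de-95a6-b9e07650468a', 'hellow world',
        '2020-02-02 01:00:00+00:00', 'f');
```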