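Hi, we are running a PostgreSQL 9.6 database on Amazon RDS, using an m4.large instance (2 CPUs, 8 GB) with 1000 provisioned IOPS. The use case is as follows: we have a table with several million rows (about 4M), and we created a materialized view containing a subset of that table (approx. 2M rows), changing some column types to make queries more efficient. Our Postgres configuration is unchanged; it is the RDS Postgres default.
Here is our view definition: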
CREATE MATERIALIZED VIEW public.customers_mv as
SELECT
id,
gender,
contact_info,
location,
social,
categories,
(social ->> 'follower_count')::integer AS social_follower_count,
(social ->> 'following_count')::integer AS social_following_count,
(social ->> 'peemv')::float AS social_emv,
(social ->> 'engagement')::float AS social_engagement,
(social ->> 'v')::boolean AS social_validated,
search_vector,
flags,
to_tsvector('english',concat_ws(' ','aal0_'||(customers.location ->> 'aal0'),
'aal1_'||(customers.location ->> 'aal1'),
'aal2_'||(customers.location ->> 'aal2'),
'frequent_location_aal0_'||(customers.location -> 'frequent_location' ->> 'aal0'),
'frequent_location_aal1_'||(customers.location -> 'frequent_location' ->> …
For some reason, Postgres does not seem to be using the index we created. This is the query I am testing:
SELECT "public"."influencers".*
FROM "public"."influencers"
WHERE (ig -> 'id' @> '"4878142508"')
LIMIT 1
After running EXPLAIN:
-> Seq Scan on influencers (cost=0.00..32800.14 rows=216 width=1110)
This shows (as far as I can tell) that no index is being used.
Here are the table and the index we created:
CREATE TABLE public.influencers
(
id integer NOT NULL DEFAULT nextval('influencers_id_seq'::regclass),
location jsonb,
gender text COLLATE pg_catalog."default",
birthdate timestamp without time zone,
ig jsonb,
contact_info jsonb,
created_at timestamp without time zone DEFAULT now(),
updated_at timestamp without time zone DEFAULT now(),
categories text[] COLLATE pg_catalog."default",
search_field text COLLATE pg_catalog."default",
search_vector tsvector,
ig_updated_at timestamp without time …
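
The index definition is cut off in this excerpt, so the following is only a sketch of expression indexes that could match a lookup on `ig -> 'id'`, not the index the poster actually created. The index names are hypothetical, and the rewritten query in the last statement is an assumption about what an equivalent equality lookup might look like.

```sql
-- Hypothetical index names; the original index definition is truncated above.

-- Option A: GIN expression index. The predicate (ig -> 'id') @> '"..."'
-- uses the jsonb containment operator, which the default GIN operator
-- class for jsonb supports.
CREATE INDEX influencers_ig_id_gin
    ON public.influencers USING gin ((ig -> 'id'));

-- Option B: B-tree expression index on the text value, for plain equality.
CREATE INDEX influencers_ig_id_btree
    ON public.influencers ((ig ->> 'id'));

-- Refresh planner statistics so estimates reflect the new indexes.
ANALYZE public.influencers;

-- Assumed equivalent lookup that can use the B-tree index from option B.
EXPLAIN
SELECT *
FROM public.influencers
WHERE ig ->> 'id' = '4878142508'
LIMIT 1;
```

Whether the planner chooses either index still depends on its row estimates and on the LIMIT clause, so the EXPLAIN output is what to check after creating them.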
We have a table with about 500k rows. The table is expected to grow to millions of records.
This is what the table looks like:
CREATE TABLE public.influencers
(
id integer NOT NULL DEFAULT nextval('influencers_id_seq'::regclass),
location jsonb,
gender text COLLATE pg_catalog."default",
birthdate timestamp without time zone,
ig jsonb,
contact_info jsonb,
created_at timestamp without time zone DEFAULT now(),
updated_at timestamp without time zone DEFAULT now(),
categories text[] COLLATE pg_catalog."default",
search_field text COLLATE pg_catalog."default",
search_vector tsvector,
ig_updated_at timestamp without time zone,
CONSTRAINT influencers_pkey PRIMARY KEY (id),
CONSTRAINT ig_id_must_exist CHECK (ig ? 'id'::text),
CONSTRAINT ig_username_must_exist CHECK (ig ? 'username'::text)
)
These are some of the queries we need to run efficiently:
SELECT "public"."influencers".*
FROM …