ActiveRecord OR operator slows a query down by 10x. Why?

Question

D-N*_*ice 6 sql database postgresql activerecord ruby-on-rails

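I have an ActiveRecord query that chains two queries together with the OR operator. The results come back correct, but the combined query takes roughly 10 times as long as either of the two queries run on its own.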

We have an Event model and an Invitation model. A User can be invited to an Event either as a target of an invitation filter, or individually through an Invitation record.

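So, when determining how many users are invited to a particular event, we have to look at all the Invitations and also at all the users who match the filter. We do that here: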

@invited_count = @invited_by_individual.or(@invited_by_filter).distinct.count(:id)

It is important to note that both the @invited_by_individual and @invited_by_filter relations contain references and includes statements.

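Now, the problem is that when we run this query it takes about 1200 ms. If we run the queries separately, each one takes only about 80 ms. So @invited_by_filter.distinct.count and @invited_by_individual.distinct.count both return in about 80 ms, but neither result is complete on its own.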
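Why neither count is complete on its own, and why the two cannot simply be added together, can be sketched in plain Ruby. The user ids below are made up for illustration, and invited_count is a hypothetical helper, not part of the application:

```ruby
# Hypothetical user ids; the real values would come from the two relations.
INVITED_BY_INDIVIDUAL_IDS = [4, 5, 6].freeze
INVITED_BY_FILTER_IDS = [1, 2, 3, 4, 5].freeze

# Array#| is a set union: it keeps each id once, which is the role
# DISTINCT plays in the combined SQL count.
def invited_count(individual_ids, filter_ids)
  (individual_ids | filter_ids).size
end

puts invited_count(INVITED_BY_INDIVIDUAL_IDS, INVITED_BY_FILTER_IDS)
# prints 6: users 4 and 5 appear in both lists but are counted once
```

Adding the two separate counts would give 8 here, so the combined query has to deduplicate across both sources.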

Is there any way to speed up the query with the OR operator? Why is this happening?

Here is the SQL generated by the ActiveRecord queries:

Fast, single query:

(79.7ms)  
SELECT COUNT(DISTINCT "users"."id") 
FROM "users" 
LEFT OUTER JOIN "invitations" 
ON "invitations"."user_id" = "users"."id" 
WHERE "invitations"."event_id" = $1  [["event_id", 732]]

Slow, combined query:

(1220.7ms)  
SELECT COUNT(DISTINCT "users"."id") 
FROM "users" 
LEFT OUTER JOIN "invitations" 
ON "invitations"."user_id" = "users"."id" 
WHERE ("invitations"."event_id" = $1 OR "users"."organization_id" = $2)  [["event_id", 732], ["organization_id", 13]]

Update: here is the EXPLAIN output:

(1418.2ms)  SELECT COUNT(DISTINCT "users"."id") FROM "users" LEFT OUTER JOIN "invitations" ON "invitations"."user_id" = "users"."id" WHERE ("users"."root_organization_id" = $1 OR "invitations"."event_id" = $2)  [["root_organization_id", -1], ["event_id", 749]]
 => 
EXPLAIN for: SELECT COUNT(DISTINCT "users"."id") FROM "users" LEFT OUTER JOIN "invitations" ON "invitations"."user_id" = "users"."id" WHERE ("users"."root_organization_id" = $1 OR "invitations"."event_id" = $2) [["root_organization_id", -1], ["event_id", 749]]

 #=> QUERY PLAN
                                                     
 Aggregate  (cost=121781.56..121781.57 rows=1 width=8)
   ->  Hash Right Join  (cost=113248.88..121778.64 rows=1165 width=8)
         Hash Cond: (invitations.user_id = users.id)
         Filter: ((users.root_organization_id = '-1'::integer) OR (invitations.event_id = 749))
         ->  Seq Scan on invitations  (cost=0.00..1299.70 rows=63470 width=8)
         ->  Hash  (cost=93513.28..93513.28 rows=1135328 width=12)
               ->  Seq Scan on users  (cost=0.00..93513.28 rows=1135328 width=12)
(7 rows)

Update 2: EXPLAIN output for the queries run individually, which do use the indexes:

(91.5ms)  SELECT COUNT(*) FROM "users" INNER JOIN "invitations" ON "invitations"."user_id" = "users"."id" WHERE "users"."root_organization_id" = $1  [["root_organization_id", -1]]
 => 
EXPLAIN for: SELECT COUNT(*) FROM "users" INNER JOIN "invitations" ON "invitations"."user_id" = "users"."id" WHERE "users"."root_organization_id" = $1 [["root_organization_id", -1]]

 #=> QUERY PLAN

 Aggregate  (cost=19.05..19.06 rows=1 width=8)
   ->  Nested Loop  (cost=0.72..19.05 rows=1 width=0)
         ->  Index Scan using index_users_on_root_organization_id on users  (cost=0.43..4.45 rows=1 width=8)
               Index Cond: (root_organization_id = '-1'::integer)
         ->  Index Only Scan using index_invitations_on_user_id on invitations  (cost=0.29..14.57 rows=3 width=4)
               Index Cond: (user_id = users.id)
(6 rows)

and

EXPLAIN for: SELECT COUNT(DISTINCT "users"."id") FROM "users" LEFT OUTER JOIN "invitations" ON "invitations"."user_id" = "users"."id" WHERE "invitations"."event_id" = $1 [["event_id", 749]]

 #=> QUERY PLAN

 Aggregate  (cost=536.34..536.35 rows=1 width=8)
   ->  Nested Loop  (cost=0.72..536.19 rows=62 width=8)
         ->  Index Scan using index_invitations_on_event_id on invitations  (cost=0.29..11.98 rows=62 width=4)
               Index Cond: (event_id = 749)
         ->  Index Only Scan using users_pkey on users  (cost=0.43..8.45 rows=1 width=8)
               Index Cond: (id = invitations.user_id)
(6 rows)

Answer 1

Lam*_*han 1

UNION lets you take advantage of both indexes while still preventing duplicates.

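# PostgreSQL (before version 16) requires an alias on a subquery in FROM;
# aliasing it as "users" keeps column references such as users.id valid.
User.from(
  "(#{@invited_by_individual.to_sql}
  UNION
  #{@invited_by_filter.to_sql}) AS users"
).count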
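One way to picture why the UNION form can use both indexes while the OR form cannot is a toy model in plain Ruby. This is only a sketch under stated assumptions, not how PostgreSQL actually executes the queries: UserRow, InvitationRow, count_with_or and count_with_union are hypothetical names, the data is invented, and pre-built hashes stand in for the indexes. It is consistent with the plans in the question, where the OR query shows sequential scans with the filter applied on the join, and the single-condition queries show index scans.

```ruby
require "set"

# Hypothetical in-memory rows standing in for the users and invitations tables.
UserRow = Struct.new(:id, :root_organization_id)
InvitationRow = Struct.new(:user_id, :event_id)

# OR form: join every user to its invitations, then test each joined row.
# Returns [distinct user count, joined rows examined].
def count_with_or(users, invitations, org_id, event_id)
  invitations_by_user = invitations.group_by(&:user_id)
  ids = Set.new
  rows_examined = 0
  users.each do |user|
    # LEFT OUTER JOIN: a user without invitations still yields one row.
    joined = invitations_by_user.fetch(user.id, [nil])
    joined.each do |invitation|
      rows_examined += 1
      matches_filter = user.root_organization_id == org_id
      matches_event = !invitation.nil? && invitation.event_id == event_id
      ids << user.id if matches_filter || matches_event
    end
  end
  [ids.size, rows_examined]
end

# UNION form: two narrow lookups (the hashes play the role of the indexes),
# then a duplicate-removing union of the resulting ids.
# Returns [distinct user count, rows fetched by the two lookups].
def count_with_union(users_by_org, invitations_by_event, org_id, event_id)
  by_filter = users_by_org.fetch(org_id, []).map(&:id)
  by_individual = invitations_by_event.fetch(event_id, []).map(&:user_id)
  [(by_filter | by_individual).size, by_filter.size + by_individual.size]
end

users = (1..1000).map { |id| UserRow.new(id, id <= 5 ? 13 : 99) }
invitations = [4, 5, 6].map { |user_id| InvitationRow.new(user_id, 732) }

p count_with_or(users, invitations, 13, 732)
# prints [6, 1000]
p count_with_union(users.group_by(&:root_organization_id),
                   invitations.group_by(&:event_id), 13, 732)
# prints [6, 8]
```

Both forms return the same count, but in this model the OR form has to look at every joined row because the condition spans two tables, while each branch of the UNION touches only the rows its own lookup returns.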

Archived:	4 years, 9 months ago
Views:	342
Last activity:	4 years, 8 months ago