ActiveRecord Query Union

Lan*_*opp 82 union activerecord ruby-on-rails active-relation

I've written a couple of complex queries (at least to me) with Ruby on Rails' query interface:

watched_news_posts = Post.joins(:news => :watched).where(:watched => {:user_id => id})
watched_topic_posts = Post.joins(:post_topic_relationships => {:topic => :watched}).where(:watched => {:user_id => id})

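Both of these queries work fine individually. Both return Post objects. I would like to combine these posts into a single ActiveRelation. Since there could be hundreds of thousands of posts at some point, this needs to be done at the database level. If it were a MySQL query, I could simply use the UNION operator. Does anybody know if I can do something similar with RoR's query interface?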

Tim*_*ore 89

Here's a quick little module I wrote that allows you to UNION multiple scopes. It also returns the results as an instance of ActiveRecord::Relation.

module ActiveRecord::UnionScope
  def self.included(base)
    base.send :extend, ClassMethods
  end

  module ClassMethods
    def union_scope(*scopes)
      id_column = "#{table_name}.id"
      sub_query = scopes.map { |s| s.select(id_column).to_sql }.join(" UNION ")
      where "#{id_column} IN (#{sub_query})"
    end
  end
end

Here's the gist: https://gist.github.com/tlowrimore/5162327

Edit:

As requested, here's an example of how UnionScope works:

class Property < ActiveRecord::Base
  include ActiveRecord::UnionScope

  # some silly, contrived scopes
  scope :active_nearby,     -> { where(active: true).where('distance <= 25') }
  scope :inactive_distant,  -> { where(active: false).where('distance >= 200') }

  # A union of the aforementioned scopes
  scope :active_near_and_inactive_distant, -> { union_scope(active_nearby, inactive_distant) }
end
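To make the generated SQL more concrete, here is a plain-Ruby sketch (no ActiveRecord required) that mimics the string composition inside union_scope. The helper name union_where_clause and the two SQL strings are assumptions made up for illustration; the real strings come from `to_sql` and depend on the database adapter.

```ruby
# Plain-Ruby sketch: mimics how union_scope builds its WHERE clause.
# `union_where_clause` is a hypothetical helper, not part of the module above.
def union_where_clause(table_name, scope_sqls)
  id_column = "#{table_name}.id"
  # Join the per-scope id queries with UNION, as union_scope does.
  sub_query = scope_sqls.join(" UNION ")
  "#{id_column} IN (#{sub_query})"
end

# Hand-written stand-ins for `scope.select(id_column).to_sql` (assumed output).
active_nearby_sql    = "SELECT properties.id FROM properties WHERE active = 1 AND (distance <= 25)"
inactive_distant_sql = "SELECT properties.id FROM properties WHERE active = 0 AND (distance >= 200)"

clause = union_where_clause("properties", [active_nearby_sql, inactive_distant_sql])
puts clause
```

Because the result is only a WHERE condition on the model's own table, the relation it produces can still be chained with further scopes, ordering, or pagination.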

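  • A quick warning: this approach is highly problematic from a performance perspective in MySQL, because the subquery will be treated as dependent and executed for each record in the table (see http://www.percona.com/blog/2010/10/25/mysql-limitations-part-3-subqueries/). (7 upvotes)
  • This is really a more complete answer than the others listed here. Works great! (2 upvotes)
  • The solution is "almost" correct and I gave it a +1, but I ran into a problem that I fixed here: https://gist.github.com/lsiden/260167a4d3574a580d97 (2 upvotes)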

Ell*_*son 59

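I have also run into this problem, and now my go-to strategy is to generate the SQL (by hand or by calling to_sql on an existing scope) and then put it in the from clause. I can't guarantee it will be any more efficient than your accepted method, but it's relatively easy on the eyes and gives you back a normal ARel object.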

watched_news_posts = Post.joins(:news => :watched).where(:watched => {:user_id => id})
watched_topic_posts = Post.joins(:post_topic_relationships => {:topic => :watched}).where(:watched => {:user_id => id})

Post.from("(#{watched_news_posts.to_sql} UNION #{watched_topic_posts.to_sql}) AS posts")

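You can do this with two different models as well, but you need to make sure they both "look the same" inside the UNION: you can use select on both queries to make sure they produce the same columns.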

topics = Topic.select('user_id AS author_id, description AS body, created_at')
comments = Comment.select('author_id, body, created_at')

Comment.from("(#{comments.to_sql} UNION #{topics.to_sql}) AS comments")
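The derived-table string passed to from can be factored into a small helper. The sketch below is plain Ruby; union_from and its all: option are hypothetical names introduced here, and the SQL strings are hand-written stand-ins for what `to_sql` might return. UNION ALL is offered as an option because it skips duplicate elimination, which only fits when duplicates are acceptable or impossible.

```ruby
# Plain-Ruby sketch: builds the "(... UNION ...) AS alias" string used with `from`.
# `union_from` is a hypothetical helper, not an ActiveRecord method.
def union_from(alias_name, *sqls, all: false)
  keyword = all ? " UNION ALL " : " UNION "
  "(#{sqls.join(keyword)}) AS #{alias_name}"
end

# Hand-written stand-ins for `comments.to_sql` and `topics.to_sql` (assumed output).
comments_sql = "SELECT author_id, body, created_at FROM comments"
topics_sql   = "SELECT user_id AS author_id, description AS body, created_at FROM topics"

derived = union_from("comments", comments_sql, topics_sql)
puts derived

# In a Rails app this would be used roughly as:
#   Comment.from(union_from("comments", comments.to_sql, topics.to_sql))
```

Aliasing the derived table to the model's own table name is what lets the model's default column references keep working against the combined rows.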


Lan*_*opp 11

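I did manage to come up with another solution to this problem, based on Olives' answer. It feels a little like a hack, but it returns an instance of ActiveRelation, which is what I was after in the first place.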

Post.where('posts.id IN
      (
        SELECT post_topic_relationships.post_id FROM post_topic_relationships
          INNER JOIN "watched" ON "watched"."watched_item_id" = "post_topic_relationships"."topic_id" AND "watched"."watched_item_type" = \'Topic\' WHERE "watched"."user_id" = ?
      )
      OR posts.id IN
      (
        SELECT "posts"."id" FROM "posts" INNER JOIN "news" ON "news"."id" = "posts"."news_id"
        INNER JOIN "watched" ON "watched"."watched_item_id" = "news"."id" AND "watched"."watched_item_type" = \'News\' WHERE "watched"."user_id" = ?
      )', id, id)

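I'd still appreciate it if anybody has suggestions to optimize this or improve the performance, because it's essentially executing three queries and feels a little redundant.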


小智 9

How about...

def union(scope1, scope2)
  ids = scope1.pluck(:id) + scope2.pluck(:id)
  where(id: ids.uniq)
end

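  • Note that this will perform three queries rather than one, since each `pluck` call is a query in itself. (14 upvotes)
  • This is a really good solution because it doesn't return an array, so you can still use the `.order` or `.paginate` methods... it keeps the ORM classes. (3 upvotes)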
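As a rough illustration of what happens in Ruby memory between those queries, the sketch below uses made-up id arrays in place of the two `pluck(:id)` results. It only shows the set-union step; with hundreds of thousands of posts, every id would be loaded into the application and sent back in the final IN list.

```ruby
# Plain-Ruby sketch: the in-memory union step of the pluck-based approach.
# These arrays are made-up stand-ins for scope1.pluck(:id) and scope2.pluck(:id).
news_post_ids  = [1, 2, 3]
topic_post_ids = [3, 4]

# Concatenate and de-duplicate, as in the answer above.
combined = (news_post_ids + topic_post_ids).uniq

# Array#| performs the same set union in one step.
same_result = news_post_ids | topic_post_ids

puts combined.inspect
puts combined == same_result
```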

dgi*_*rez 7

You could also use Brian Hempel's active_record_union gem, which extends ActiveRecord with a union method for scopes.

Your query would look like this:

Post.joins(:news => :watched).
  where(:watched => {:user_id => id}).
  union(Post.joins(:post_topic_relationships => {:topic => :watched}).
    where(:watched => {:user_id => id}))

Hopefully this will eventually be merged into ActiveRecord some day.


Oli*_*ves 6

Could you use an OR instead of a UNION?

Then you could do something like:

Post.joins(:news => :watched, :post_topic_relationships => {:topic => :watched})
.where("watched.user_id = :id OR topic_watched.user_id = :id", :id => id)

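(Since you are joining the watched table twice, I'm not too sure what the table aliases will be in the query.)

Since there are a lot of joins, it might also be quite heavy on the database, but it might be possible to optimize it.

  • Sorry to get back to you so late, but I've been on vacation for the last couple of days. The problem I had when I tried your answer was that the joins method caused both tables to be joined, rather than producing two separate queries that could then be compared. However, your idea was sound and did give me another idea. Thanks for the help. (2 upvotes)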

ric*_*sun 5

Arguably, this improves readability, but not necessarily performance:

def my_posts
  Post.where <<-SQL, self.id, self.id
    posts.id IN
    (SELECT post_topic_relationships.post_id FROM post_topic_relationships
    INNER JOIN watched ON watched.watched_item_id = post_topic_relationships.topic_id
    AND watched.watched_item_type = 'Topic'
    AND watched.user_id = ?
    UNION
    SELECT posts.id FROM posts
    INNER JOIN news ON news.id = posts.news_id
    INNER JOIN watched ON watched.watched_item_id = news.id
    AND watched.watched_item_type = 'News'
    AND watched.user_id = ?)
  SQL
end

This method returns an ActiveRecord::Relation, so you can call it like this:

my_posts.order("posts.id DESC")