How to move sorting to the database level

Bra*_*cil 10 postgresql performance ruby-on-rails

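I have a Rails application that uses postgresql as its database. It sorts different types of users by location, and then by the reputation points they have received for various activities on the site. Here is a sample query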

 @lawyersbylocation = User.lawyers_by_province(province).sort_by{ |u| -u.total_votes }

The query calls the scope lawyers_by_province on the User.rb model:

 scope :lawyers_by_province, lambda {|province|
  joins(:contact).
  where( contacts: {province_id: province},
         users: {lawyer: true})

  }

Then, still on the User.rb model, it calculates the reputation points they have.

 def total_votes
    answerkarma = AnswerVote.joins(:answer).where(answers: {user_id: self.id}).sum('value') 
    contributionkarma = Contribution.where(user_id: self.id).sum('value')
    bestanswer = BestAnswer.joins(:answer).where(answers: {user_id: self.id}).sum('value') 
    answerkarma + contributionkarma + bestanswer
 end

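I've been told that if the site reaches a certain number of users it will become very slow, because the sorting is done in Ruby rather than at the database level. I know the comment refers to the total_votes method, but I'm not sure whether lawyers_by_province happens at the database level or in Ruby, since it uses Rails helpers to query the database. It looks like a mix of both, but I'm not sure what that means for efficiency.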

Can you show me how to write this so that the query happens at the database level, and is therefore efficient enough not to break my site?

Update: Here are the schemas for the three models used in the total_votes method.

 create_table "answer_votes", force: true do |t|
    t.integer  "answer_id"
    t.integer  "user_id"
    t.integer  "value"
    t.boolean  "lawyervote"
    t.boolean  "studentvote"
    t.datetime "created_at"
    t.datetime "updated_at"
  end

  add_index "answer_votes", ["answer_id"], name: "index_answer_votes_on_answer_id", using: :btree
  add_index "answer_votes", ["lawyervote"], name: "index_answer_votes_on_lawyervote", using: :btree
  add_index "answer_votes", ["studentvote"], name: "index_answer_votes_on_studentvote", using: :btree
  add_index "answer_votes", ["user_id"], name: "index_answer_votes_on_user_id", using: :btree



create_table "best_answers", force: true do |t|
    t.integer  "answer_id"
    t.integer  "user_id"
    t.integer  "value"
    t.datetime "created_at"
    t.datetime "updated_at"
    t.integer  "question_id"
  end

  add_index "best_answers", ["answer_id"], name: "index_best_answers_on_answer_id", using: :btree
  add_index "best_answers", ["user_id"], name: "index_best_answers_on_user_id", using: :btree



create_table "contributions", force: true do |t|
    t.integer  "user_id"
    t.integer  "answer_id"
    t.integer  "value"
    t.datetime "created_at"
    t.datetime "updated_at"
  end

  add_index "contributions", ["answer_id"], name: "index_contributions_on_answer_id", using: :btree
  add_index "contributions", ["user_id"], name: "index_contributions_on_user_id", using: :btree

In addition, here is the contacts schema, which contains the province_id used in the lawyers_by_province scope on the user.rb model

  create_table "contacts", force: true do |t|
    t.string   "firm"
    t.string   "address"
    t.integer  "province_id"
    t.string   "city"
    t.string   "postalcode"
    t.string   "mobile"
    t.string   "office"
    t.integer  "user_id"
    t.string   "website"
    t.datetime "created_at"
    t.datetime "updated_at"
  end

Update: Trying to apply @Shawn's answer, I put this method in the user.rb model

 def self.total_vote_sql
    "( " +
    [
     AnswerVote.joins(:answer).select("user_id, value"),
     Contribution.select("user_id, value"),
     BestAnswer.joins(:answer).select("user_id, value")
    ].map(&:to_sql) * " UNION ALL " + 
    ") as total_votes "
  end

Then in the controller I did this (putting User in front of total_vote_sql)

@lawyersbyprovince = User.select("users.*, sum(total_votes.value) as total_votes").joins("left outer join #{User.total_vote_sql} on users.id = total_votes.user_id").
                            order("total_votes desc").lawyers_by_province(province)

It gave me this error

ActiveRecord::StatementInvalid in LawyerProfilesController#index

PG::Error: ERROR: column reference "user_id" is ambiguous LINE 1: ..."user_id" = "users"."id" left outer join ( SELECT user_id, v... ^ : SELECT users.*, sum(total_votes.value) as total_votes FROM "users" INNER JOIN "contacts" ON "contacts"."user_id" = "users"."id" left outer join ( SELECT user_id, value FROM "answer_votes" INNER JOIN "answers" ON "answers"."id" = "answer_votes"."answer_id" UNION ALL SELECT user_id, value FROM "contributions" UNION ALL SELECT user_id, value FROM "best_answers" INNER JOIN "answers" ON "answers"."id" = "best_answers"."answer_id") as total_votes on users.id = total_votes.user_id WHERE "contacts"."province_id" = 6 AND "users"."lawyer" = 't' ORDER BY total_votes desc

Update: After applying the edits to Shawn's post, the error message is now this:

PG::Error: ERROR: column reference "user_id" is ambiguous LINE 1: ..."user_id" = "users"."id" left outer join ( SELECT user_id as... ^ : SELECT users.*, sum(total_votes.value) as total_votes FROM "users" INNER JOIN "contacts" ON "contacts"."user_id" = "users"."id" left outer join ( SELECT user_id as tv_user_id, value FROM "answer_votes" INNER JOIN "answers" ON "answers"."id" = "answer_votes"."answer_id" UNION ALL SELECT user_id as tv_user_id, value FROM "contributions" UNION ALL SELECT user_id as tv_user_id, value FROM "best_answers" INNER JOIN "answers" ON "answers"."id" = "best_answers"."answer_id") as total_votes on users.id = total_votes.tv_user_id WHERE "contacts"."province_id" = 6 AND "users"."lawyer" = 't' ORDER BY total_votes desc

Dav*_*dge 8

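Caveat: I'm fairly new to Rails, but this is the technique I use to stay sane while constantly having to go directly to the database for performance reasons. I need to do that all the time, because you can only have two of the following: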

  1. Bulk data processing
  2. A pure Rails technique
  3. Good performance

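Anyway, once you need to go into these hybrid approaches that are part Ruby and part SQL, I feel you might as well go the whole way and choose a pure SQL solution.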

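  1. It's easier to debug, because you can isolate the two code layers more effectively.
  2. It's easier to optimise the SQL, because if SQL is not your strong point you have a better chance of getting a dedicated SQL person to look it over for you.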

I think the SQL you're looking for here is along these lines:

with cte_scoring as (
  select
    users.id,
    (select Coalesce(sum(value),0) from answer_votes  where answer_votes.user_id  = users.id) +
    (select Coalesce(sum(value),0) from best_answers  where best_answers.user_id  = users.id) +
    (select Coalesce(sum(value),0) from contributions where contributions.user_id = users.id) total_score
  from
    users join
    contacts on (contacts.user_id = users.id)
  where
    users.lawyer         = 'true'          and
    contacts.province_id = #{province.id})
select   id,
         total_score
from     cte_scoring
order by total_score desc
limit    #{limit_number}

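This should give you the best performance. Subqueries in the SELECT are not ideal, but the technique does restrict each sum to the user_id whose score you are checking.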
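Two things are worth checking before using that SQL as-is. First, the question's total_votes credits answer votes and best answers to the author of the answer (via answers.user_id), whereas the subqueries above filter on answer_votes.user_id and best_answers.user_id directly. Second, the province id and limit are interpolated into the string. Here is a minimal sketch of a variant that joins through answers and coerces both inputs to integers first. The helper name lawyer_scoring_sql and its parameters are hypothetical, and it assumes the answers table has id and user_id columns, as the question's code implies:

```ruby
# Hypothetical helper (not from the original post): builds the scoring SQL
# with integer-checked inputs so nothing unexpected gets interpolated.
def lawyer_scoring_sql(province_id, limit_number)
  province_id  = Integer(province_id)   # raises ArgumentError on non-numeric strings
  limit_number = Integer(limit_number)

  <<~SQL
    with cte_scoring as (
      select
        users.id,
        -- votes on answers written by this user (mirrors total_votes)
        (select coalesce(sum(answer_votes.value), 0)
           from answer_votes
           join answers on answers.id = answer_votes.answer_id
          where answers.user_id = users.id) +
        -- best-answer awards on answers written by this user
        (select coalesce(sum(best_answers.value), 0)
           from best_answers
           join answers on answers.id = best_answers.answer_id
          where answers.user_id = users.id) +
        (select coalesce(sum(contributions.value), 0)
           from contributions
          where contributions.user_id = users.id) as total_score
      from users
      join contacts on contacts.user_id = users.id
      where users.lawyer = true
        and contacts.province_id = #{province_id}
    )
    select id, total_score
    from cte_scoring
    order by total_score desc
    limit #{limit_number}
  SQL
end

sql_string = lawyer_scoring_sql("6", 10)
puts sql_string
```

Qualifying every column with its table name also avoids the kind of "column reference is ambiguous" error shown in the question, since both answer_votes and answers have a user_id column.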

Integrating into Rails: if you define sql_string as the SQL code:

scoring = ActiveRecord::Base.connection.execute sql_string

...then you get back a set of rows, each of which behaves like a hash, that you can use like this:

scoring.each do |lawyer_score|
  lawyer = User.find(lawyer_score["id"])
  score  = lawyer_score["total_score"]
  ...
end
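Note that calling User.find inside the loop issues one extra query per row. If the limit is large, it may be worth loading all the users in a single call and looking them up from a hash instead. A rough sketch follows; the helper lawyers_with_scores is hypothetical, and the block stands in for whatever loads the users, which in Rails could be something like `{ |ids| User.where(id: ids) }`. The Integer() calls are there because, depending on the Rails and pg versions, the raw result may hand values back as strings:

```ruby
# Hypothetical helper: pairs each scoring row with its user object,
# fetching all users through a single call to the given block.
# rows  - enumerable of hashes with "id" and "total_score" keys
# block - receives the array of ids, returns objects responding to #id
def lawyers_with_scores(rows)
  scores = rows.map { |row| [Integer(row["id"]), Integer(row["total_score"])] }
  users_by_id = yield(scores.map(&:first)).each_with_object({}) { |user, h| h[user.id] = user }
  # Iterate over scores (not the loaded users) to keep the database's ordering.
  scores.map { |id, score| [users_by_id[id], score] }
end

# Stand-in data so the sketch runs without a database.
FakeUser = Struct.new(:id, :name)
rows = [{ "id" => "2", "total_score" => "15" }, { "id" => "1", "total_score" => "7" }]

ranked = lawyers_with_scores(rows) do |ids|
  ids.map { |id| FakeUser.new(id, "lawyer #{id}") }
end

ranked.each { |lawyer, score| puts "#{lawyer.name}: #{score}" }
```

The ordering comes from the SQL's order by, so the lookup hash only attaches the user objects and does not re-sort anything in Ruby.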