Rails complex query: counting unique records based on a truth table

Vic*_*tor 8 mysql sql ruby-on-rails

Using Rails. I have the following code:

class TypeOfBlock < ActiveRecord::Base
  has_and_belongs_to_many :patients
end

class Patient < ActiveRecord::Base
  has_and_belongs_to_many :type_of_blocks, dependent: :destroy
end

With these tables:

╔══════════════╗
║type_of_blocks║
╠══════╦═══════╣
║ id   ║ name  ║
╠══════╬═══════╣
║  1   ║ UP    ║
║  2   ║ LL    ║
║  3   ║ T     ║
╚══════╩═══════╝

╔═══════════════════════════════╗
║    patients_type_of_blocks    ║
╠══════════════════╦════════════╣
║ type_of_block_id ║ patient_id ║
╠══════════════════╬════════════╣
║                1 ║          1 ║
║                1 ║          2 ║
║                2 ║          2 ║
║                3 ║          3 ║
║                2 ║          4 ║
║                1 ║          5 ║
║                1 ║          6 ║
║                2 ║          6 ║
║                3 ║          6 ║
╚══════════════════╩════════════╝

I want to count the number of unique patients for each combination of block types. This is the expected result:

# Expected results (just like a truth table)
UP (patients with type_of_block_id 1 only) = 2 patients
UP + LL (patients with type_of_block_ids 1 and 2 only) = 1 patient
UP + T (patients with type_of_block_ids 1 and 3 only) = 0 patients
LL (patients with type_of_block_id 2 only) = 1 patient
LL + T (patients with type_of_block_ids 2 and 3 only) = 0 patients
T (patients with type_of_block_id 3 only) = 1 patient
UP + LL + T (patients with type_of_block_ids 1, 2 and 3) = 1 patient

I have tried joining the tables like this:

up_ll =
  TypeOfBlock.
    joins("join patients_type_of_blocks on patients_type_of_blocks.type_of_block_id = type_of_blocks.id").
    where("patients_type_of_blocks.type_of_block_id = 1 and patients_type_of_blocks.type_of_block_id = 2").
    size

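But this gets too complex, and the number is wrong. I wanted to try raw SQL, but Rails 4 deprecates it and asks me to use ModelClass.find_by_sql instead.

How can I produce the expected results above?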

Bor*_*aMa 5

The only solution that comes to my mind is to use raw SQL and take advantage of the group_concat function, as shown here.

The required SQL is this:

SELECT
  combination,
  count(*) as cnt
FROM (
       SELECT
         ptb.patient_id,
         group_concat(tb.name ORDER BY tb.name) AS combination
       FROM type_of_blocks tb
       INNER JOIN patients_type_of_blocks ptb ON ptb.type_of_block_id = tb.id
       GROUP BY ptb.patient_id) patient_combinations
GROUP BY combination;

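The inner select groups the rows by patient and builds, for each patient, the combination of block types that patient has. The outer select then simply counts the patients in each combination.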
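To make the two grouping steps concrete, here is a minimal plain-Ruby sketch that mirrors them in memory, using the join-table rows from the question. The constant and method names (JOIN_ROWS, BLOCK_NAMES, patient_combinations, combination_counts) are made up for illustration and are not part of the original answer; the real work is still done by the SQL.

```ruby
# Rows of patients_type_of_blocks from the question: [type_of_block_id, patient_id]
JOIN_ROWS = [[1, 1], [1, 2], [2, 2], [3, 3], [2, 4], [1, 5], [1, 6], [2, 6], [3, 6]].freeze
BLOCK_NAMES = { 1 => "UP", 2 => "LL", 3 => "T" }.freeze

# Inner select: one sorted, comma-joined combination per patient
# (the in-memory counterpart of GROUP BY patient_id + group_concat).
def patient_combinations(rows, names)
  rows.group_by { |_block_id, patient_id| patient_id }
      .transform_values { |pairs| pairs.map { |block_id, _| names.fetch(block_id) }.sort.join(",") }
end

# Outer select: count how many patients share each combination.
def combination_counts(rows, names)
  patient_combinations(rows, names)
    .values
    .each_with_object(Hash.new(0)) { |combination, counts| counts[combination] += 1 }
end

p combination_counts(JOIN_ROWS, BLOCK_NAMES)
```

For the sample data this yields the same five combinations and counts as the query; combinations that no patient has are simply absent.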

The query returns the following (see the SQL Fiddle):

combination     cnt
LL              1
LL,T,UP         1
LL,UP           1
T               1
UP              2

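As you can see, the query does not return zero counts. This has to be solved in Ruby code (perhaps by initializing a hash with all combinations set to zero and then merging it with the query counts).

To integrate this query into Ruby, just use the find_by_sql method on any model (for example, converting the results to a hash):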

sql = <<-EOF
        ...the query from above...
        EOF

TypeOfBlock.find_by_sql(sql).to_a.reduce({}) { |h, u| h[u.combination] = u.cnt; h }
# => { "LL" => 1, "LL,T,UP" => 1, "LL,UP" => 1, "T" => 1, "UP" => 2 }
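The zero-filling step mentioned above could be sketched roughly as follows. This is only an illustration: zero_counts is a hypothetical helper, the names are hard-coded from the question's table (in a Rails app they would presumably come from something like TypeOfBlock.pluck(:name)), and it assumes Ruby's string sort agrees with the ORDER BY inside group_concat, which may not hold for every collation.

```ruby
# Block names as they appear in type_of_blocks (taken from the question).
names = %w[UP LL T]

# Counts in the shape returned by the find_by_sql call above.
query_counts = { "LL" => 1, "LL,T,UP" => 1, "LL,UP" => 1, "T" => 1, "UP" => 2 }

# Build a hash with every non-empty combination of names mapped to 0.
# Sorting first keeps each key in the same order group_concat produces.
def zero_counts(names)
  sorted = names.sort
  (1..sorted.size)
    .flat_map { |k| sorted.combination(k).map { |combo| combo.join(",") } }
    .each_with_object({}) { |key, counts| counts[key] = 0 }
end

# Counts from the query overwrite the zeros; missing combinations stay at 0.
all_counts = zero_counts(names).merge(query_counts)
p all_counts
```

With three block types this gives seven keys, and "LL,T" and "T,UP" keep their zero counts.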

  • It can be extended to handle the zeros using MySQL alone. Check my answer if you are interested :) (3 upvotes)

Luk*_*zda 5

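The answer provided by BoraMa is correct. I just want to address this part:

As you can see, the query does not return zero counts. This has to be solved in Ruby code (perhaps by initializing a hash with all combinations set to zero and then merging it with the query counts).

It can be achieved with pure MySQL: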

SELECT sub.combination, COALESCE(cnt, 0) AS cnt
FROM (SELECT GROUP_CONCAT(Name ORDER BY Name SEPARATOR ' + ') AS combination
      FROM (SELECT p.Name, p.rn, LPAD(BIN(u.N + t.N * 10), size, '0') bitmap
            FROM (SELECT @rownum := @rownum + 1 rn, id, Name
                  FROM type_of_blocks, (SELECT @rownum := 0) r) p
            CROSS JOIN (SELECT 0 N UNION ALL SELECT 1 
                    UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
                    UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7
                    UNION ALL SELECT 8 UNION ALL SELECT 9) u
             CROSS JOIN (SELECT 0 N UNION ALL SELECT 1 
                    UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4
                    UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7
                    UNION ALL SELECT 8 UNION ALL SELECT 9) t
             CROSS JOIN (SELECT COUNT(*) AS size FROM type_of_blocks) o
             WHERE u.N + t.N * 10 < POW(2, size)
             ) b
       WHERE SUBSTRING(bitmap, rn, 1) = '1'
       GROUP BY bitmap
) AS sub
LEFT JOIN (
    SELECT combination, COUNT(*) AS cnt
    FROM (SELECT ptb.patient_id,
                GROUP_CONCAT(tb.name ORDER BY tb.name SEPARATOR ' + ') AS combination
          FROM type_of_blocks tb
          JOIN patients_type_of_blocks ptb 
            ON ptb.type_of_block_id = tb.id
          GROUP BY ptb.patient_id) patient_combinations
    GROUP BY combination   
) AS sub2
  ON sub.combination = sub2.combination
ORDER BY LENGTH(sub.combination), sub.combination; 

SQLFiddleDemo

Output:

╔══════════════╦═════╗
║ combination  ║ cnt ║
╠══════════════╬═════╣
║ T            ║   1 ║
║ LL           ║   1 ║
║ UP           ║   2 ║
║ LL + T       ║   0 ║
║ T + UP       ║   0 ║
║ LL + UP      ║   1 ║
║ LL + T + UP  ║   1 ║
╚══════════════╩═════╝

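How it works:

  1. Generate all possible combinations using the method described by Serpiton (with slight improvements)
  2. Count the combinations that actually occur
  3. Combine the two results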
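The three steps can be followed more easily in a small Ruby sketch of the same bitmap idea. It is an illustration of the technique, not a replacement for the SQL: all_combinations is a hypothetical helper, the names are listed in id order as in the question, and query_counts stands in for the counting subquery.

```ruby
# Names in row-number order (ordered by id in the question's table).
names = %w[UP LL T]

# Step 1: every number from 1 to 2**size - 1 is read as a zero-padded
# bitmap; the character at position i selects the i-th name.
def all_combinations(names)
  size = names.size
  (1...(2**size)).map do |n|
    bitmap = n.to_s(2).rjust(size, "0")
    names.each_with_index
         .select { |_name, index| bitmap[index] == "1" }
         .map(&:first)
         .sort
         .join(" + ")
  end
end

# Step 2: counts of the combinations that actually occur (sample data).
query_counts = { "T" => 1, "LL" => 1, "UP" => 2, "LL + UP" => 1, "LL + T + UP" => 1 }

# Step 3: the LEFT JOIN + COALESCE, i.e. default missing combinations to 0.
result = all_combinations(names).each_with_object({}) do |combination, counts|
  counts[combination] = query_counts.fetch(combination, 0)
end
p result
```

The SQL also generates the all-zero bitmap, but it selects no names and therefore drops out, which is why the sketch starts at 1.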

To better understand how it works, here is a PostgreSQL version that generates all the combinations:

WITH all_combinations AS (
    SELECT string_agg(b.Name ,' + ' ORDER BY b.Name) AS combination
    FROM (SELECT p.Name, p.rn, RIGHT(o.n::bit(16)::text, size) AS bitmap
          FROM (SELECT *, ROW_NUMBER() OVER(ORDER BY id)::int AS  rn
                FROM type_of_blocks )AS p
          CROSS JOIN generate_series(1, 100000) AS o(n)     
          ,LATERAL(SELECT COUNT(*)::int AS size FROM type_of_blocks) AS s
          WHERE o.n < 2 ^ size
         ) b
    WHERE SUBSTRING(b.bitmap, b.rn, 1) = '1'
    GROUP BY b.bitmap
)
SELECT sub.combination, COALESCE(sub2.cnt, 0) AS cnt
FROM all_combinations sub
LEFT JOIN (SELECT combination, COUNT(*) AS cnt
           FROM (SELECT ptb.patient_id,
                 string_agg(tb.name,' + ' ORDER BY tb.name) AS combination
                 FROM type_of_blocks tb
                 JOIN patients_type_of_blocks ptb 
                   ON ptb.type_of_block_id = tb.id
                 GROUP BY ptb.patient_id) patient_combinations
           GROUP BY combination) AS sub2
  ON sub.combination = sub2.combination
ORDER BY LENGTH(sub.combination), sub.combination; 

SqlFiddleDemo2