How to find duplicates and gaps in this scenario in MySQL

Awa*_*rni 7 mysql select

Hi, I have a table that looks like this:

-----------------------------------------------------------
|  id  |  group_id | source_id | target_id | sortsequence |
-----------------------------------------------------------
|  2   |    1      |    2      |   4       |     1        |   
-----------------------------------------------------------
|  4   |    1      |    20     |   2       |     1        |   
-----------------------------------------------------------
|  5   |    1      |    2      |   14      |     1        |   
-----------------------------------------------------------
|  7   |    1      |    2      |   7       |     3        |   
-----------------------------------------------------------
|  20  |    2      |    20     |   4       |     3        |   
-----------------------------------------------------------
|  21  |    2      |    20     |   4       |     1        |   
-----------------------------------------------------------

Scenario

There are two scenarios to handle.

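  1. The sortsequence column value should be unique for a given source_id and group_id. For example, all records having group_id = 1 AND source_id = 2 should have a unique sortsequence. In the sample above, the records with id = 2 and 5, which have group_id = 1 and source_id = 2, have the same sortsequence, which is 1. These are faulty records, and I need to find them.
  2. If group_id and source_id are the same, the sortsequence column values should be continuous, with no gaps. For example, in the table above the records with id = 20 and 21 have the same group_id and source_id, and their sortsequence values are 3 and 1. Even though these values are unique, there is a gap in the sortsequence values. I need to find these records as well.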
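To make the two rules concrete, here is a minimal Python sketch that applies them to the sample rows above. The function name `find_faulty_groups` and the labels "duplicate" and "gap" are made up for illustration, and it assumes a "gap" means that the distinct sortsequence values within one (group_id, source_id) pair are not consecutive integers.

```python
from collections import defaultdict

# Sample rows from the table above:
# (id, group_id, source_id, target_id, sortsequence)
ROWS = [
    (2, 1, 2, 4, 1),
    (4, 1, 20, 2, 1),
    (5, 1, 2, 14, 1),
    (7, 1, 2, 7, 3),
    (20, 2, 20, 4, 3),
    (21, 2, 20, 4, 1),
]


def find_faulty_groups(rows):
    """Return {(group_id, source_id): set of problems} for groups that break a rule."""
    # Collect the sortsequence values of every (group_id, source_id) pair.
    groups = defaultdict(list)
    for _id, group_id, source_id, _target_id, sortsequence in rows:
        groups[(group_id, source_id)].append(sortsequence)

    faulty = {}
    for key, seqs in groups.items():
        problems = set()
        distinct = set(seqs)
        # Rule 1: sortsequence must be unique within the pair.
        if len(distinct) != len(seqs):
            problems.add("duplicate")
        # Rule 2 (assumed meaning): the distinct values must be consecutive.
        if max(distinct) - min(distinct) + 1 != len(distinct):
            problems.add("gap")
        if problems:
            faulty[key] = problems
    return faulty
```

On the sample rows this flags the pair group_id = 1, source_id = 2 (a duplicate, and also a gap because 2 is missing between 1 and 3) and the pair group_id = 2, source_id = 20 (a gap only).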

My effort

I wrote a query:

SELECT source_id, `group_id`, GROUP_CONCAT(id) AS children
FROM
    `table`  -- placeholder name; TABLE is a reserved word in MySQL, so it is quoted here
GROUP BY source_id,
  sortsequence,
  `group_id`
HAVING COUNT(*) > 1

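This query only solves scenario 1. How can I handle scenario 2? Is there a way to do it in the same query, or do I have to write another one to handle the second scenario?

By the way, the query will be dealing with millions of records in the table, so performance must be very good.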

Awa*_*rni 1

Got the answer from Tere J's comment. The following query covers both of the conditions above.

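 SELECT
     source_id, `group_id`, GROUP_CONCAT(id) AS faultyIDS
 FROM
     `table`  -- placeholder name; TABLE is a reserved word in MySQL, so it is quoted here
 GROUP BY
     source_id, group_id
 HAVING
     COUNT(DISTINCT sortsequence) <> COUNT(sortsequence)  -- duplicate values in the group
     OR COUNT(sortsequence) <> MAX(sortsequence)          -- a value is missing below the maximum
     OR MIN(sortsequence) <> 1                            -- the sequence does not start at 1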
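The HAVING conditions can be checked against the sample rows from the question. The sketch below is only a quick sanity check, not a MySQL benchmark: it assumes an in-memory SQLite database through Python's `sqlite3` module instead of MySQL, and the table name `items` and the function name `faulty_groups` are made up for illustration. The three conditions themselves are copied from the query unchanged.

```python
import sqlite3

# Sample rows from the question:
# (id, group_id, source_id, target_id, sortsequence)
ROWS = [
    (2, 1, 2, 4, 1),
    (4, 1, 20, 2, 1),
    (5, 1, 2, 14, 1),
    (7, 1, 2, 7, 3),
    (20, 2, 20, 4, 3),
    (21, 2, 20, 4, 1),
]


def faulty_groups(rows):
    """Run the HAVING-based check and return (source_id, group_id, ids) tuples."""
    conn = sqlite3.connect(":memory:")
    # "items" is a hypothetical table name standing in for the real table.
    conn.execute(
        "CREATE TABLE items ("
        "id INTEGER PRIMARY KEY, group_id INTEGER, source_id INTEGER, "
        "target_id INTEGER, sortsequence INTEGER)"
    )
    conn.executemany("INSERT INTO items VALUES (?, ?, ?, ?, ?)", rows)
    query = """
        SELECT source_id, group_id, GROUP_CONCAT(id) AS faultyIDS
        FROM items
        GROUP BY source_id, group_id
        HAVING COUNT(DISTINCT sortsequence) <> COUNT(sortsequence)
            OR COUNT(sortsequence) <> MAX(sortsequence)
            OR MIN(sortsequence) <> 1
        ORDER BY group_id, source_id
    """
    result = conn.execute(query).fetchall()
    conn.close()
    return result


if __name__ == "__main__":
    for source_id, group_id, ids in faulty_groups(ROWS):
        print(source_id, group_id, ids)
```

On the sample rows, the pairs source_id = 2, group_id = 1 and source_id = 20, group_id = 2 are reported, while source_id = 20, group_id = 1 is not. Note that the query returns every id in a faulty group rather than only the offending rows, and that the MIN(sortsequence) <> 1 condition treats a sequence that does not start at 1 as faulty.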

Maybe it will help others.