Dra*_*mmy 7 sql t-sql sql-server sql-server-2008-r2
这真的是一个难题.它之前可能已被问过,但我找不到任何东西,所以我想我会分享这个问题.
我正在尝试在应用程序中实现某种负载平衡,并将问题简化为我认为应该是一个简单的TSQL练习(该应用程序主要在SQL Server域(SQL Server 2008 R2)中).
基本上我有一个有两个整数的表; 唯一的,顺序的Id和非唯一的值.该表可以包含任意数量的记录,我想生成一个数据表,其中前n个最大值被分成单独的"分组",然后第二组n个最大值被分成单独的"分组".
我在下面有一份初稿,但我相信它可以改进......
DECLARE @GroupCount INT = 5
-- Set up the test data
DECLARE @test TABLE (Id INT IDENTITY(1, 1), Value INT)
INSERT @Test (Value)
VALUES (100), (456), (121), (402), (253), (872), (765), (6529), (1029), (342), (98), (1), (0), (4), (46), (23), (456), (416), (2323), (4579)
--Order by Value descending
;WITH cte AS
(
SELECT *
,ROW_NUMBER() OVER (ORDER BY Value DESC) RowNum
FROM @Test
)
--use modulus to split into grouping
, cte2 AS
(
SELECT *
,ROW_NUMBER() OVER (PARTITION BY RowNum % @GroupCount ORDER BY RowNum DESC) Rnk
FROM cte
)
SELECT ROW_NUMBER() OVER (PARTITION BY Rnk ORDER BY Value DESC) AS 'Grouping'
,Value
,Id
FROM cte2
ORDER BY [Grouping], Value ASC
Run Code Online (Sandbox Code Playgroud)
这适用于并生成以下数据集:
Grouping, Value, Id
======== ===== ==
1 46 15
1 342 10
1 765 7
1 6529 8
2 23 16
2 253 5
2 456 2
2 4579 20
3 4 14
3 121 3
3 456 17
3 2323 19
4 1 12
4 100 1
4 416 18
4 1029 9
5 0 13
5 98 11
5 402 4
5 872 6
Run Code Online (Sandbox Code Playgroud)
返回的数据集是正确的,因为前n个最大值被拆分为单独的分组,依此类推,但每个分组中的总值在分组1中与分组5(例如)相比完全不同.
分组和SUMmed时,我们可以看到非均匀传播:
Grouping, SummedValues
======== ============
1 7682
2 5311
3 2904
4 1546
5 1372
Run Code Online (Sandbox Code Playgroud)
在尽可能少的行中,如何更好地平衡值,以便每个分组中的总值更均匀地分布?
这是有缺陷的,但对于给定的示例数据来说并不可怕。你的旅费可能会改变。
declare @groupcount int = 5;
create table t (id int identity(1, 1), value int);
insert t values
(100),(456),(121),(402),(253),(872),(765),(6529),(1029),(342)
, (98),(1),(0),(4),(46),(23),(456),(416),(2323),(4579);
;with cte as (
select *
, rn = row_number() over (order by value asc)
, pct = value/sum(value+.0) over()
, target = 1.0 / @groupcount
from t
)
, remaining as (
select id, value, rn
, grp = convert(int,(sum(value) over (order by rn)/sum(value+.0) over())*@groupCount)+1
from cte
)
select
grp = row_number() over (order by sum(value) desc)
, sumValue = sum(value)
from remaining
group by grp
Run Code Online (Sandbox Code Playgroud)
rextester 演示:http://rextester.com/UNV61100
结果:
+-----+----------+
| grp | sumValue |
+-----+----------+
| 1 | 6529 |
| 2 | 4579 |
| 3 | 3483 |
| 4 | 2323 |
| 5 | 1901 |
+-----+----------+
Run Code Online (Sandbox Code Playgroud)
declare @groupcount int = 5;
create table t (id int identity(1, 1), value int);
insert t values
(100),(456),(121),(402),(253),(872),(765),(6529),(1029),(342)
, (98),(1),(0),(4),(46),(23),(456),(416),(2323),(4579);
;with cte as (
select *
, rn = row_number() over (order by value asc)
, pct = value/tv.TotalValue
, target = 1.0 / @groupcount
from t
cross join (select TotalValue = sum(value+.0) from t) tv
)
, remaining as (
select id, value, rn
, grp = convert(int,((x.sumValueOver/TotalValue)*@groupcount)+1)
from cte
outer apply (
select sumValueOver = sum(value)
from cte i
where i.rn <= cte.rn
) x
)
select
grp = row_number() over (order by sum(value) desc)
, sumValue = sum(value)
from remaining
group by grp
Run Code Online (Sandbox Code Playgroud)
rextester 演示:http://rextester.com/DEUDJ77007
返回:
+-----+----------+
| grp | sumValue |
+-----+----------+
| 1 | 6529 |
| 2 | 4579 |
| 3 | 3483 |
| 4 | 2323 |
| 5 | 1901 |
+-----+----------+
Run Code Online (Sandbox Code Playgroud)