Leaderboard design using SQL Server

Mar*_*tin 10 sql database sql-server database-design azure-sql-database

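I am building a leaderboard for some of my online games. Here is what I need to do with the data:

  • Get the rank of a player for a given game across multiple time frames (today, last week, all time, etc.)
  • Get paginated rankings (e.g. top scores for the last 24 hours, players ranked between 25 and 50, the rank of a single user)

I am using the table and index definitions below, and I have a few questions.

Considering my scenario, do I have a good primary key? The only reason I have a clustered key across gameId, playerName and score is that I want all data for a given game to be stored in the same area and already sorted by score. Most of the time I will display the data in descending score order (+ updatedDateTime for ties) for a given gameId. Is this the right strategy? In other words, I want to make sure I can run my queries to get my players' ranks as fast as possible.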

CREATE TABLE score (
    [gameId]            [smallint] NOT NULL,
    [playerName]        [nvarchar](50) NOT NULL,
    [score]             [int] NOT NULL,
    [createdDateTime]   [datetime2](3) NOT NULL,
    [updatedDateTime]   [datetime2](3) NOT NULL,
PRIMARY KEY CLUSTERED ([gameId] ASC, [playerName] ASC, [score] DESC, [updatedDateTime] ASC)
)

CREATE NONCLUSTERED INDEX [Score_Idx] ON score ([gameId] ASC, [score] DESC, [updatedDateTime] ASC) INCLUDE ([playerName])

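Below is the first iteration of the query I will use to get my players' ranks. However, I was a bit disappointed by the execution plan (see below). Why does SQL Server need to sort? The extra sort seems to come from the ranking function. But isn't my data already sorted in descending order (based on the clustered key of the score table)? I am also wondering whether I should normalize my table a bit more and move the PlayerName column out into a Player table. I originally decided to keep everything in the same table to minimize the number of joins.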

DECLARE @GameId AS INT = 0
DECLARE @From AS DATETIME2(3) = '2013-10-01'

SELECT DENSE_RANK() OVER (ORDER BY Score DESC), s.PlayerName, s.Score, s.CountryCode, s.updatedDateTime
FROM [mrgleaderboard].[score] s
WHERE s.GameId = @GameId 
  AND (s.UpdatedDateTime >= @From OR @From IS NULL)

[Image: execution plan]

Thanks for your help!

Ale*_*exK 7

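[Update]

The primary key is not good

You have a unique entity, which is [GameID] + [PlayerName]. The composite clustered index is > 120 bytes, with an nvarchar column. Look for @marc_s's answer in the related topic "SQL Server - Clustered index design for dictionary".

Your table schema does not match your time period requirements

Example: I earn 300 points on Wednesday and this score is stored on the leaderboard. The next day I earn 250 points, but that score is not recorded on the leaderboard, so a query for that next day's leaderboard returns no result for me.

For complete information you can read game scores from a history table, but that can be very expensive.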

CREATE TABLE GameLog (
  [id]                int NOT NULL IDENTITY
                      CONSTRAINT [PK_GameLog] PRIMARY KEY CLUSTERED,
  [gameId]            smallint NOT NULL,
  [playerId]          int NOT NULL,
  [score]             int NOT NULL,
  [createdDateTime]   datetime2(3) NOT NULL)
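
As a rough sketch of why this is expensive, a leaderboard page computed directly from GameLog might look like the query below. It assumes the ranking uses each player's best score within the period; @GameId, @From and the page offsets are illustrative values only, and OFFSET/FETCH needs SQL Server 2012 or later.

```sql
-- Illustrative parameters (assumptions, not from the original post)
DECLARE @GameId smallint = 1;
DECLARE @From datetime2(3) = '2013-11-18';

-- Best score per player in the period, then rank and page the result.
-- Every call has to aggregate and sort all qualifying history rows.
WITH Best AS (
    SELECT playerId, MAX(score) AS bestScore
    FROM GameLog
    WHERE gameId = @GameId
      AND createdDateTime >= @From
    GROUP BY playerId
)
SELECT DENSE_RANK() OVER (ORDER BY bestScore DESC) AS rnk,
       playerId,
       bestScore
FROM Best
ORDER BY rnk, playerId            -- playerId makes paging deterministic on ties
OFFSET 25 ROWS FETCH NEXT 25 ROWS ONLY;
```

The GROUP BY and the ranking both run over the whole period on every request, which is what the solutions below try to avoid.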

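Here are some solutions that speed up the aggregation:

  • Indexed views on the history table (see the post by @Twinkles).

You need 3 indexed views for the 3 time periods. That means a potentially huge history table plus 3 indexed views. You cannot remove "old" periods from the table. Saving scores becomes a performance issue.

  • Asynchronous leaderboard

Scores are saved in a history table. A SQL job or "worker" (or several) sorts the history table on a schedule (once per minute?) and populates the leaderboard table (3 tables for the 3 time periods, or one table with a time period key) with the precomputed rank of each user. This table can also be denormalized (holding score, datetime, player name and ...). Pros: fast reads (no sorting needed), fast score saves, any time period, flexible logic and flexible schedules. Cons: a user who has just finished a game does not appear on the leaderboard immediately.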
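
A minimal sketch of what such a job might run for the daily period, assuming the GameLog table above. The LeaderboardSnapshot table and its columns are hypothetical names introduced here for illustration, and the sketch assumes createdDateTime is stored in UTC.

```sql
-- Hypothetical precomputed table (name and layout are assumptions)
CREATE TABLE LeaderboardSnapshot (
    gameId      smallint NOT NULL,
    timePeriod  tinyint  NOT NULL,   -- same coding idea as below: 3 = daily
    rnk         int      NOT NULL,
    playerId    int      NOT NULL,
    score       int      NOT NULL,
    CONSTRAINT PK_LeaderboardSnapshot
        PRIMARY KEY CLUSTERED (gameId, timePeriod, rnk, playerId)
);
GO

-- Body of the scheduled job: rebuild today's ranking for all games
DECLARE @From datetime2(3) = CAST(SYSUTCDATETIME() AS date);

BEGIN TRANSACTION;

DELETE FROM LeaderboardSnapshot WHERE timePeriod = 3;

INSERT INTO LeaderboardSnapshot (gameId, timePeriod, rnk, playerId, score)
SELECT gameId,
       3,
       DENSE_RANK() OVER (PARTITION BY gameId ORDER BY MAX(score) DESC),
       playerId,
       MAX(score)
FROM GameLog
WHERE createdDateTime >= @From
GROUP BY gameId, playerId;

COMMIT TRANSACTION;
```

Reading a page is then a range seek on (gameId, timePeriod, rnk) with no sort at query time; the price is that results are only as fresh as the last job run.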

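  • Pre-aggregated leaderboard

The results of a game session are preprocessed while they are being recorded. In your case that is something like UPDATE [Leaderboard] SET score = @CurrentScore WHERE @CurrentScore > MAX (score) AND ... for the player/game ID, but you did it only for the "all time" leaderboard. The schema might look like this: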

CREATE TABLE [Leaderboard] (
    [id]                int NOT NULL IDENTITY
                             CONSTRAINT [PK_Leaderboard] PRIMARY KEY CLUSTERED,
    [gameId]            smallint NOT NULL,
    [playerId]          int NOT NULL,
    [timePeriod]        tinyint NOT NULL,   -- 0 -all time, 1-monthly, 2 -weekly, 3 -daily
    [timePeriodFrom]    date NOT NULL,  -- '1900-01-01' for all time, '2013-11-01' for monthly, etc.
    [score]             int NOT NULL,
    [createdDateTime]   datetime2(3) NOT NULL
    )
playerId    timePeriod  timePeriodFrom  Score
----------------------------------------------
1           0           1900-01-01      300  
...
1           1           2013-10-01      150
1           1           2013-11-01      300
...
1           2           2013-10-07      150
1           2           2013-11-18      300
...
1           3           2013-11-19      300
1           3           2013-11-20      250
...

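So you have to update the score for every time period. Also, as you can see, the leaderboard will contain "old" periods, such as the monthly row for October. You may have to delete them if you do not need these statistics. Pros: no history table is needed. Cons: a complicated procedure for storing results; the leaderboard needs maintenance; queries require sorting and a JOIN.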
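
To make the "complicated procedure" more concrete, here is one possible sketch of saving a single game result into the [Leaderboard] table defined above. The parameter values, the @Periods table variable, and the choice of Monday as the start of the week are assumptions for illustration; dates are taken in UTC.

```sql
-- Illustrative input (assumptions, not from the original post)
DECLARE @GameId smallint = 1, @PlayerId int = 1, @CurrentScore int = 300;
DECLARE @Today date = CAST(SYSUTCDATETIME() AS date);

-- Start date of every period the new score belongs to
DECLARE @Periods TABLE (timePeriod tinyint NOT NULL, timePeriodFrom date NOT NULL);
INSERT INTO @Periods (timePeriod, timePeriodFrom) VALUES
    (0, '19000101'),                                    -- all time
    (1, DATEADD(DAY, 1 - DAY(@Today), @Today)),         -- first day of the month
    (2, DATEADD(DAY, -(DATEDIFF(DAY, '19000101', @Today) % 7), @Today)),
                                                        -- Monday (1900-01-01 was a Monday)
    (3, @Today);                                        -- today

BEGIN TRANSACTION;

-- Raise existing rows only when the new score is better
UPDATE lb
SET    lb.score = @CurrentScore,
       lb.createdDateTime = SYSUTCDATETIME()
FROM   [Leaderboard] AS lb
       INNER JOIN @Periods AS p
           ON p.timePeriod = lb.timePeriod
          AND p.timePeriodFrom = lb.timePeriodFrom
WHERE  lb.gameId = @GameId
  AND  lb.playerId = @PlayerId
  AND  lb.score < @CurrentScore;

-- Create rows for periods where the player has no entry yet
INSERT INTO [Leaderboard] (gameId, playerId, timePeriod, timePeriodFrom, score, createdDateTime)
SELECT @GameId, @PlayerId, p.timePeriod, p.timePeriodFrom, @CurrentScore, SYSUTCDATETIME()
FROM   @Periods AS p
WHERE  NOT EXISTS (SELECT 1
                   FROM [Leaderboard] AS lb
                   WHERE lb.gameId = @GameId
                     AND lb.playerId = @PlayerId
                     AND lb.timePeriod = p.timePeriod
                     AND lb.timePeriodFrom = p.timePeriodFrom);

COMMIT TRANSACTION;
```

If the same player can save scores concurrently, this pattern would likely need extra locking hints or retry logic on top of a unique index over (gameId, playerId, timePeriod, timePeriodFrom); that part is left out of the sketch.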

CREATE TABLE [Player] (
    [id]    int NOT NULL IDENTITY CONSTRAINT [PK_Player] PRIMARY KEY CLUSTERED,
    [playerName]        nvarchar(50) NOT NULL CONSTRAINT [UQ_Player_playerName] UNIQUE NONCLUSTERED)

CREATE TABLE [Leaderboard] (
    [id]                int NOT NULL IDENTITY CONSTRAINT [PK_Leaderboard] PRIMARY KEY CLUSTERED,
    [gameId]            smallint NOT NULL,
    [playerId]          int NOT NULL,
    [timePeriod]        tinyint NOT NULL,   -- 0 -all time, 1-monthly, 2 -weekly, 3 -daily
    [timePeriodFrom]    date NOT NULL,  -- '1900-01-01' for all time, '2013-11-01' for monthly, etc.
    [score]             int NOT NULL,
    [createdDateTime]   datetime2(3) 
)

CREATE UNIQUE NONCLUSTERED INDEX [UQ_Leaderboard_gameId_playerId_timePeriod_timePeriodFrom] ON [Leaderboard] ([gameId] ASC, [playerId] ASC, [timePeriod]  ASC,  [timePeriodFrom] ASC)
CREATE NONCLUSTERED INDEX [IX_Leaderboard_gameId_timePeriod_timePeriodFrom_Score] ON [Leaderboard] ([gameId] ASC, [timePeriod]  ASC,  [timePeriodFrom] ASC, [score] ASC)
GO

-- Generate test data
-- Generate 500K unique players
;WITH digits (d) AS (SELECT 1 UNION SELECT 2 UNION SELECT 3 UNION
   SELECT 4 UNION SELECT 5 UNION SELECT 6 UNION SELECT 7 UNION SELECT 8 UNION SELECT 9 UNION SELECT 0)

INSERT INTO Player (playerName)
SELECT TOP (500000) LEFT(CAST(NEWID() as nvarchar(50)), 20 + (ABS(CHECKSUM(NEWID())) & 15)) as Name
FROM   digits CROSS JOIN digits ii CROSS  JOIN digits iii CROSS  JOIN digits iv CROSS  JOIN digits v CROSS  JOIN digits vi

-- Random score 500K players * 4 games = 2M rows
INSERT INTO [Leaderboard] (
    [gameId],[playerId],[timePeriod],[timePeriodFrom],[score],[createdDateTime])
SELECT  GameID, Player.id,ABS(CHECKSUM(NEWID())) & 3 as [timePeriod], DATEADD(MILLISECOND, CHECKSUM(NEWID()),GETDATE()) as Updated, ABS(CHECKSUM(NEWID())) & 65535 as score
    , DATEADD(MILLISECOND, CHECKSUM(NEWID()),GETDATE()) as Created
FROM (  SELECT 1 as GameID  UNION ALL SELECT 2  UNION ALL SELECT 3  UNION ALL SELECT 4) as Game
    CROSS JOIN Player
ORDER BY NEWID()
UPDATE [Leaderboard] SET [timePeriodFrom]='19000101' WHERE [timePeriod] = 0
GO

DECLARE @From date = '19000101'--'20131108'
    ,@GameID int = 3
    ,@timePeriod tinyint = 0

-- Get paginated ranking 
;With Lb as (
SELECT 
    DENSE_RANK() OVER (ORDER BY Score DESC) as Rnk
    ,Score, createdDateTime, playerId
FROM [Leaderboard]
WHERE GameId = @GameId
  AND [timePeriod] = @timePeriod
  AND [timePeriodFrom] = @From)

SELECT lb.rnk,lb.Score, lb.createdDateTime, lb.playerId, Player.playerName
FROM Lb INNER JOIN Player ON lb.playerId = Player.id
ORDER BY rnk OFFSET 75 ROWS FETCH NEXT 25 ROWS ONLY;

-- Get rank of a player for a given game 
SELECT (SELECT COUNT(DISTINCT rnk.score) 
        FROM [Leaderboard] as rnk 
        WHERE rnk.GameId = @GameId 
            AND rnk.[timePeriod] = @timePeriod
            AND rnk.[timePeriodFrom] = @From
            AND rnk.score >= [Leaderboard].score) as rnk
    ,[Leaderboard].Score, [Leaderboard].createdDateTime, [Leaderboard].playerId, Player.playerName
FROM [Leaderboard]  INNER JOIN Player ON [Leaderboard].playerId = Player.id
where [Leaderboard].GameId = @GameId
    AND [Leaderboard].[timePeriod] = @timePeriod
    AND [Leaderboard].[timePeriodFrom] = @From
    and Player.playerName = N'785DDBBB-3000-4730-B'
GO

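This is only an example to present the idea. It can be optimized, for example by combining the GameID, TimePeriod and TimePeriodDate columns into a single column through a dictionary table. The index will then be more efficient.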
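
One way that dictionary-table idea might look is sketched below. The table names LeaderboardPeriod and LeaderboardCompact, and all constraint and index names, are hypothetical and only illustrate the shape of the change.

```sql
-- Dictionary of (game, period type, period start) combinations
CREATE TABLE [LeaderboardPeriod] (
    [id]              int NOT NULL IDENTITY
                          CONSTRAINT [PK_LeaderboardPeriod] PRIMARY KEY CLUSTERED,
    [gameId]          smallint NOT NULL,
    [timePeriod]      tinyint  NOT NULL,
    [timePeriodFrom]  date     NOT NULL,
    CONSTRAINT [UQ_LeaderboardPeriod]
        UNIQUE NONCLUSTERED ([gameId], [timePeriod], [timePeriodFrom])
);

-- Leaderboard rows keyed by a single 4-byte period id
CREATE TABLE [LeaderboardCompact] (
    [periodId]         int NOT NULL
                           CONSTRAINT [FK_LeaderboardCompact_Period]
                           REFERENCES [LeaderboardPeriod] ([id]),
    [playerId]         int NOT NULL,
    [score]            int NOT NULL,
    [createdDateTime]  datetime2(3) NOT NULL,
    CONSTRAINT [PK_LeaderboardCompact] PRIMARY KEY CLUSTERED ([periodId], [playerId])
);

-- Ranking index: one int key column instead of three columns before score
CREATE NONCLUSTERED INDEX [IX_LeaderboardCompact_periodId_score]
    ON [LeaderboardCompact] ([periodId] ASC, [score] DESC);
```

A ranking query would first look up the period id once and then filter on periodId alone, so the index key in front of score shrinks from smallint + tinyint + date to a single int.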

P.S. Sorry for my English. Feel free to fix grammar or spelling errors.