BigQuery: weighted average

sha*_*ama 3 google-bigquery

Table:

| User_ID |  Red | Blue | Green |  Rating |
|   a     |   23 |  33  |   42  |    99   |
|   a     |   56 |  45  |   62  |    45   |
|   a     |   23 |  49  |   28  |    67   |
|   b     |   39 |  59  |   10  |    87   |
|   b     |   18 |  28  |   59  |    38   |
|   b     |   40 |  50  |   38  |    94   |

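The result I want is one row per distinct User_ID, holding the weighted averages of Red, Blue, and Green, with the Rating column as the weight.

Color * Rating / (sum of Rating for a or b)

// Edit

I don't know how to do this. I tried the following, but it was a futile attempt: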

   WITH
      averages AS (
      SELECT
        User_ID,
        SUM(rating) AS average
      FROM
        `project.dataset.table`
      GROUP BY
        1)
    SELECT
      averages.User_ID,
      Red*(Rating/average),
      Blue*(rating/average),
      Green*(rating/average)
    FROM
      `project.dataset.table` a
    LEFT JOIN
      averages
    ON
      a.user_id = averages.user_id 

Mar*_*ann 10

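I see. This is more of a math question. You multiply each value by its weight and then, instead of dividing by the count, you divide by the sum of the weights, all of it per group (user ID). You could try something like SELECT SUM(x * weight) / SUM(weight) FROM table GROUP BY ...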
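Written out as a formula, this might look as follows. The symbols are my own shorthand, not from the post: x_i is a color value in row i and w_i is that row's Rating, with the sums running over the rows of one user. The second line plugs in the Red values for user a from the question's table.

```latex
\bar{x}_w = \frac{\sum_i w_i \, x_i}{\sum_i w_i}

\bar{x}_w^{(\text{Red},\,a)}
  = \frac{23 \cdot 99 + 56 \cdot 45 + 23 \cdot 67}{99 + 45 + 67}
  = \frac{6338}{211}
  \approx 30.04
```

For comparison, the unweighted average of the same three Red values is (23 + 56 + 23) / 3 = 34.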

WITH t AS (SELECT * FROM 
  UNNEST([
    STRUCT('a' AS userID, 23 AS red, 99 AS weight),
    STRUCT('a' AS userID, 56 AS red, 45 AS weight),
    STRUCT('a' AS userID, 23 AS red, 67 AS weight),
    STRUCT('b' AS userID, 39 AS red, 87 AS weight),
    STRUCT('b' AS userID, 18 AS red, 38 AS weight),
    STRUCT('b' AS userID, 40 AS red, 94 AS weight)
  ])
  )

SELECT
  userID,
  SUM(red*weight) / SUM(weight) weightedAvg,
  AVG(red) normalAvg
FROM
  t
GROUP BY
  userID
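The same pattern should extend to all three color columns from the question. The following is only a sketch: the sample rows are copied from the question's table, and the output column names (weighted_red and so on) are made up for illustration.

```sql
-- Sample rows copied from the question; in practice, drop this CTE and
-- select from the real table (e.g. `project.dataset.table`) instead.
WITH t AS (
  SELECT * FROM UNNEST([
    STRUCT('a' AS User_ID, 23 AS Red, 33 AS Blue, 42 AS Green, 99 AS Rating),
    STRUCT('a' AS User_ID, 56 AS Red, 45 AS Blue, 62 AS Green, 45 AS Rating),
    STRUCT('a' AS User_ID, 23 AS Red, 49 AS Blue, 28 AS Green, 67 AS Rating),
    STRUCT('b' AS User_ID, 39 AS Red, 59 AS Blue, 10 AS Green, 87 AS Rating),
    STRUCT('b' AS User_ID, 18 AS Red, 28 AS Blue, 59 AS Green, 38 AS Rating),
    STRUCT('b' AS User_ID, 40 AS Red, 50 AS Blue, 38 AS Green, 94 AS Rating)
  ])
)

SELECT
  User_ID,
  -- Weighted average per user: sum of value * weight over sum of weights.
  -- The output names below are hypothetical.
  SUM(Red * Rating) / SUM(Rating) AS weighted_red,
  SUM(Blue * Rating) / SUM(Rating) AS weighted_blue,
  SUM(Green * Rating) / SUM(Rating) AS weighted_green
FROM
  t
GROUP BY
  User_ID
ORDER BY
  User_ID
```

If a user's ratings can sum to zero, the plain division raises an error. Wrapping each expression as SAFE_DIVIDE(SUM(Red * Rating), SUM(Rating)) returns NULL for that user instead.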

Hope that helps!