Prz*_*min 4 m powerquery powerbi
How to select the rows with the maximum value per category in M in Power BI. Suppose we have this table:
+----------+-------+------------+
| Category | Value | Date |
+----------+-------+------------+
| apples | 1 | 2018-07-01 |
| apples | 2 | 2018-07-02 |
| apples | 3 | 2018-07-03 |
| bananas | 7 | 2018-07-04 |
| bananas | 8 | 2018-07-05 |
| bananas | 9 | 2018-07-06 |
+----------+-------+------------+
The desired result is:
+----------+-------+------------+
| Category | Value | Date |
+----------+-------+------------+
| apples | 3 | 2018-07-03 |
| bananas | 9 | 2018-07-06 |
+----------+-------+------------+
Here is the starting table for PBI:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Category", type text}, {"Value", Int64.Type}, {"Date", type date}})
in
#"Changed Type"
I would like to know whether there is a way to get the desired result in subsequent steps within a single table, by adding some magic column IsMax:
+----------+-------+------------+-------+
| Category | Value | Date | IsMax |
+----------+-------+------------+-------+
| apples | 1 | 2018-07-01 | 0 |
| apples | 2 | 2018-07-02 | 0 |
| apples | 3 | 2018-07-03 | 1 |
| bananas | 7 | 2018-07-04 | 0 |
| bananas | 8 | 2018-07-05 | 0 |
| bananas | 9 | 2018-07-06 | 1 |
+----------+-------+------------+-------+
Doing a basic Group By in the Power Query Editor (group by Category and take the max over Value) gets you this table:
+----------+-------+
| Category | Value |
+----------+-------+
| apples | 3 |
| bananas | 9 |
+----------+-------+
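Add a custom column IsMax to this table with the constant value 1, then merge it (left outer join) with the original table, matching on both Category and Value. Finally, expand the IsMax column to get the desired table, except with null in place of 0. You can replace those null values if you choose.
Here is the M code for all of these steps: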
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
#"Changed Type" = Table.TransformColumnTypes(Source,{{"Value", Int64.Type}, {"Date", type date}, {"Category", type text}}),
#"Grouped Rows" = Table.Group(#"Changed Type", {"Category"}, {{"Value", each List.Max([Value]), Int64.Type}}),
#"Added Custom" = Table.AddColumn(#"Grouped Rows", "IsMax", each 1, Int64.Type),
#"Merged Queries" = Table.NestedJoin(#"Changed Type",{"Category", "Value"},#"Added Custom",{"Category", "Value"},"Added Custom",JoinKind.LeftOuter),
#"Expanded Added Custom" = Table.ExpandTableColumn(#"Merged Queries", "Added Custom", {"IsMax"}, {"IsMax"})
in
#"Expanded Added Custom"
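The null values left behind by the left outer join can be turned into 0 with one more step. A minimal sketch, assuming a small inline stand-in table; the step name Replaced and the sample rows are illustrative and not part of the original query:

```
let
    // Hypothetical stand-in for the output of the expand step: rows without a match carry null in IsMax.
    Source = #table(
        {"Category", "Value", "IsMax"},
        {
            {"apples", 2, null},
            {"apples", 3, 1},
            {"bananas", 8, null},
            {"bananas", 9, 1}
        }
    ),
    // Replace null with 0, looking only at the IsMax column.
    Replaced = Table.ReplaceValue(Source, null, 0, Replacer.ReplaceValue, {"IsMax"})
in
    Replaced
```

In the query above, the same Table.ReplaceValue call would take #"Expanded Added Custom" as its first argument instead of the stand-in table.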
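I ended up getting the MAX per category by way of an index. The idea is described here: https://stackoverflow.com/a/51498237/1903793
Approach #1 is one line of code in an R transformation. It numbers the rows within each category by Date, newest first, so the rows with row_no_by_category equal to 1 are the ones to keep: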
library(dplyr)
output <- dataset %>% group_by(Category) %>% mutate(row_no_by_category = row_number(desc(Date)))
Approach #2, done entirely in PBI:
let
Source = Table.FromRows(Json.Document(Binary.Decompress(Binary.FromText("i45WSiwoyEktVtJRMgRiIwNDC10Dc10DQ6VYHSQ5I2Q5I1Q5Y2Q5Y7BcUmIeEIIkzZElTdAkLZAlTdEkLZElzZRiYwE=", BinaryEncoding.Base64), Compression.Deflate)), let _t = ((type text) meta [Serialized.Text = true]) in type table [Category = _t, Value = _t, Date = _t]),
#"Grouped rows" = Table.Group(Source, {"Category"}, {{"NiceTable", each Table.AddIndexColumn(Table.Sort(_,{{"Date", Order.Descending}} ), "Index",1,1), type table}} ),
#"Expanded NiceTable" = Table.ExpandTableColumn(#"Grouped rows", "NiceTable", {"Value", "Date", "Index"}, {"Value", "Date", "Index"}),
#"Filtered Rows" = Table.SelectRows(#"Expanded NiceTable", each ([Index] = 1))
in
#"Filtered Rows"
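The same index idea can also produce the IsMax flag asked about in the question, rather than filtering rows out. This is only a sketch under stated assumptions: it ranks by Value instead of Date, it uses a small inline table in place of the compressed source, and the step names (Grouped, Ranked, Flagged, Result) are illustrative.

```
let
    // Inline stand-in for the question's table, with the columns already typed.
    Source = #table(
        type table [Category = text, Value = Int64.Type, Date = date],
        {
            {"apples", 1, #date(2018, 7, 1)},
            {"apples", 2, #date(2018, 7, 2)},
            {"apples", 3, #date(2018, 7, 3)},
            {"bananas", 7, #date(2018, 7, 4)},
            {"bananas", 8, #date(2018, 7, 5)},
            {"bananas", 9, #date(2018, 7, 6)}
        }
    ),
    // Within each category, sort by Value (largest first) and number the rows from 1.
    Grouped = Table.Group(
        Source,
        {"Category"},
        {{"Ranked", each Table.AddIndexColumn(Table.Sort(_, {{"Value", Order.Descending}}), "Index", 1, 1), type table}}
    ),
    Expanded = Table.ExpandTableColumn(Grouped, "Ranked", {"Value", "Date", "Index"}),
    // Flag the top-ranked row of each category with 1 and every other row with 0.
    Flagged = Table.AddColumn(Expanded, "IsMax", each if [Index] = 1 then 1 else 0, Int64.Type),
    // The helper index is no longer needed once the flag exists.
    Result = Table.RemoveColumns(Flagged, {"Index"})
in
    Result
```

Two things to keep in mind with this variant: the rows come back ordered by descending Value within each category rather than in their original order, and if two rows tie for the maximum only one of them is flagged, whereas the merge-based approach flags every tied row.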