Ste*_*mer 2 python matplotlib pandas
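I have a DataFrame made up of N columns of values for M dates.
I want to plot a stacked bar chart of the 3 largest values for each date.
Test DataFrame: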
import pandas
import numpy

data = {
    'A': [ 65, 54, 12, 14, 30, numpy.nan ],
    'B': [ 54, 47, 60, 34, 40, 35 ],
    'C': [ 34, 39, 57, 56, 48, numpy.nan ],
    'D': [ 20, 18, 47, 47, 35, 70 ]
}
df = pandas.DataFrame(index=pandas.date_range('2018-01-01', '2018-01-06').date,
                      data=data,
                      dtype=numpy.float64)
               A     B     C     D
2018-01-01  65.0  54.0  34.0  20.0
2018-01-02  54.0  47.0  39.0  18.0
2018-01-03  12.0  60.0  57.0  47.0
2018-01-04  14.0  34.0  56.0  47.0
2018-01-05  30.0  40.0  48.0  35.0
2018-01-06   NaN  35.0   NaN  70.0
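Extracting the 3 largest values from each row:
I found that nlargest can be used to extract the 3 largest columns of each row along with their values: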
for date, row in df.iterrows():
    top = row.nlargest(3)
    # Series.iteritems() was removed in pandas 2.0; items() is the replacement
    s = [f'{c}={v}' for c, v in top.items()]
    print('{}: [ {} ]'.format(date, ', '.join(s)))
2018-01-01: [ A=65.0, B=54.0, C=34.0 ]
2018-01-02: [ A=54.0, B=47.0, C=39.0 ]
2018-01-03: [ B=60.0, C=57.0, D=47.0 ]
2018-01-04: [ C=56.0, D=47.0, B=34.0 ]
2018-01-05: [ C=48.0, B=40.0, D=35.0 ]
2018-01-06: [ D=70.0, B=35.0 ]
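Plotting the data in a stacked bar chart:
The last step, taking the data above and plotting it as a stacked bar chart that looks like the example below, is where I have had no success.
I'm not even sure that nlargest is the best approach.
Desired output:
Question:
How do I create a stacked bar chart of the N largest columns for each row of a DataFrame?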
Starting from your input df:
top3_by_date = (
    # bring the date back as a column to use as a grouping var
    df.reset_index()
    # make a long DF of date / column name / value
    .melt(id_vars='index')
    # order DF by highest values first
    .sort_values('value', ascending=False)
    # group by the index and take the first 3 rows of each
    .groupby('index')
    .head(3)
    # pivot back so we've got an X & Y to chart
    # (pandas 2.0 and later require keyword arguments here)
    .pivot(index='index', columns='variable')
    # drop the value level as we don't need that
    .droplevel(level=0, axis=1)
)
This gives:
variable A B C D
index
2018-01-01 65.0 54.0 34.0 NaN
2018-01-02 54.0 47.0 39.0 NaN
2018-01-03 NaN 60.0 57.0 47.0
2018-01-04 NaN 34.0 56.0 47.0
2018-01-05 NaN 40.0 48.0 35.0
2018-01-06 NaN 35.0 NaN 70.0
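If you would rather stay with nlargest from the question, a minimal sketch of an equivalent approach is to apply it row by row. pandas aligns the per-row results into one wide frame, leaving NaN wherever a column is not in that row's top 3. The helper name top_n_per_row is made up for illustration and is not part of pandas.

```python
import numpy as np
import pandas as pd


def top_n_per_row(frame: pd.DataFrame, n: int = 3) -> pd.DataFrame:
    """Keep the n largest values in each row; everything else becomes NaN.

    Hypothetical helper, written for this example only.
    """
    # nlargest skips NaN, so a row with fewer than n valid values keeps what it has
    result = frame.apply(lambda row: row.nlargest(n), axis=1)
    # restore the original column order, including columns never in any top n
    return result.reindex(columns=frame.columns)


# Same test data as in the question
df = pd.DataFrame(
    {
        "A": [65, 54, 12, 14, 30, np.nan],
        "B": [54, 47, 60, 34, 40, 35],
        "C": [34, 39, 57, 56, 48, np.nan],
        "D": [20, 18, 47, 47, 35, 70],
    },
    index=pd.date_range("2018-01-01", "2018-01-06").date,
    dtype=np.float64,
)

top3 = top_n_per_row(df, n=3)
print(top3)
```

This should produce the same table as the melt/pivot chain for this data, so either frame can be passed on to the plotting step.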
You can then call top3_by_date.plot.bar(stacked=True), which should give you a chart along the lines of the one you asked for.
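For completeness, here is a rough end-to-end sketch of the plotting step. It assumes matplotlib is installed, retypes the wide table shown earlier by hand so the snippet is self-contained, and uses the non-interactive Agg backend so it runs without a display. The axis labels, legend title, and figure size are my own choices, not something the question specified.

```python
import matplotlib

matplotlib.use("Agg")  # assumption: headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd

# The top-3-per-date table from above, typed in directly
top3_by_date = pd.DataFrame(
    {
        "A": [65, 54, np.nan, np.nan, np.nan, np.nan],
        "B": [54, 47, 60, 34, 40, 35],
        "C": [34, 39, 57, 56, 48, np.nan],
        "D": [np.nan, np.nan, 47, 47, 35, 70],
    },
    index=pd.date_range("2018-01-01", "2018-01-06").date,
)

# One bar per date; each column present in that date's top 3 is one segment
ax = top3_by_date.plot.bar(stacked=True, figsize=(8, 4))
ax.set_xlabel("date")      # illustrative label
ax.set_ylabel("value")     # illustrative label
ax.legend(title="column")  # illustrative legend title
plt.tight_layout()
ax.figure.canvas.draw()    # render the figure in memory
```

In an interactive session you would finish with plt.show(), or with plt.savefig(...) to write the chart to a file.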