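I built a Plotly dashboard in Python that shows several variables over time. One of these variables, called "colors" here, is what I would like to sort the resulting plot by.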
import pandas as pd
import plotly.express as px
import string
import random
import numpy as np
# for the color mapping
color_dict = {"Colors": {
    "green": "rgb(0, 255, 0)",
    "black": "rgb(0, 0, 0)",
    "red": "rgb(255, 0, 0)"
}}
# creating the df
random.seed(30)
letters = list(string.ascii_lowercase)[0:20]
data = {"letters":letters}
df = pd.DataFrame(data)
df = pd.DataFrame(np.repeat(df.values, 3, axis=0), columns=df.columns) # repeat each row 3 times
df['attempt'] = np.where(df.index%2==0, 1, 2) # alternates 1 and 2 in column "attempt"
lst = ['2022_10_10', '2022_10_11', '2022_10_12']
N = len(df)
df["date"] = pd.Series(np.tile(lst, N//len(lst))).iloc[:N] # add date column with 3 repeating dates
df["colors"] = random.choices(["green", "black", "red"], k=len(df)) # add randomly the colors
df.head()
#  letters  attempt        date  colors
#0       a        1  2022_10_10   black
#1       a        2  2022_10_11   green
#2       a        1  2022_10_12   green
#3       b        2  2022_10_10   black
#4       b        1  2022_10_11   green
# the plot
fig = px.scatter(
    df,
    x="date",
    y="letters",
    symbol="attempt",
    opacity=0.8,
    color="colors",
    color_discrete_map=color_dict["Colors"],
    width=1000,
    height=800,
)
fig.update_layout(
    yaxis={
        "type": "category",
        "showgrid": False,
    },
    xaxis={
        "type": "category",
        "showgrid": False,
    },
)
fig
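However, because the original df (I assume?) goes through some groupby etc. during plotting, my pre-plot sorting (I tried sort_values, custom sort functions, etc.) seems to have no effect. I would therefore like to create additional columns "black", "green" and "red" that hold how often black/green/red occur, e.g. in row "a".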
df["black"] = ""
df["red"] = ""
df["green"] = ""
#  letters  attempt        date  colors  black  red  green
#0       a        1  2022_10_10   black      1    0      2
#1       a        2  2022_10_11   green      1    0      2
#2       a        1  2022_10_12   green      1    0      2
#3       b        2  2022_10_10   black
#4       b        1  2022_10_11   green
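So my questions are: a) How do I get the color counts into these columns? b) How do I use the values of the "green", "red" and "black" columns to sort the order of the plot's y-axis?
Thanks!
Edit: Sorry, I know this is a complicated task. But I am only looking for a way to sort/order entire rows on the y-axis. The order within a row (e.g. row "a") must be preserved.
EDIT2: I have attached a (badly made) image of the result I am looking for:
Post-answer edit: If anyone wants to do this kind of row sorting but only considering the latest date (the last column on the x-axis), you can adapt the color_counts computation as follows: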
color_counts = (
    df[df["date"] == df["date"].max()]
    .groupby('letters')['colors']
    .value_counts()
    .unstack(fill_value=0)
)
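One thing to watch with the latest-date variant is that a color which does not occur on that date gets no column at all, so sorting by it would fail. The following is a minimal sketch on made-up toy data (the frame and the explicit color list are assumptions for illustration) that adds missing colors back as zero columns with `reindex`:

```python
import pandas as pd

# Hypothetical toy data: two letters observed on two dates
df = pd.DataFrame({
    "letters": ["a", "a", "b", "b"],
    "date": ["2022_10_11", "2022_10_12", "2022_10_11", "2022_10_12"],
    "colors": ["red", "green", "red", "black"],
})

# String max() picks the latest date only because YYYY_MM_DD sorts lexicographically
latest = df[df["date"] == df["date"].max()]

color_counts = (
    latest.groupby("letters")["colors"]
    .value_counts()
    .unstack(fill_value=0)
    # "red" never occurs on the latest date, so add it back as a zero column
    .reindex(columns=["black", "green", "red"], fill_value=0)
)
print(color_counts)
```

If the dates were stored in a format that does not sort as text, converting them with `pd.to_datetime` before taking the maximum would probably be the safer route.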
Let's add the color counts for each letter:
color_counts = (
    df.groupby('letters')['colors']
    .value_counts()
    .unstack(fill_value=0)
)
df = df.merge(color_counts, on='letters')
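As a quick check of what the groupby/value_counts/unstack chain produces, here is a minimal sketch on a hand-made six-row frame. The data and the names `merged` and `crosstab_counts` are made up for illustration and are not part of the answer; `pd.crosstab` is shown only as an alternative way to arrive at the same counts.

```python
import pandas as pd

# Hypothetical toy data: two letters, three observations each
df = pd.DataFrame({
    "letters": ["a", "a", "a", "b", "b", "b"],
    "colors": ["black", "green", "green", "red", "red", "black"],
})

# One row per letter, one column per color, zeros where a color never occurs
color_counts = (
    df.groupby("letters")["colors"]
    .value_counts()
    .unstack(fill_value=0)
)

# Alternative spelling of the same table
crosstab_counts = pd.crosstab(df["letters"], df["colors"])

# Broadcast the per-letter counts back onto every row of that letter
merged = df.merge(color_counts, on="letters")
print(merged)
```

Every row of a given letter ends up with the same three count values, which is the layout sketched in the question.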
The first 5 records of the modified data frame can be inspected with:
df.head()
To add the black, green and red counts to the plot, we have to pass them as hover_data to px.scatter:
fig = px.scatter(
    ...
    hover_data=['black','red','green'],
    ...
)
To sort the letters along the y-axis, we can use the category_orders parameter of px.scatter:
color_counts = (
    df.groupby('letters')['colors']
    .value_counts()
    .unstack(fill_value=0)
)
def sort_letters(color):
    return color_counts[color].sort_values().index
desired_color = 'black'
fig = px.scatter(
    ...
    category_orders={'letters': sort_letters(desired_color)},
    ...
)
Here we pass the exact order of the letters, i.e. how they should be positioned along the axis.
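Since the order is just a sequence of labels, it can be built and inspected without plotting anything. The sketch below is one possible variant of sort_letters, not the answer's exact function: the `ascending` parameter, the `kind="stable"` choice, the `.tolist()` conversion and the toy counts are all assumptions added for illustration.

```python
import pandas as pd

# Hypothetical per-letter counts, shaped like color_counts in the answer
color_counts = pd.DataFrame(
    {"black": [2, 0, 1, 1], "green": [1, 3, 0, 2]},
    index=pd.Index(["a", "b", "c", "d"], name="letters"),
)

def sort_letters(color, ascending=True):
    # kind="stable" keeps tied letters in their original (alphabetical) order
    ordered = color_counts[color].sort_values(ascending=ascending, kind="stable")
    # Return a plain list of labels rather than a pandas Index
    return ordered.index.tolist()

fewest_black_first = sort_letters("black")
most_black_first = sort_letters("black", ascending=False)
print(fewest_black_first, most_black_first)
```

A stable sort is worth considering here because many letters share the same count, and an unstable sort may shuffle the tied rows between runs.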
import pandas as pd
import numpy as np
import plotly.express as px
from string import ascii_lowercase
import random
# Color mapping
color_map = {
    "green": "YellowGreen",
    "black": "Black",
    "red": "Crimson"
}
# Mocking data
random.seed(30)
num_letters = 20
letters = [*ascii_lowercase][:num_letters]
dates = ['2022_10_10', '2022_10_11', '2022_10_12']
df = pd.DataFrame(np.repeat(letters, len(dates)), columns=['letters'])
df['attempt'] = df.index % 2 + 1
df['date'] = np.tile(dates, num_letters)
df['colors'] = random.choices([*color_map], k=len(df))
color_counts = (
    df.groupby('letters')['colors']
    .value_counts()
    .unstack(fill_value=0)
)
df = df.merge(color_counts, on='letters')
#display(df.sample(5))
def sort_letters(color):
    return color_counts[color].sort_values().index
# The plot
desired_color = 'black'
fig = px.scatter(
    df,
    x="date",
    y="letters",
    category_orders={'letters':sort_letters(desired_color)},
    color="colors",
    color_discrete_map=color_map,
    symbol='attempt',
    symbol_map={1:'circle-open',2:'circle'},
    hover_data=[*color_map],
    opacity=0.8,
    width=800,
    height=800,
    title=f"Sorted by {desired_color} counts"
)
fig.update_layout(
    yaxis={
        "type": "category",
        "showgrid": False,
    },
    xaxis={
        "type": "category",
        "showgrid": False,
    },
)
fig
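To sort by the color counts lexicographically (green first, then red, then black), we can use the code from the original post and add the following at the end: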
color_counts = (
    df.groupby('letters')['colors']
    .value_counts()
    .unstack(fill_value=0)
)
fig.update_yaxes(
    categoryorder="array",
    categoryarray=color_counts.sort_values(by=['green','red','black']).index,
)
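To see what the multi-column sort does with ties, here is a small sketch on hypothetical counts (the numbers and the names `lexicographic` and `by_tuples` are made up for illustration). It compares the pandas result with a plain Python sort on (green, red, black) tuples:

```python
import pandas as pd

# Hypothetical per-letter counts; "a"/"c" tie on green, as do "b"/"d"
color_counts = pd.DataFrame(
    {"green": [1, 0, 1, 0], "red": [0, 2, 1, 1], "black": [2, 1, 1, 2]},
    index=pd.Index(["a", "b", "c", "d"], name="letters"),
)

# Sort by green first; ties are broken by red, then by black
lexicographic = color_counts.sort_values(by=["green", "red", "black"]).index.tolist()

# The same order via plain Python tuple comparison, as a cross-check
by_tuples = sorted(
    color_counts.index,
    key=lambda letter: tuple(color_counts.loc[letter, ["green", "red", "black"]]),
)
print(lexicographic)
```

The resulting list is the kind of value that would be handed to categoryarray; changing the order of the names in `by` changes which color takes priority.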
Here is the output:
See also Plotly: Categorical Axes