How do I change the color of each specific bar in a bar chart using plotly express?

pat*_*998 3 python charts dataframe plotly plotly-python

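I'm using Python and plotly to build a bar chart of the mean rating for certain categories in a dataset I'm working with. The bar chart I get is almost what I want, but I'd like to change the color of each specific bar, and I can't find a clear way to do this online.

Dataset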

import pandas as pd
from pandas import Timestamp

df = pd.DataFrame({'id': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5},
              
 'overall_rating': {0: 5, 1: 4, 2: 5, 3: 5, 4: 4},
 'user_name': {0: 'member1365952',
  1: 'member465943',
  2: 'member665924',
  3: 'member865886',
  4: 'member1065873'},
 'date': {0: Timestamp('2022-02-03 00:00:00'),
  1: Timestamp('2022-02-03 00:00:00'),
  2: Timestamp('2022-02-02 00:00:00'),
  3: Timestamp('2022-02-01 00:00:00'),
  4: Timestamp('2022-02-01 00:00:00')},
 'comments': {0: 'Great campus. Library is always helpful. Sport course has been brill despite Civid challenges.',
  1: 'Average facilities and student Union. Great careers support.',
  2: 'Brilliant university, very social place with great unions.',
  3: 'Overall it was very good and the tables and chairs for discussion sessions worked very well',
  4: 'Uni is nice and most of the staff are amazing. Facilities (particularly the library) could be better'},
 'campus_facilities_rating': {0: 5, 1: 3, 2: 5, 3: 4, 4: 4},
 'clubs_societies_rating': {0: 5, 1: 3, 2: 4, 3: 4, 4: 4},
 'students_union_rating': {0: 4, 1: 3, 2: 5, 3: 5, 4: 5},
 'careers_service_rating': {0: 5, 1: 5, 2: 5, 3: 5, 4: 3},
 'wifi_rating': {0: 5, 1: 5, 2: 5, 3: 5, 4: 3}})

Code used

import plotly.express as px

# Plot to find mean rating for different categories
fig = px.bar(df, y=[df.campus_facilities_rating.mean(), df.clubs_societies_rating.mean(),
                    df.students_union_rating.mean(), df.careers_service_rating.mean(), df.wifi_rating.mean()],
                x=['Campus Facilities', 'Clubs & Societies', 'Students Union', 'Careers & Services', 'Wifi'],
                labels={
                    "y": "Mean Rating (1-5)",
                    "x": "Category"},
                title="Mean Rating For Different Student Categories")

fig.show()

Updated attempt

# Plot to find mean rating for different categories
fig = px.bar(df, y=[df.campus_facilities_rating.mean(), df.clubs_societies_rating.mean(),
                    df.students_union_rating.mean(), df.careers_service_rating.mean(), df.wifi_rating.mean()],
                x=['Campus Facilities', 'Clubs & Societies', 'Students Union', 'Careers & Services', 'Wifi'],
                labels={
                    "y": "Mean Rating (1-5)",
                    "x": "Category"},
                title="Mean Rating For Different Student Categories At The University of Lincoln",
                color_discrete_map = {
                    'Campus Facilities' : 'red',
                    'Clubs & Societies' : 'blue',
                    'Students Union' : 'pink',
                    'Careers & Services' : 'grey',
                    'Wifi' : 'orange'})

fig.update_layout(barmode = 'group')

fig.show()

The output just shows all the bars in blue.

ves*_*and 6

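In general, you can use color_discrete_map in px.bar() to specify the color of each bar, provided you have defined a category through the color argument, for example color="medal", like this: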

color_discrete_map={'gold':'yellow', 'silver':'grey', 'bronze':'brown'}

Plot:

[image]

Complete code for the general approach, with a data sample:

import plotly.express as px

long_df = px.data.medals_long()

fig = px.bar(long_df, x="nation", y="count", color="medal", title="color_discrete_map={'gold':'yellow', 'silver':'grey', 'bronze':'brown'}",
            color_discrete_map={'gold':'yellow', 'silver':'grey', 'bronze':'brown'})

fig.update_layout(barmode = 'group')

fig.show()

Edit after OP provided a data sample

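For your specific dataset and structure, you can't apply color='category' directly, because the different categories are spread across multiple columns, like this:

[image]

There is a way to achieve your goal using go.Figure() and fig.add_traces(), but since you seem most interested in px.bar(), we'll stick with plotly.express. In short, go.Figure() needs no particular data wrangling to get what you want, but setting up the figure is a bit messier. With plotly.express and px.bar, it's the other way around. Once we've made a few changes to your dataset, the following snippet is all you need to build the figure below: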

fig = px.bar(dfg, x = 'category', y = 'value',
             color = 'category',
             category_orders = {'category':['Campus Facilities','Clubs & Societies','Students Union','Careers & Services','Wifi']},
             color_discrete_map = {'Campus Facilities' : 'red',
                                    'Clubs & Societies' : 'blue',
                                    'Students Union' : 'pink',
                                    'Careers & Services' : 'grey',
                                    'Wifi' : 'orange'})

[image]

Complete code with all the data wrangling steps:

from pandas import Timestamp
import plotly.express as px
import pandas as pd
df = pd.DataFrame({'id': {0: 1, 1: 2, 2: 3, 3: 4, 4: 5},
              
 'overall_rating': {0: 5, 1: 4, 2: 5, 3: 5, 4: 4},
 'user_name': {0: 'member1365952',
  1: 'member465943',
  2: 'member665924',
  3: 'member865886',
  4: 'member1065873'},
 'date': {0: Timestamp('2022-02-03 00:00:00'),
  1: Timestamp('2022-02-03 00:00:00'),
  2: Timestamp('2022-02-02 00:00:00'),
  3: Timestamp('2022-02-01 00:00:00'),
  4: Timestamp('2022-02-01 00:00:00')},
 'comments': {0: 'Great campus. Library is always helpful. Sport course has been brill despite Civid challenges.',
  1: 'Average facilities and student Union. Great careers support.',
  2: 'Brilliant university, very social place with great unions.',
  3: 'Overall it was very good and the tables and chairs for discussion sessions worked very well',
  4: 'Uni is nice and most of the staff are amazing. Facilities (particularly the library) could be better'},
 'campus_facilities_rating': {0: 5, 1: 3, 2: 5, 3: 4, 4: 4},
 'clubs_societies_rating': {0: 5, 1: 3, 2: 4, 3: 4, 4: 4},
 'students_union_rating': {0: 4, 1: 3, 2: 5, 3: 5, 4: 5},
 'careers_service_rating': {0: 5, 1: 5, 2: 5, 3: 5, 4: 3},
 'wifi_rating': {0: 5, 1: 5, 2: 5, 3: 5, 4: 3}})

df.columns = ['id', 'overall_rating', 'user_name', 'date', 'comments', 'Campus Facilities',
              'Clubs & Societies','Students Union','Careers & Services','Wifi']

dfm = pd.melt(df, id_vars=['id', 'overall_rating', 'user_name', 'date', 'comments'],
              value_vars=list(df.columns[5:]),
              var_name ='category')

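# Average only the numeric 'value' column; text columns such as 'comments' cannot be averaged
dfg = dfm.groupby('category', as_index=False)['value'].mean()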

fig = px.bar(dfg, x = 'category', y = 'value', color = 'category',
             category_orders = {'category':['Campus Facilities','Clubs & Societies','Students Union','Careers & Services','Wifi']},
             color_discrete_map = {
                    'Campus Facilities' : 'red',
                    'Clubs & Societies' : 'blue',
                    'Students Union' : 'pink',
                    'Careers & Services' : 'grey',
                    'Wifi' : 'orange'})

fig.update_yaxes(title = 'Mean rating (1-5)')
fig.show()

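Appendix: why dfm and dfg?

px.bar(color = 'variable') assigns a color to each unique value in a series or pandas column named 'variable'. But the categories we're interested in from your dataframe are spread across multiple columns. So what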

dfm = pd.melt(df, id_vars=['id', 'overall_rating', 'user_name', 'date', 'comments'],
              value_vars=list(df.columns[5:]),
              var_name ='category')

does is take the following columns:

[image]

and stack them into a single column, named category here through var_name, like this:

[image]
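The stacking step can be seen on a toy frame. This is a minimal sketch with made-up data; the names wide and long are hypothetical and only two rating columns are used.

```python
import pandas as pd

# Hypothetical wide frame: one row per review, one column per category.
wide = pd.DataFrame({
    "id": [1, 2],
    "Wifi": [5, 3],
    "Students Union": [4, 5],
})

# Stack the two rating columns into a single 'category' column;
# the ratings themselves land in a column called 'value' by default.
long = pd.melt(wide, id_vars=["id"],
               value_vars=["Wifi", "Students Union"],
               var_name="category")

print(long)
```

The result has four rows and the columns id, category and value: each review appears once per category.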

But this is still the raw data, and that's not what you're after; you want the mean of each group in that column. And that's what

dfm.groupby('category', as_index=False)['value'].mean()

gives us:

[image]
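The aggregation step can be sketched the same way on toy data. The names wide, long and shortcut are hypothetical, and the shortcut at the end is an alternative not used in the answer: it takes the column means directly and skips the melt.

```python
import pandas as pd

# Hypothetical wide frame with two rating columns.
wide = pd.DataFrame({"Wifi": [5, 3], "Students Union": [4, 5]})

# Long format: one row per (review, category) pair.
long = wide.melt(var_name="category")

# Mean rating per category; groupby sorts the group keys alphabetically.
dfg = long.groupby("category", as_index=False)["value"].mean()

# Alternative sketch: column means reshaped into the same two-column
# layout, keeping the original column order.
shortcut = wide.mean().rename_axis("category").reset_index(name="value")

print(dfg)
print(shortcut)
```

Because groupby orders the categories alphabetically, the bar order would change without category_orders, which is presumably why the answer passes it to px.bar().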

See pd.melt() and df.groupby() for more details.