use*_*373 5 python numpy pandas seaborn
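I have a DataFrame with many rows and columns. Many rows have no value for some of the columns, so those cells show up as NaN in the DataFrame. A sample of the DataFrame looks like this: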
df.head()
GEN Sample_1 Sample_2 Sample_3 Sample_4 Sample_5 Sample_6 Sample_7 Sample_8 Sample_9 Sample_10 Sample_11 Sample_12 Sample_13 Sample_14
A123 9.4697 3.19689 4.8946 8.54594 13.2568 4.93848 3.16809 NAN NAN NAN NAN NAN NAN NAN
A124 6.02592 4.0663 3.9218 2.66058 4.38232 NAN NAN NAN NAN NAN NAN NAN
A125 7.88999 2.51576 4.97483 5.8901 21.1346 5.06414 15.3094 2.68169 8.12449 NAN NAN NAN NAN NAN
A126 5.99825 10.2186 15.2986 7.53729 4.34196 8.75048 16.9358 5.52708 NAN NAN NAN NAN NAN NAN
A127 28.5014 4.86702 NAN NAN NAN NAN NAN NAN NAN NAN NAN NAN NAN NAN
I want to plot a histogram of this DataFrame with seaborn, so I tried the following lines:
sns.set(color_codes=True)
sns.set(style="white", palette="muted")
sns.distplot(df)
But it throws the following error:
ValueError Traceback (most recent call last)
<ipython-input-80-896d7fe85ef3> in <module>()
1 sns.set(color_codes=True)
2 sns.set(style="white", palette="muted")
----> 3 sns.distplot(df)
/anaconda3/lib/python3.4/site-packages/seaborn/distributions.py in distplot(a, bins, hist, kde, rug, fit, hist_kws, kde_kws, rug_kws, fit_kws, color, vertical, norm_hist, axlabel, label, ax)
210 hist_color = hist_kws.pop("color", color)
211 ax.hist(a, bins, orientation=orientation,
--> 212 color=hist_color, **hist_kws)
213 if hist_color != color:
214 hist_kws["color"] = hist_color
/anaconda3/lib/python3.4/site-packages/matplotlib/axes/_axes.py in hist(self, x, bins, range, normed, weights, cumulative, bottom, histtype, align, orientation, rwidth, log, color, label, stacked, **kwargs)
5627 color = mcolors.colorConverter.to_rgba_array(color)
5628 if len(color) != nx:
-> 5629 raise ValueError("color kwarg must have one color per dataset")
5630
5631 # We need to do to 'weights' what was done to 'x'
ValueError: color kwarg must have one color per dataset
Any help or suggestions for getting rid of this error would be much appreciated!
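For reference, the last frame of the traceback can be reproduced with matplotlib alone. The sketch below assumes made-up random data and a headless backend. It shows that `Axes.hist` treats each column of a 2-D input as a separate dataset, so a single `color` value no longer matches the number of datasets.

```python
import matplotlib
matplotlib.use("Agg")  # headless backend, so no display is needed

import matplotlib.pyplot as plt
import numpy as np

# Hypothetical data: 20 rows, 3 columns. hist() sees 3 datasets here.
data = np.random.default_rng(0).normal(size=(20, 3))

fig, ax = plt.subplots()
try:
    # One color for three datasets: this is the mismatch from the traceback.
    ax.hist(data, color="b")
    error = None
except ValueError as exc:
    error = str(exc)

print(error)
```

Passing one column at a time, or one color per column, avoids the mismatch.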
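I also thought the seaborn documentation said that multiple columns can be plotted at once, each highlighted in its own color by default.
But after rereading it, I couldn't find anything like that. I think I inferred it from this tutorial instead, which at one point plots a DataFrame with multiple columns.
The "solution" is trivial, though, and hopefully exactly what you are looking for: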
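import seaborn as sns

sns.set(color_codes=True)
sns.set(style="white", palette="muted")

# distplot takes one 1-D series per call (passing the whole DataFrame is what
# raises the ValueError), so plot column by column, dropping NaN values first.
for col_id in df.columns:
    sns.distplot(df[col_id].dropna())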
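By default this cycles through the colors, "knowing" which ones have already been used.
Note: I used a different dataset, because I wasn't sure how to recreate yours.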
I ran into a similar problem because the column I wanted to plot (my_column) in my pandas.DataFrame held elements of type object. So the command:
print(df[my_column])
gave me:
Length: 150, dtype: object
The solution was:
sns.distplot(df[my_column].astype(float))
because this converts the data type of my_column to:
Length: 150, dtype: float64
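If the object column contains entries that cannot be parsed as numbers, `astype(float)` raises a ValueError. One possible alternative, sketched here on a made-up column, is `pd.to_numeric` with `errors="coerce"`, which turns such entries into NaN so they can be dropped before plotting. The values and the "n/a" marker are assumptions for illustration.

```python
import pandas as pd

# Hypothetical column: numbers stored as strings, plus one unparseable marker.
raw = pd.Series(
    ["9.4697", "3.19689", "n/a", "4.8946"], dtype=object, name="my_column"
)
print(raw.dtype)  # object

# Coerce to float; anything that is not a number becomes NaN.
numeric = pd.to_numeric(raw, errors="coerce")
print(numeric.dtype)  # float64

# Drop the NaN entries before handing the column to the plotting call.
clean = numeric.dropna()
```

The resulting `clean` series can then be passed to the plotting function in place of `df[my_column].astype(float)`.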