Dud*_*ude 4 python numpy matplotlib
I am trying to generate a box plot. Here is my code, with the data shown below:
def loadData(fileName):
    data = pd.read_csv(fileName, quotechar='"')
    cols = data.columns.tolist()
    cols = cols[1:] + [ cols[0] ]
    data = data[cols]
    return data.values

cols={}
cols['close/last']=0
cols['volumne']=1
cols['open']=2
cols['high']=3
cols['low']=4
cols['date']=5
fileName = 'microsoft.csv'

def boxplot():
    data1 = loadData(fileName)
    ithattr1 = cols['high']
    ithattr2 = cols['close/last']
    dataset1 = data1[:,ithattr1]
    dataset2 = data1[:,ithattr2]
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.boxplot([dataset1,dataset2])
    plt.show()

boxplot()
The data is floating point, as verified by a print command that outputs
<type 'float'>. When I run the code, I get the following error (full stack trace below):
AttributeError: 'numpy.ndarray' object has no attribute 'find'
My data (e.g. in dataset1) looks like this:
[52.21 52.2 52.44 52.65 52.33 51.58 51.38 51.68 51.97 53.4163 54.07 53.1
52.85 53.28 53.485 54.4001 55.39 54.8 56.19 56.78 56.85 55.95 55.96 55.88
55.48 55.35 56.0 56.79 56.245 55.9 55.21 55.1 55.655 55.87 56.1 55.97
.........................................
27.54 27.66 28.02 28.05 27.97 28.19 28.13]
Printing data.shape gives (756,). A few rows of the input data:
2016/01/29,97.3400,64332440.0000,94.7900,97.3400,94.3500
2016/01/28,94.0900,55622370.0000,93.7900,94.5200,92.3900
2016/01/27,93.4200,133059000.0000,96.0400,96.6289,93.3400
2016/01/26,99.9900,71937310.0000,99.9300,100.8800,98.0700
2016/01/25,99.4400,51529980.0000,101.5200,101.5300,99.2100
Traceback (most recent call last):
  File "plot_curves.py", line 100, in <module>
    boxplot()
  File "plot_curves.py", line 96, in boxplot
    ax.boxplot([dataset1,dataset2])
  File "/home/rohit/anaconda/lib/python2.7/site-packages/matplotlib/axes/_axes.py", line 3118, in boxplot
    manage_xticks=manage_xticks)
  File "/home/rohit/anaconda/lib/python2.7/site-packages/matplotlib/axes/_axes.py", line 3480, in bxp
    flier_x, flier_y, **final_flierprops
  File "/home/rohit/anaconda/lib/python2.7/site-packages/matplotlib/axes/_axes.py", line 3361, in doplot
    return self.plot(*args, **kwargs)
  File "/home/rohit/anaconda/lib/python2.7/site-packages/matplotlib/axes/_axes.py", line 1373, in plot
    for line in self._get_lines(*args, **kwargs):
  File "/home/rohit/anaconda/lib/python2.7/site-packages/matplotlib/axes/_base.py", line 304, in _grab_next_args
    for seg in self._plot_args(remaining, kwargs):
  File "/home/rohit/anaconda/lib/python2.7/site-packages/matplotlib/axes/_base.py", line 263, in _plot_args
    linestyle, marker, color = _process_plot_format(tup[-1])
  File "/home/rohit/anaconda/lib/python2.7/site-packages/matplotlib/axes/_base.py", line 85, in _process_plot_format
    if fmt.find('--') >= 0:
AttributeError: 'numpy.ndarray' object has no attribute 'find'
Does anyone have any idea how to fix this?
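The immediate cause of the problem is that dataset1 and dataset2 are ndarrays with dtype == object.
Although your values are read in as floats, the array returned by values has dtype object, and the columns you slice out of it (at the line dataset1 = data1[:,ithattr1]) keep that dtype. This happens because you are effectively pulling the data row-wise and then extracting a column: each row contains both floats and a string (the date), so numpy has to coerce everything to the most specific common data type, which is object.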
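The dtype change can be reproduced without the CSV file. The following is a minimal sketch, assuming a small hand-made frame in place of microsoft.csv (the column names and values are illustrative, borrowed from the sample rows in the question):

```python
import pandas as pd

# Illustrative frame: two float columns plus a string date column,
# mimicking the layout of the data in the question.
df = pd.DataFrame({
    "high": [97.34, 94.52, 96.6289],
    "close/last": [97.34, 94.09, 93.42],
    "date": ["2016/01/29", "2016/01/28", "2016/01/27"],
})

print(df.dtypes)          # the numeric columns are float64 on their own

values = df.values        # a single 2-D array needs a single dtype
print(values.dtype)       # object, because floats and strings are mixed

column = values[:, 0]     # slicing keeps the dtype of the parent array
print(column.dtype)       # still object
print(type(column[0]))    # the individual elements are still floats
```

This is consistent with the print check in the question: each element is a float, but the array that holds them is not a float array.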
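You can fix this in a couple of ways. One is simply to convert the arrays to lists, at which point the values are handled as ordinary Python floats, i.e. change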
ax.boxplot([dataset1,dataset2])
to
ax.boxplot([list(dataset1),list(dataset2)])
The other is to add lines that set the type explicitly:
# np.float was removed in NumPy 1.24; the builtin float is equivalent here
dataset1 = dataset1.astype(float)
dataset2 = dataset2.astype(float)
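Both fixes can be checked on a small array. This sketch uses a hand-built object array as a stand-in for dataset1 (the values are copied from the question; the names as_list and as_float exist only for illustration):

```python
import numpy as np

# Illustrative object-dtype array, standing in for dataset1 from the question.
dataset1 = np.array([52.21, 52.2, 52.44, 52.65], dtype=object)
print(dataset1.dtype)                 # object

# Fix 1: a list of plain Python floats. NumPy infers float64 when the
# list is turned back into an array.
as_list = list(dataset1)
print(np.asarray(as_list).dtype)      # float64

# Fix 2: convert the dtype explicitly.
as_float = dataset1.astype(float)
print(as_float.dtype)                 # float64
```

Either way the plotting code ends up with numeric data rather than an object array.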
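This is a gotcha whenever you access, by index, a pandas DataFrame or numpy array whose columns hold different data types. It is hard to debug (it took me a while to work this out, and I have seen it before - see the edit history).
However, handling your data by numeric index also means you end up having to reorder columns and so on in your loadData function. A better approach is to let pandas do all the heavy lifting on types...
As an example, I have rewritten your code in (what I think is) more conventional pandas/python style. It is a bit shorter and does not need the hack of converting the data to lists that I gave you above. The code is below, followed by the plot it produces (using the input data snippet from your question).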
import matplotlib.pyplot as plt
import pandas as pd
import numpy as np

def loadData(filename,cols):
    data = pd.read_csv(filename, quotechar='"',names=cols,header=None)
    return data

def boxplot(filename,cols):
    data1 = loadData(filename,cols)
    fig = plt.figure()
    ax = fig.add_subplot(111)
    ax.boxplot([data1['high'],data1['close/last']])
    plt.show()

cols=['date','close/last','volume','open','high','low']
filename = 'microsoft.csv'
boxplot(filename,cols)
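To see why selecting columns by name avoids the problem, the dtypes can be inspected directly. This is a rough sketch that reads three of the sample rows from the question out of an in-memory buffer instead of microsoft.csv (the io.StringIO buffer is an assumption made to keep the example self-contained):

```python
import io
import pandas as pd

# Sample rows copied from the question, read from memory instead of a file.
sample = io.StringIO(
    "2016/01/29,97.3400,64332440.0000,94.7900,97.3400,94.3500\n"
    "2016/01/28,94.0900,55622370.0000,93.7900,94.5200,92.3900\n"
    "2016/01/27,93.4200,133059000.0000,96.0400,96.6289,93.3400\n"
)
cols = ['date', 'close/last', 'volume', 'open', 'high', 'low']
data = pd.read_csv(sample, names=cols, header=None)

# Each numeric column keeps its own float dtype.
print(data['high'].dtype)                          # float64

# Selecting only numeric columns never mixes in the date strings,
# so the resulting array is float as well.
print(data[['high', 'close/last']].values.dtype)   # float64
```

Because the date strings never share an array with the prices, no conversion step is needed before plotting.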