jen*_*ryb 9 python whitespace strip dataframe pandas
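My code raises an error when I try to build a dataframe from the elements of a CSV file. I read two columns from the file: CompanyName and QualityIssue. There are three types of quality issue: Equipment Quality, User, and Neither. I run into trouble when I try to access the column as df.Equipment Quality, which obviously doesn't work because of the space in the name. I want to take Equipment Quality from the original file and replace the space with an underscore.
Input: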
Top Calling Customers, Equipment Quality, User, Neither,
Customer 3, 2, 2, 0,
Customer 1, 0, 2, 1,
Customer 2, 0, 1, 0,
Customer 4, 0, 1, 0,
Here is my code:
import numpy as np
import pandas as pd
import pandas.util.testing as tm; tm.N = 3
# Get the data.
data = pd.DataFrame.from_csv('MYDATA.csv')
# Group the data by calling CompanyName and QualityIssue columns.
byqualityissue = data.groupby(["CompanyName", "QualityIssue"]).size()
# Make a pandas dataframe of the grouped data.
df = pd.DataFrame(byqualityissue)
# Change the formatting of the data to match what I want SpiderPlot to read.
formatted = df.unstack(level=-1)[0]
# Replace NaN values with zero.
formatted[np.isnan(formatted)] = 0
includingtotals = pd.concat([formatted, pd.DataFrame(formatted.sum(axis=1),
                             columns=['Total'])], axis=1)
sortedtotal = includingtotals.sort_index(by=['Total'], ascending=[False])
sortedtotal.to_csv('byqualityissue.csv')
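For readers on a recent pandas release, where `DataFrame.from_csv`, `pandas.util.testing` and `sort_index(by=...)` are deprecated or removed, the same steps can be sketched roughly as below. This is not the asker's code: the in-memory rows are a hypothetical stand-in for MYDATA.csv (whose contents are not shown), chosen so that the counts match the input table above, and the snake_case variable names are my own.

```python
import io

import pandas as pd

# Hypothetical stand-in for MYDATA.csv; the real file is not shown in the question.
raw = io.StringIO(
    "CompanyName,QualityIssue\n"
    "Customer 1,User\n"
    "Customer 1,User\n"
    "Customer 1,Neither\n"
    "Customer 2,User\n"
    "Customer 3,Equipment Quality\n"
    "Customer 3,Equipment Quality\n"
    "Customer 3,User\n"
    "Customer 3,User\n"
    "Customer 4,User\n"
)
data = pd.read_csv(raw)

# Count rows per (CompanyName, QualityIssue) pair, then pivot the issue types
# into columns. fill_value=0 fills the missing combinations instead of NaN.
formatted = data.groupby(["CompanyName", "QualityIssue"]).size().unstack(fill_value=0)

# Add a per-company total and sort by it, largest first.
including_totals = formatted.assign(Total=formatted.sum(axis=1))
sorted_total = including_totals.sort_values("Total", ascending=False)
```

Calling `sorted_total.to_csv('byqualityissue.csv')` afterwards would write the result out as in the original script.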
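This seems to be a frequently asked question, and I have tried many of the suggested solutions, but none of them seemed to work. Here is what I tried: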
with open('byqualityissue.csv', 'r') as f:
    reader = csv.reader(f, delimiter=',', quoting=csv.QUOTE_NONE)
    return [[x.strip() for x in row] for row in reader]
sentence.replace(" ", "_")
and
sortedtotal['QualityIssue'] = sortedtotal['QualityIssue'].map(lambda x: x.rstrip(' '))
and the one I thought was most promising, from http://pandas.pydata.org/pandas-docs/stable/text.html:
formatted.columns = formatted.columns.str.strip().str.replace(' ', '_')
But I got this error: AttributeError: 'Index' object has no attribute 'str'
Thanks for your help!
Try:
formatted.columns = [x.strip().replace(' ', '_') for x in formatted.columns]
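To see the list comprehension in action, here is a minimal self-contained sketch. The small frame is hypothetical, modelled on the header row in the question (including the leading space after each comma); only the renaming line comes from the answer.

```python
import pandas as pd

# Hypothetical frame whose column labels carry leading and embedded spaces,
# modelled on the header row shown in the question.
formatted = pd.DataFrame(
    [[2, 2, 0], [0, 2, 1]],
    index=["Customer 3", "Customer 1"],
    columns=[" Equipment Quality", " User", " Neither"],
)

# Plain Python string methods on each label: this does not rely on Index.str,
# so it should also run on pandas versions that raise the AttributeError above.
formatted.columns = [x.strip().replace(" ", "_") for x in formatted.columns]

# With underscores in place, attribute-style access works.
equipment = formatted.Equipment_Quality
```

On a recent pandas release, where `Index` does have a `.str` accessor, `formatted.columns.str.strip().str.replace(' ', '_')` should give the same labels. Bracket indexing such as `formatted['Equipment Quality']` also works with spaces in the name, so renaming is only needed for attribute-style access.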