import pandas as pd
path1 = "/home/supertramp/Desktop/100&life_180_data.csv"
mydf = pd.read_csv(path1)
numcigar = {"Never": 0, "1-5 Cigarettes/day": 1, "10-20 Cigarettes/day": 4}
print(mydf['Cigarettes'])
mydf['CigarNum'] = mydf['Cigarettes'].apply(numcigar.get).astype(float)
print(mydf['CigarNum'])
mydf.to_csv('/home/supertramp/Desktop/powerRangers.csv')
The CSV file "100&life_180_data.csv" contains columns such as Age, BMI, Cigarettes, and Alcohol.
No int64
Age int64
BMI float64
Alcohol object
Cigarettes object
dtype: object
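The Cigarettes column contains "Never", "1-5 Cigarettes/day", and "10-20 Cigarettes/day". I want to assign weights to these values (Never, 1-5 Cigarettes/day, ...).
The expected output is a new appended column, CigarNum, containing only the numbers 0, 1, 2. CigarNum is as expected up to row 8, and then shows NaN through the last row of the CigarNum column.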
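A lookup like this returns NaN for any label that is not an exact key of the dictionary, so one thing worth checking is whether the later rows carry stray whitespace or a category the dictionary does not list. The sketch below is only an illustration under that assumption: the sample rows, the padded strings, and the extra "5-10 Cigarettes/day" label are made up, and the real cause in the file may be different.

```python
import pandas as pd

# Hypothetical sample standing in for the Cigarettes column; the stray
# spaces and the unlisted category are assumptions for illustration.
df = pd.DataFrame({
    "Cigarettes": [
        "Never",
        "1-5 Cigarettes/day",
        "Never ",
        " 10-20 Cigarettes/day",
        "5-10 Cigarettes/day",
    ]
})

# Same dictionary as in the question.
numcigar = {"Never": 0, "1-5 Cigarettes/day": 1, "10-20 Cigarettes/day": 4}

# Strip surrounding whitespace first, then map each label to its weight.
cleaned = df["Cigarettes"].str.strip()
df["CigarNum"] = cleaned.map(numcigar)

# Labels with no dictionary entry end up as NaN; list them to see what is missing.
unmapped = sorted(cleaned[df["CigarNum"].isna()].unique())
print(unmapped)
print(df)
```

Printing the unmapped labels shows which keys would have to be added to the dictionary (or which spellings differ) before the column can be fully numeric.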
0 Never
1 Never
2 1-5 Cigarettes/day
3 Never
4 Never
5 Never
6 Never
7 Never
8 Never
9 Never
10 Never
11 Never
12 10-20 Cigarettes/day
13 1-5 Cigarettes/day
14 Never
...
167 Never
168 …

I ran into the error
'>' not supported between instances of 'str' and 'int'
when trying to print the following lines from a Pandas DataFrame:
print (survey_df_clean.shape)
print (survey_df_clean[survey_df_clean['text']>30].shape)
Should I try converting them to int? How would that work in this statement?
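The error says the `text` column holds strings, so the comparison needs either a numeric conversion or a different test. The sketch below shows both readings; the sample values are invented, and which case applies depends on what `text` actually contains, which the question does not say.

```python
import pandas as pd

# Hypothetical stand-in for survey_df_clean; the contents are assumptions.
survey_df_clean = pd.DataFrame({"text": ["12", "45", "n/a", "31"]})

# Case 1: the column holds numbers stored as strings.
# errors="coerce" turns unparseable entries into NaN, and NaN > 30 is False,
# so those rows are simply left out of the filtered frame.
as_number = pd.to_numeric(survey_df_clean["text"], errors="coerce")
print(survey_df_clean[as_number > 30].shape)

# Case 2: the column holds free text and the intent is
# "rows whose text is longer than 30 characters".
long_text = survey_df_clean["text"].str.len() > 30
print(survey_df_clean[long_text].shape)
```

A plain `.astype(int)` would also work for case 1, but only if every entry parses cleanly; `pd.to_numeric` with `errors="coerce"` is the more forgiving choice when the column may contain junk.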

I have a .csv containing biographical data on the US Congress, which I read in as a Pandas df:
df = pd.read_csv('congress100.csv', delimiter = ';', names = ['Name', 'Position', 'Party', 'State', 'Congress'], header = 0)
My DataFrame looks like this:
0 'ACKERMAN, Gary Leonard' 'Representative' 'Democrat' 'NY' '100(1987-1988)'
1 'ADAMS, Brockman (Brock)' 'Senator' 'Democrat' 'WA' '100(1987-1988)'
2 'AKAKA, Daniel Kahikina' 'Representative' 'Democrat' 'HI' '100(1987-1988)'
3 'ALEXANDER, William Vollie (Bill), Jr.' 'Representative' 'Democrat' 'AR' '100(1987-1988)'
4 'ANDERSON, Glenn Malcolm' 'Representative' 'Democrat' 'CA' '100(1987-1988)'
5 'ANDREWS, Michael Allen' 'Representative' 'Democrat' 'TX' '100(1987-1988)'
6 'ANNUNZIO, Frank' 'Representative' 'Democrat' 'IL' …