Lar*_*Cai 5 python numpy dataframe pandas
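This is a pandas DataFrame containing each person's score for every day. I want to add a column that counts how many times each person had the top score of the day (more than one person can share the top score, and some values are NaN).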
import pandas as pd
import numpy as np
data = np.array([['','day1','day2','day3','day4','day5'],
['larry',1,4,7,3,5],
['niko',2,-1,3,6,4],
['tin',np.nan,5,5, 6,7]])
df = pd.DataFrame(data=data[1:,1:],
index=data[1:,0],
columns=data[0,1:])
print(df)
Output:
      day1 day2 day3 day4 day5
larry    1    4    7    3    5
niko     2   -1    3    6    4
tin    nan    5    5    6    7
The expected result is (larry: 1 time, niko: 2 times, tin: 3 times):
      times_of_top day1 day2 day3 day4 day5
larry            1    1    4    7    3    5
niko             2    2   -1    3    6    4
tin              3  nan    5    5    6    7
niko has the top score on day1 and day4, so his times_of_top is 2.
tin has the top score on day2, day4 and day5, so his times_of_top is 3.
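To make the tie handling concrete, here is a minimal sketch that lists who holds the top score on each day. It assumes the sample values are rebuilt as numbers from a dict; the names `scores`, `is_top` and `top_by_day` are illustrative and not from the original post.

```python
import numpy as np
import pandas as pd

# Assumption: same sample values as the question, built as numbers directly.
scores = pd.DataFrame(
    {
        "day1": [1, 2, np.nan],
        "day2": [4, -1, 5],
        "day3": [7, 3, 5],
        "day4": [3, 6, 6],
        "day5": [5, 4, 7],
    },
    index=["larry", "niko", "tin"],
)

# True where a score equals that day's maximum; max() skips NaN by default,
# and a NaN score never compares equal, so it is never counted as a top score.
is_top = scores.eq(scores.max())

# Names of the top scorers per day (a tie yields more than one name).
top_by_day = {
    day: is_top.index[is_top[day].to_numpy()].tolist() for day in is_top.columns
}
print(top_by_day)
```

In this sample, day4 is the only tie: both niko and tin appear in its list, which is why each of them gets a count for that day.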
One way, using pandas.DataFrame.stack and count:
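# The sample data are stored as strings (object dtype), so convert to float first.
df = df.astype(float)
# Keep only each day's top scores, stack them into a Series, then count per person.
# (Series.count(level=0) was removed in pandas 2.0; groupby(level=0).count() replaces it.)
df["times_of_top"] = df[df == df.max()].stack().groupby(level=0).count()
print(df)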
Output:
       day1  day2  day3  day4  day5  times_of_top
larry   1.0   4.0   7.0   3.0   5.0             1
niko    2.0  -1.0   3.0   6.0   4.0             2
tin     NaN   5.0   5.0   6.0   7.0             3
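As an alternative that is not part of the original answer, the same count can be sketched without stacking: compare every score against its column maximum and sum the resulting booleans across each row. The sketch below assumes the scores are already numeric; the `day_cols` helper name is illustrative.

```python
import numpy as np
import pandas as pd

# Assumption: the question's sample values, entered as numbers rather than strings.
df = pd.DataFrame(
    [[1, 4, 7, 3, 5], [2, -1, 3, 6, 4], [np.nan, 5, 5, 6, 7]],
    index=["larry", "niko", "tin"],
    columns=["day1", "day2", "day3", "day4", "day5"],
)

# Remember the score columns so the new column is not included in the comparison.
day_cols = list(df.columns)

# eq() aligns the per-day maxima with the columns; summing booleans along
# axis=1 counts how many days each person matched the maximum.
df["times_of_top"] = df[day_cols].eq(df[day_cols].max()).sum(axis=1)
print(df)
```

One design difference worth noting: with this approach a person who never has the top score gets 0 in times_of_top.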