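I am trying to shuffle indices with np.random.shuffle(), but I keep getting an error that I don't understand. I would appreciate any help in solving this. Thanks!
When I first defined the raw_csv_data variable, I tried passing delimiter=',' and delim_whitespace=0, because I had seen that suggested as the solution to another question, but it always throws the same error.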
import pandas as pd
import numpy as np
from sklearn.preprocessing import StandardScaler
#%%
raw_csv_data= pd.read_csv('Absenteeism-data.csv')
print(raw_csv_data)
#%%
df= raw_csv_data.copy()
print(display(df))
#%%
pd.options.display.max_columns=None
pd.options.display.max_rows=None
print(display(df))
#%%
print(df.info())
#%%
df=df.drop(['ID'], axis=1)
#%%
print(display(df.head()))
#%%
#Our goal is to see who is more likely to be absent. Let's define
#our targets from our dependent variable, Absenteeism Time in Hours
print(df['Absenteeism Time in Hours'])
print(df['Absenteeism Time in Hours'].median())
#%%
targets= np.where(df['Absenteeism Time in Hours']>df['Absenteeism Time in Hours'].median(),1,0)
#%%
print(targets) …

I have a DataFrame like this:
match_id  inn1  bat  bowl  runs1  inn2  runs2  is_score_chased
       1     1  KKR   RCB    222     2     82                1
       2     1  CSK  KXIP    240     2    207                1
       8     1  CSK    MI    208     2    202                1
       9     1   DC    RR    214     2    217                1
      33     1  KKR    DC    204     2    181                1
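Now I want to change the values in the is_score_chased column by comparing the values in runs1 and runs2. If runs1 > runs2, the corresponding value in the row should be "yes"; otherwise it should be "no". I tried the following code: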
for i in (high_scores1):
    if(high_scores1['runs1']>=high_scores1['runs2']):
        high_scores1['is_score_chased']='yes'
    else:
        high_scores1['is_score_chased']='no'
But it didn't work. How can I change the values in the column?
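
One possible approach, offered as a sketch rather than a definitive answer: in the attempt above, `high_scores1['runs1']>=high_scores1['runs2']` compares two whole columns and produces a boolean Series, which `if` cannot reduce to a single True or False, so pandas raises a ValueError about an ambiguous truth value. A vectorized comparison with `np.where` picks a value per row instead. The sample rows below are copied from the table in the question and are assumed to match the real `high_scores1`; the strict `>` follows the prose description rather than the `>=` used in the attempt.

```python
import numpy as np
import pandas as pd

# Sample rows copied from the question; the real high_scores1 is assumed to look like this.
high_scores1 = pd.DataFrame({
    "match_id": [1, 2, 8, 9, 33],
    "inn1": [1, 1, 1, 1, 1],
    "bat": ["KKR", "CSK", "CSK", "DC", "KKR"],
    "bowl": ["RCB", "KXIP", "MI", "RR", "DC"],
    "runs1": [222, 240, 208, 214, 204],
    "inn2": [2, 2, 2, 2, 2],
    "runs2": [82, 207, 202, 217, 181],
    "is_score_chased": [1, 1, 1, 1, 1],
})

# Compare the two columns element-wise and choose "yes" or "no" for each row.
high_scores1["is_score_chased"] = np.where(
    high_scores1["runs1"] > high_scores1["runs2"], "yes", "no"
)

print(high_scores1[["runs1", "runs2", "is_score_chased"]])
```

With these rows, only match 9 (214 against 217) ends up as "no".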