我有一个数据框df:
id volume saturation time_delay_normalised speed BPR_free_speed BPR_speed Volume time_normalised
27WESTBOUND 580 0.351515152 57 6.54248366 17.88 15.91366177 580 1.59375
27WESTBOUND 588 0.356363636 100 5.107142857 17.88 15.86519847 588 2.041666667
27WESTBOUND 475 0.287878788 64 6.25625 17.88 16.51161331 475 0.666666667
27EASTBOUND 401 0.243030303 59 6.458064516 17.88 16.88283672 401 1.0914583333
27EASTBOUND 438 0.265454545 46 7.049295775 17.88 16.70300418 438 1.479166667
27EASTBOUND 467 0.283030303 58 6.5 17.88 16.55392848 467 0.9604166667
Run Code Online (Sandbox Code Playgroud)
我希望创建一个新的列,free_capacity并将其设置为最大值Volume,每ID当time_normalised大于或等于小于1.1
不考虑time_normalized条件,我可以这样做:
df['free_capacity'] = df.groupby('id')["Volume"].transform('max')
Run Code Online (Sandbox Code Playgroud)
如何添加when time_normalised <= 1.1条件?
编辑
@jezrael建议以下内容:
df.loc[df['time_normalised'] <= 1.1, 'free_capacity'] = df.loc[df['time_normalised'] <= 1.1].groupby('id')["Volume"].transform('max')
Run Code Online (Sandbox Code Playgroud)
这使:
id volume saturation time_delay_normalised speed \
27WESTBOUND 580 0.351515 57 6.542484
27WESTBOUND 588 0.356364 100 5.107143
27WESTBOUND 475 0.287879 64 6.256250
27EASTBOUND 401 0.243030 59 6.458065
27EASTBOUND 438 0.265455 46 7.049296
27EASTBOUND 467 0.283030 58 6.500000
BPR_free_speed BPR_speed Volume time_normalised free_capacity
17.88 15.913662 580 1.593750 NaN
17.88 15.865198 588 2.041667 NaN
17.88 16.511613 475 0.666667 475.0
17.88 16.882837 401 1.091458 467.0
17.88 16.703004 438 1.479167 NaN
17.88 16.553928 467 0.960417 467.0
Run Code Online (Sandbox Code Playgroud)
但是,我仍然希望将free_capacity的值归因于 id
因此,我尝试:
df['free_capacity'] = df.loc[df['time_normalised'] <= 1.1].groupby('id')["Volume"].transform('max')
Run Code Online (Sandbox Code Playgroud)
但是,这仍然会导致NaN值。1.1 time_normalized条件是用于查找值,而不是限制其应用。
预期的结果:
id volume saturation time_delay_normalised speed \
27WESTBOUND 580 0.351515 57 6.542484
27WESTBOUND 588 0.356364 100 5.107143
27WESTBOUND 475 0.287879 64 6.256250
27EASTBOUND 401 0.243030 59 6.458065
27EASTBOUND 438 0.265455 46 7.049296
27EASTBOUND 467 0.283030 58 6.500000
BPR_free_speed BPR_speed Volume time_normalised free_capacity
17.88 15.913662 580 1.593750 475.0
17.88 15.865198 588 2.041667 475.0
17.88 16.511613 475 0.666667 475.0
17.88 16.882837 401 1.091458 467.0
17.88 16.703004 438 1.479167 467.0
17.88 16.553928 467 0.960417 467.0
Run Code Online (Sandbox Code Playgroud)
您可以使用where由条件筛选,然后groupby通过Series df['id']用transform:
df['free_capacity'] = df['Volume'].where(df['time_normalised'] <= 1.1)
.groupby(df['id'])
.transform('max')
print df
id volume saturation time_delay_normalised speed \
0 27WESTBOUND 580 0.351515 57 6.542484
1 27WESTBOUND 588 0.356364 100 5.107143
2 27WESTBOUND 475 0.287879 64 6.256250
3 27EASTBOUND 401 0.243030 59 6.458065
4 27EASTBOUND 438 0.265455 46 7.049296
5 27EASTBOUND 467 0.283030 58 6.500000
BPR_free_speed BPR_speed Volume time_normalised free_capacity
0 17.88 15.913662 580 1.593750 475.0
1 17.88 15.865198 588 2.041667 475.0
2 17.88 16.511613 475 0.666667 475.0
3 17.88 16.882837 401 1.091458 467.0
4 17.88 16.703004 438 1.479167 467.0
5 17.88 16.553928 467 0.960417 467.0
Run Code Online (Sandbox Code Playgroud)
如果根据您的条件where用于创建新列,则相同Volume1:
df['Volume1'] = df['Volume'].where(df['time_normalised'] <= 1.1)
print df
id volume saturation time_delay_normalised speed \
0 27WESTBOUND 580 0.351515 57 6.542484
1 27WESTBOUND 588 0.356364 100 5.107143
2 27WESTBOUND 475 0.287879 64 6.256250
3 27EASTBOUND 401 0.243030 59 6.458065
4 27EASTBOUND 438 0.265455 46 7.049296
5 27EASTBOUND 467 0.283030 58 6.500000
BPR_free_speed BPR_speed Volume time_normalised Volume1
0 17.88 15.913662 580 1.593750 NaN
1 17.88 15.865198 588 2.041667 NaN
2 17.88 16.511613 475 0.666667 475.0
3 17.88 16.882837 401 1.091458 401.0
4 17.88 16.703004 438 1.479167 NaN
5 17.88 16.553928 467 0.960417 467.0
Run Code Online (Sandbox Code Playgroud)
df['free_capacity'] = df.groupby('id')["Volume1"].transform('max')
print df
id volume saturation time_delay_normalised speed \
0 27WESTBOUND 580 0.351515 57 6.542484
1 27WESTBOUND 588 0.356364 100 5.107143
2 27WESTBOUND 475 0.287879 64 6.256250
3 27EASTBOUND 401 0.243030 59 6.458065
4 27EASTBOUND 438 0.265455 46 7.049296
5 27EASTBOUND 467 0.283030 58 6.500000
BPR_free_speed BPR_speed Volume time_normalised Volume1 free_capacity
0 17.88 15.913662 580 1.593750 NaN 475.0
1 17.88 15.865198 588 2.041667 NaN 475.0
2 17.88 16.511613 475 0.666667 475.0 475.0
3 17.88 16.882837 401 1.091458 401.0 467.0
4 17.88 16.703004 438 1.479167 NaN 467.0
5 17.88 16.553928 467 0.960417 467.0 467.0
Run Code Online (Sandbox Code Playgroud)