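I'm normalizing some data in Pandas, and filling in the values is taking a very long time. The math seems relatively simple, and there are only about 2,500 rows. Is there a faster way?
As shown below, I did the normalization by hand.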
# normalize the rating columns to values between 0 and 1
df_1['numerator_norm'] = ((df_1['rating_numerator']- df_1['rating_numerator'].min())/(df_1['rating_numerator'].max()- df_1['rating_numerator'].min()))
df_1['denominator_norm'] = ((df_1['rating_denominator']- df_1['rating_denominator'].min())/(df_1['rating_denominator'].max()- df_1['rating_denominator'].min()))
df_1['normalized_rating'] = np.nan
for index, row in df_1.iterrows():
    df_1['normalized_rating'][index] = (df_1['numerator_norm'][index] / df_1['denominator_norm'][index])
This process should take only a few seconds, not 60.
Change:
for index, row in df_1.iterrows():
    df_1['normalized_rating'][index] = (df_1['numerator_norm'][index] /
                                        df_1['denominator_norm'][index])
to:
df_1['normalized_rating'] = df_1['numerator_norm'] / df_1['denominator_norm']
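which performs the division as a single vectorized operation.
iterrows is best avoided; see the question "Does iterrows have performance issues?"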
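For reference, here is a minimal self-contained sketch of the whole normalization done without a loop. The sample values and the `min_max` helper are my own assumptions for illustration, not part of the question; only the column names come from the original code.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names follow the question.
df_1 = pd.DataFrame({
    "rating_numerator": [5.0, 10.0, 12.0, 13.0],
    "rating_denominator": [10.0, 10.0, 20.0, 50.0],
})


def min_max(col: pd.Series) -> pd.Series:
    """Scale a column to the range [0, 1] (helper name is an assumption)."""
    return (col - col.min()) / (col.max() - col.min())


# Each statement operates on whole columns at once, so no Python-level loop runs.
df_1["numerator_norm"] = min_max(df_1["rating_numerator"])
df_1["denominator_norm"] = min_max(df_1["rating_denominator"])
df_1["normalized_rating"] = df_1["numerator_norm"] / df_1["denominator_norm"]

# Min-max scaling maps the smallest denominator to 0, so those rows divide
# by zero: pandas yields inf (or NaN for 0/0) instead of raising an error.
print(df_1)
```

Note that rows holding the minimum denominator come out as inf or NaN, so you may want to decide how to treat them before using `normalized_rating` further.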