Why does the iteration run so slowly?

55t*_*iss 2 python iteration pandas

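I'm normalizing some data in pandas, and the iteration takes a very long time. The math seems relatively simple, and there are only about 2,500 rows. Is there a faster way?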

As shown below, I did the normalization by hand.

# normalize the rating columns to values between 0 and 1
df_1['numerator_norm'] = ((df_1['rating_numerator']- df_1['rating_numerator'].min())/(df_1['rating_numerator'].max()- df_1['rating_numerator'].min()))
df_1['denominator_norm'] = ((df_1['rating_denominator']- df_1['rating_denominator'].min())/(df_1['rating_denominator'].max()- df_1['rating_denominator'].min()))
df_1['normalized_rating'] = np.nan

for index, row in df_1.iterrows():
    df_1['normalized_rating'][index] = (df_1['numerator_norm'][index] / df_1['denominator_norm'][index])

This process should take only a few seconds, not 60.

jez*_*ael 6

Change:

for index, row in df_1.iterrows():
    df_1['normalized_rating'][index] = (df_1['numerator_norm'][index] / df_1['denominator_norm'][index])

to:

df_1['normalized_rating'] = df_1['numerator_norm'] / df_1['denominator_norm']

for vectorized division.

iterrows is best avoided; see the question "Does iterrows have performance issues?"
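
For anyone who wants to try this end to end, here is a minimal sketch of the vectorized version. The sample values and the `min_max` helper are assumptions made up for illustration; only the column names come from the question.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data; only the column names come from the question.
df_1 = pd.DataFrame({
    "rating_numerator": [10.0, 12.0, 14.0, 11.0],
    "rating_denominator": [10.0, 20.0, 15.0, 10.0],
})


def min_max(col: pd.Series) -> pd.Series:
    """Scale a column to the range [0, 1] (illustrative helper, not from the question)."""
    return (col - col.min()) / (col.max() - col.min())


df_1["numerator_norm"] = min_max(df_1["rating_numerator"])
df_1["denominator_norm"] = min_max(df_1["rating_denominator"])

# One vectorized division over whole columns instead of a Python-level loop.
# Rows where denominator_norm is 0 come out as inf (or NaN for 0 / 0)
# rather than raising an exception.
df_1["normalized_rating"] = df_1["numerator_norm"] / df_1["denominator_norm"]

print(df_1)
```

Note that min-max scaling always maps the smallest denominator to 0, so at least one row will divide by zero; whether to filter or replace those rows depends on the data.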