Posts by Kev*_*ook

What is a more efficient way to populate new variables based on other variable results in Pandas?

I'm using Pandas to populate 6 new variables with values that are conditional on other variables in the data. The entire dataset consists of about 700,000 rows and 14 variables (columns), including my newly added ones.

My first approach was to use itertuples(), mainly because I have little experience here. This took around 9,600 seconds.

I've managed to make this faster (~3,500 seconds) by using apply(). Here is an example of one of the new variables.


housing_df = utils.make_data_frame("data/source_data/housing_with_child.dta", "stata") …
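For comparison with the itertuples() and apply() approaches described above, here is a minimal sketch of a vectorized alternative using numpy.select. The real column names and conditions are not shown in the post, so the columns `tenure` and `n_children`, the new variable `household_type`, and its category labels are all hypothetical stand-ins.

```python
import numpy as np
import pandas as pd

# Hypothetical stand-in data; the real columns from the .dta file are not shown.
df = pd.DataFrame({
    "tenure": ["own", "rent", "rent", "own"],
    "n_children": [0, 2, 0, 3],
})

# Each condition is a boolean Series computed for every row at once,
# instead of calling a Python function once per row.
conditions = [
    (df["tenure"] == "own") & (df["n_children"] > 0),
    (df["tenure"] == "rent") & (df["n_children"] > 0),
]
choices = ["owner_with_child", "renter_with_child"]

# np.select takes the choice for the first condition that is True in each row,
# and falls back to the default when none match.
df["household_type"] = np.select(conditions, choices, default="no_child")

print(df)
```

Whether this fits depends on the new variables being expressible as column-wise conditions. If they are, the work is done in compiled array operations rather than in a per-row Python loop, which is where most of the time in apply(axis=1) usually goes.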

python dataframe pandas

Score: 5 · Answers: 1 · Views: 102
