Filling an empty pandas DataFrame using a loop

ccs*_*csv 6 python iteration pandas

Suppose I want to create an empty DataFrame and fill it with values from a loop.

import pandas as pd
import numpy as np

years = [2013, 2014, 2015]
dn=pd.DataFrame()
for year in years:
    df1 = pd.DataFrame({'Incidents': [ 'C', 'B','A'],
                 year: [1, 1, 1 ],
                }).set_index('Incidents')
    print (df1)
    dn=dn.append(df1, ignore_index = False)

Even with ignore_index set to False, append produces a block-diagonal result:

>>> dn
           2013  2014  2015
Incidents                  
C             1   NaN   NaN
B             1   NaN   NaN
A             1   NaN   NaN
C           NaN     1   NaN
B           NaN     1   NaN
A           NaN     1   NaN
C           NaN   NaN     1
B           NaN   NaN     1
A           NaN   NaN     1

[9 rows x 3 columns]

It should look like this:

>>> dn
           2013  2014  2015
Incidents                  
C             1     1     1
B             1     1     1
A             1     1     1

[3 rows x 3 columns]

Is there a better way to do this? Is there a way to fix the append problem?

I have pandas version '0.13.1-557-g300610e'.

unu*_*tbu 11

import pandas as pd

years = [2013, 2014, 2015]
dn = []
for year in years:
    df1 = pd.DataFrame({'Incidents': [ 'C', 'B','A'],
                 year: [1, 1, 1 ],
                }).set_index('Incidents')
    dn.append(df1)
dn = pd.concat(dn, axis=1)
print(dn)

yields

           2013  2014  2015
Incidents                  
C             1     1     1
B             1     1     1
A             1     1     1

Note that calling pd.concat once outside the loop is more time-efficient than calling pd.concat on every iteration of the loop.

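Each call to pd.concat allocates space for a new DataFrame, and all the data from each component DataFrame is copied into that new DataFrame. If you call pd.concat inside the for loop, you end up doing on the order of n**2 copies, where n is the number of years.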
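To make the n**2 estimate concrete, here is a rough counting sketch. It assumes each year's DataFrame counts as one "piece" and that a concat call copies every piece it receives; the function names are hypothetical and exist only for this illustration, and the counts are a simplified model, not a measurement of pandas internals.

```python
def copies_when_concat_in_loop(n):
    """Pieces copied if the running result is rebuilt on every iteration."""
    total = 0
    for i in range(1, n + 1):
        # On iteration i the new result holds i pieces, so i pieces are copied.
        total += i
    return total  # equals n * (n + 1) / 2, which grows like n**2


def copies_when_concat_once(n):
    """Pieces copied if all n pieces are concatenated in a single call."""
    return n


for n in (3, 10, 100):
    print(n, copies_when_concat_in_loop(n), copies_when_concat_once(n))
```

With only three years the difference is negligible, but under this model the gap widens quickly as the number of pieces grows.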

If you accumulate the partial DataFrames in a list and call pd.concat once outside the loop, then Pandas only needs to make n copies to build dn.
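The two patterns can be put side by side as a small sketch. The helper make_piece is a hypothetical name introduced here, and both variants use pd.concat with axis=1 (DataFrame.append, used in the question, was removed in pandas 2.0). The point is only that both patterns should give the same table, while the second calls pd.concat a single time.

```python
import pandas as pd

years = [2013, 2014, 2015]


def make_piece(year):
    """Build one single-column DataFrame indexed by 'Incidents' (illustrative helper)."""
    return pd.DataFrame({'Incidents': ['C', 'B', 'A'],
                         year: [1, 1, 1]}).set_index('Incidents')


# Pattern 1: concatenate inside the loop, rebuilding the result each time.
in_loop = None
for year in years:
    piece = make_piece(year)
    in_loop = piece if in_loop is None else pd.concat([in_loop, piece], axis=1)

# Pattern 2: collect the pieces in a list and concatenate once at the end.
pieces = [make_piece(year) for year in years]
once = pd.concat(pieces, axis=1)

print(once)
print(in_loop.equals(once))
```

Because every piece shares the same index, axis=1 lines the columns up next to each other instead of stacking rows.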