ste*_*esu 6 python dataframe pandas
I have the following two DataFrames:
>>> history
               above  below
asn   country
12345 US           5      4
      MX           6      3
54321 MX           4      5
>>> current
               above  below
asn   country
12345 MX           1      0
54321 MX           0      1
      US           1      0
I keep a running count of the "above" and "below" values in the history DataFrame, like this:
>>> history = history.add(current, fill_value=0)
>>> history
               above  below
asn   country
12345 MX         7.0    3.0
      US         5.0    4.0
54321 MX         4.0    6.0
      US         1.0    0.0
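For anyone who wants to reproduce the running count, here is a minimal sketch that rebuilds the two frames shown above. The construction via `pd.MultiIndex.from_tuples` and the variable name `updated` are assumptions for illustration; the question does not show how the frames were created.

```python
import pandas as pd

# Row index has two levels (asn, country), as in the printed frames.
history = pd.DataFrame(
    {"above": [5, 6, 4], "below": [4, 3, 5]},
    index=pd.MultiIndex.from_tuples(
        [(12345, "US"), (12345, "MX"), (54321, "MX")],
        names=["asn", "country"],
    ),
)
current = pd.DataFrame(
    {"above": [1, 0, 1], "below": [0, 1, 0]},
    index=pd.MultiIndex.from_tuples(
        [(12345, "MX"), (54321, "MX"), (54321, "US")],
        names=["asn", "country"],
    ),
)

# A row that exists in only one frame is treated as 0 in the other.
updated = history.add(current, fill_value=0)
print(updated)
```

The result is float because alignment introduces missing values before `fill_value` is applied.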
This works fine as long as the current DataFrame has no extra columns. But when I add an extra column:
>>> current
               above  below  cruft
asn   country
12345 MX           1      0    999
54321 MX           0      1    999
      US           1      0    999
I get the following:
>>> history = history.add(current, fill_value=0)
>>> history
               above  below  cruft
asn   country
12345 MX         7.0    3.0  999.0
      US         5.0    4.0    NaN
54321 MX         4.0    6.0  999.0
      US         1.0    0.0  999.0
I want this extra column to be ignored, since it does not exist in both DataFrames. The desired output is simply:
>>> history
               above  below
asn   country
12345 MX         7.0    3.0
      US         5.0    4.0
54321 MX         4.0    6.0
      US         1.0    0.0
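The `cruft` column and its single `NaN` in the earlier output come from how `add` aligns its operands: the result carries the union of both frames' row and column labels, and `fill_value` only substitutes when exactly one side is missing. A small sketch, with the frames rebuilt by hand (the construction and the name `result` are assumptions, not part of the question):

```python
import pandas as pd

names = ["asn", "country"]
history = pd.DataFrame(
    {"above": [5, 6, 4], "below": [4, 3, 5]},
    index=pd.MultiIndex.from_tuples(
        [(12345, "US"), (12345, "MX"), (54321, "MX")], names=names
    ),
)
current = pd.DataFrame(
    {"above": [1, 0, 1], "below": [0, 1, 0], "cruft": [999, 999, 999]},
    index=pd.MultiIndex.from_tuples(
        [(12345, "MX"), (54321, "MX"), (54321, "US")], names=names
    ),
)

result = history.add(current, fill_value=0)

# Column labels are the union of both frames, so "cruft" is carried along.
print(result.columns.tolist())

# (12345, "US") has no "cruft" value on either side, so it stays NaN;
# the other rows have a value in `current` only, so the missing side
# is filled with 0 before adding.
print(result["cruft"])
```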
In [27]: history.add(current, fill_value=0)[history.columns]
Out[27]:
               above  below
asn   country
12345 MX         7.0    3.0
      US         5.0    4.0
54321 MX         4.0    6.0
      US         1.0    0.0
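A possible variation on the same idea is to drop the unwanted columns from `current` before adding, so the extra column never enters the result. This is only a sketch: the frame construction and the names `shared` and `updated` are assumptions for illustration.

```python
import pandas as pd

names = ["asn", "country"]
history = pd.DataFrame(
    {"above": [5, 6, 4], "below": [4, 3, 5]},
    index=pd.MultiIndex.from_tuples(
        [(12345, "US"), (12345, "MX"), (54321, "MX")], names=names
    ),
)
current = pd.DataFrame(
    {"above": [1, 0, 1], "below": [0, 1, 0], "cruft": [999, 999, 999]},
    index=pd.MultiIndex.from_tuples(
        [(12345, "MX"), (54321, "MX"), (54321, "US")], names=names
    ),
)

# Keep only the columns of `current` that `history` already tracks.
shared = current.columns.intersection(history.columns)
updated = history.add(current[shared], fill_value=0)
print(updated)
```

Filtering first avoids computing sums for columns that are discarded afterwards; selecting `[history.columns]` after the addition gives the same frame for this data.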
Here is another way:
pd.concat([df1, df2], join='inner', axis=0).groupby(level=[0, 1]).sum()
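A self-contained sketch of the concat approach on the question's frames, using `groupby(level=...)` for the per-key sum (the `level` argument of `DataFrame.sum` was removed in pandas 2.0). The frame construction and the name `combined` are assumptions for illustration.

```python
import pandas as pd

names = ["asn", "country"]
history = pd.DataFrame(
    {"above": [5, 6, 4], "below": [4, 3, 5]},
    index=pd.MultiIndex.from_tuples(
        [(12345, "US"), (12345, "MX"), (54321, "MX")], names=names
    ),
)
current = pd.DataFrame(
    {"above": [1, 0, 1], "below": [0, 1, 0], "cruft": [999, 999, 999]},
    index=pd.MultiIndex.from_tuples(
        [(12345, "MX"), (54321, "MX"), (54321, "US")], names=names
    ),
)

# join="inner" keeps only the columns both frames share, so "cruft"
# is dropped while the rows are stacked.
stacked = pd.concat([history, current], join="inner", axis=0)

# Sum the stacked rows that share the same (asn, country) key.
combined = stacked.groupby(level=["asn", "country"]).sum()
print(combined)
```

Because no missing values are introduced along the way, the counts should stay integers here rather than becoming floats as they do with `add(..., fill_value=0)`.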