Pandas: joining on 2 columns

Question

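I'm having some trouble joining these two DataFrames the way I want. The first df has a hierarchical index; I built it with df1 = df3.groupby(["STATE_PROV_CODE", "COUNTY"]).size() to get the count for each county. The second df is a flat table with one row per county.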

STATE_PROV_CODE  COUNTY            COUNT
AL               Autauga County      1
                 Baldwin County      1
                 Barbour County      1
                 Bibb County         1
                 Blount County       1

  STATE_PROV_CODE          COUNTY ANSI Cl   FIPS
0              AL  Autauga County      H1  01001
1              AL  Baldwin County      H1  01003
2              AL  Barbour County      H1  01005
3              AL     Bibb County      H1  01007
4              AL   Blount County      H1  01009

In SQL I would do something like the following:

SELECT df1.STATE_PROV_CODE, df1.COUNTY, FIPS, COUNT
FROM df1, df2
WHERE df1.STATE_PROV_CODE = df2.STATE_PROV_CODE
AND df1.COUNTY = df2.COUNTY

I would like the result to look like this:

STATE_PROV_CODE  COUNTY            COUNT    FIPS
AL               Autauga County      1     01001
                 Baldwin County      1     01003
                 Barbour County      1     01005
                 Bibb County         1     01007
                 Blount County       1     01009

Answer 1

Aje*_*ean 2

Given the way you have set up the groupby result and the second DataFrame, I believe this merge call will work:

df = pd.merge(df1, df2, left_index=True, right_on=['STATE_PROV_CODE', 'COUNTY'])
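
For anyone who wants to try the call end to end, here is a minimal sketch. The sample rows in df3 and df2 are made up for illustration, and the .rename("COUNT") step is an addition that is not in the original question: size() returns an unnamed Series, and recent pandas versions appear to refuse to merge a Series that has no name, so naming it also gives the count column its label.

```python
import pandas as pd

# Hypothetical sample rows standing in for the asker's df3 and df2.
df3 = pd.DataFrame({
    "STATE_PROV_CODE": ["AL", "AL", "AL"],
    "COUNTY": ["Autauga County", "Baldwin County", "Baldwin County"],
})
df2 = pd.DataFrame({
    "STATE_PROV_CODE": ["AL", "AL"],
    "COUNTY": ["Autauga County", "Baldwin County"],
    "FIPS": ["01001", "01003"],  # kept as strings so leading zeros survive
})

# Count rows per (state, county); the result is a Series with a two-level index.
# Naming the Series is an assumption added here so that pd.merge accepts it.
df1 = df3.groupby(["STATE_PROV_CODE", "COUNTY"]).size().rename("COUNT")

# Match the two index levels of df1 against the two key columns of df2.
merged = pd.merge(df1, df2, left_index=True,
                  right_on=["STATE_PROV_CODE", "COUNTY"])

print(merged[["STATE_PROV_CODE", "COUNTY", "COUNT", "FIPS"]])
```

The default how="inner" keeps only the counties present on both sides; pass how="left" if counties missing from df2 should be kept with a missing FIPS.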

This flattens the MultiIndex into ordinary columns; if you want it back, all you have to do is:

df = df.set_index(['STATE_PROV_CODE', 'COUNTY'])
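
As a rough illustration of that last step, the sketch below starts from a hand-written frame that imitates the merge output (the rows and the variable names merged and result are hypothetical), restores the two-level index, and keeps only the COUNT and FIPS columns shown in the desired output.

```python
import pandas as pd

# Hypothetical stand-in for the frame produced by the merge call above.
merged = pd.DataFrame({
    "COUNT": [1, 1],
    "STATE_PROV_CODE": ["AL", "AL"],
    "COUNTY": ["Autauga County", "Baldwin County"],
    "ANSI Cl": ["H1", "H1"],
    "FIPS": ["01001", "01003"],
})

# Move the key columns back into a two-level index, then select the
# columns the question asks for.
result = merged.set_index(["STATE_PROV_CODE", "COUNTY"])[["COUNT", "FIPS"]]
print(result)

# Rows can then be looked up by (state, county) tuple.
print(result.loc[("AL", "Baldwin County"), "FIPS"])
```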
