How to do a Crosstable in pandas, like in Qlik?

Alh*_*lta 3 python qlikview pandas qliksense

I have a DataFrame:

    import pandas as pd

    df1 = pd.DataFrame({
        'ID': [101, 102],
        'Name': ['Axel', 'Bob'],
        'US': ['GrA', 'GrC'],
        'Europe': ['GrB', 'GrD'],
        'AsiaPac': ['GrZ', 'GrF']
    })

I want to turn it into this:

    df2 = pd.DataFrame({
        'ID': [101, 101, 101, 102, 102, 102],
        'Name': ['Axel', 'Axel', 'Axel', 'Bob', 'Bob', 'Bob'],
        'Region': ['US', 'Europe', 'AsiaPac', 'US', 'Europe', 'AsiaPac'],
        'Group': ['GrA', 'GrB', 'GrZ', 'GrC', 'GrD', 'GrF']
    })

How can I do this? pandas has a crosstab function, but it does not do this. In Qlik, I would simply write:

    Crosstable(Region,Group,2)  
    LOAD
        ID,
        Name,
        US,
        Europe,
        AsiaPac

and that would take me from df1 to df2. How can I do this in Python (pandas or otherwise)?

cma*_*her 7

This is essentially reshaping your data from wide format to long format, as it is known in R parlance. In pandas, you can do this with pd.melt:

    pd.melt(df1, id_vars=['ID', 'Name'], var_name='Region', value_name='Group')
    #     ID  Name   Region Group
    # 0  101  Axel  AsiaPac   GrZ
    # 1  102   Bob  AsiaPac   GrF
    # 2  101  Axel   Europe   GrB
    # 3  102   Bob   Europe   GrD
    # 4  101  Axel       US   GrA
    # 5  102   Bob       US   GrC
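To make the mapping from Qlik's `Crosstable(Region, Group, 2)` more explicit, here is a minimal self-contained sketch. The `2` in the Qlik statement is the number of leading qualifier fields, which corresponds to `id_vars` here. The names `qualifiers`, `regions`, and `long_df` are only illustrative, and passing `value_vars` is optional: I add it on the assumption that you want to control which columns are unpivoted and in what order, rather than rely on the column order of the frame.

```python
import pandas as pd

df1 = pd.DataFrame({
    'ID': [101, 102],
    'Name': ['Axel', 'Bob'],
    'US': ['GrA', 'GrC'],
    'Europe': ['GrB', 'GrD'],
    'AsiaPac': ['GrZ', 'GrF'],
})

# Columns kept as-is on every output row (the Qlik "qualifier" fields).
qualifiers = ['ID', 'Name']
# Columns to unpivot; melt emits them block by block in this order.
regions = ['US', 'Europe', 'AsiaPac']

long_df = pd.melt(
    df1,
    id_vars=qualifiers,
    value_vars=regions,
    var_name='Region',    # former column headers go here
    value_name='Group',   # former cell values go here
)
print(long_df)
```

With `value_vars` given, the result has one block of rows per region (all `US` rows, then all `Europe` rows, then all `AsiaPac` rows), so the row order no longer depends on how the columns of `df1` happen to be ordered.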

If you need the rows sorted as in your example output (grouped by ID and Name, then ordered by Group), you can append .sort_values() to the expression:

    pd.melt(df1, id_vars=['ID', 'Name'], var_name='Region', value_name='Group').sort_values(['ID', 'Group'])
    #     ID  Name   Region Group
    # 4  101  Axel       US   GrA
    # 2  101  Axel   Europe   GrB
    # 0  101  Axel  AsiaPac   GrZ
    # 5  102   Bob       US   GrC
    # 3  102   Bob   Europe   GrD
    # 1  102   Bob  AsiaPac   GrF
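One caveat: sorting on `Group` reproduces the example only because the group labels happen to sort in the same order as the regions within each ID. If you want each ID's rows to stay in region order (US, Europe, AsiaPac) regardless of the group values, one possible approach is a stable sort on `ID` alone. This is a sketch under that assumption, not the only way to do it. The explicit `value_vars` list and the `reset_index(drop=True)` call are my additions.

```python
import pandas as pd

df1 = pd.DataFrame({
    'ID': [101, 102],
    'Name': ['Axel', 'Bob'],
    'US': ['GrA', 'GrC'],
    'Europe': ['GrB', 'GrD'],
    'AsiaPac': ['GrZ', 'GrF'],
})

long_df = (
    pd.melt(
        df1,
        id_vars=['ID', 'Name'],
        value_vars=['US', 'Europe', 'AsiaPac'],  # desired region order
        var_name='Region',
        value_name='Group',
    )
    # mergesort is a stable algorithm: rows with the same ID keep the
    # relative order that melt produced, i.e. the region order above.
    .sort_values('ID', kind='mergesort')
    # Replace the shuffled index with a clean 0..n-1 range.
    .reset_index(drop=True)
)
print(long_df)
```

The result has the same rows, in the same order, as the `df2` in the question.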