Ome*_*erB 10 python dataframe pandas
Suppose I have two DataFrames:
>> df1
0 1 2
0 a b c
1 d e f
>> df2
0 1 2
0 A B C
1 D E F
How can I interleave the rows? That is, get this:
>> interleaved_df
0 1 2
0 a b c
1 A B C
2 d e f
3 D E F
(Note that my real DataFrames have identical columns but different numbers of rows.)
import pandas as pd
from itertools import chain, zip_longest
df1 = pd.DataFrame([['a','b','c'], ['d','e','f']])
df2 = pd.DataFrame([['A','B','C'], ['D','E','F']])
concat_df = pd.concat([df1,df2])
new_index = chain.from_iterable(zip_longest(df1.index, df2.index))
# new_index now holds the interleaved row indices
interleaved_df = concat_df.reindex(new_index)
ValueError: cannot reindex from a duplicate axis
The last call fails because df1 and df2 share some index values (which is also the case for my real DataFrames).
Any ideas?
Flo*_*oor 11
You can sort by the index after concatenating and then reset the index, i.e.:
import pandas as pd
df1 = pd.DataFrame([['a','b','c'], ['d','e','f']])
df2 = pd.DataFrame([['A','B','C'], ['D','E','F']])
concat_df = pd.concat([df1,df2]).sort_index().reset_index(drop=True)
Output:
0 1 2
0 a b c
1 A B C
2 d e f
3 D E F
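The question mentions frames with different row counts, and `sort_index` does not promise a stable sort by default, so on larger inputs rows sharing an index label are not guaranteed to keep their df1-before-df2 order. A minimal sketch of the same idea with a stable sort requested explicitly; it assumes both frames carry a default integer index, and the third row of df2 is made up for illustration:

```python
import pandas as pd

df1 = pd.DataFrame([['a', 'b', 'c'], ['d', 'e', 'f']])
# Hypothetical extra row, added only to show unequal lengths
df2 = pd.DataFrame([['A', 'B', 'C'], ['D', 'E', 'F'], ['G', 'H', 'I']])

interleaved_df = (
    pd.concat([df1, df2])
    # mergesort is stable: among equal labels, df1 rows stay ahead of df2 rows
    .sort_index(kind='mergesort')
    .reset_index(drop=True)
)
print(interleaved_df)
```

The leftover rows of the longer frame simply end up at the bottom.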
Edit (OmerB): to preserve the row order regardless of the index values:
import pandas as pd
df1 = pd.DataFrame([['a','b','c'], ['d','e','f']]).reset_index()
df2 = pd.DataFrame([['A','B','C'], ['D','E','F']]).reset_index()
concat_df = pd.concat([df1,df2]).sort_index().set_index('index')
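To see what this variant buys, it may help to try it on frames whose index labels are arbitrary. A rough sketch, assuming the index is unnamed (so `reset_index()` produces a column called `index`); the labels 10, 20 and 5 are invented for the example, and the stable sort kind is an addition not present in the original snippet:

```python
import pandas as pd

# Invented, overlapping, unsorted index labels
df1 = pd.DataFrame([['a', 'b', 'c'], ['d', 'e', 'f']], index=[10, 20])
df2 = pd.DataFrame([['A', 'B', 'C'], ['D', 'E', 'F']], index=[20, 5])

interleaved_df = (
    # reset_index() moves the old labels into an 'index' column and
    # gives each frame a fresh 0..n-1 positional index
    pd.concat([df1.reset_index(), df2.reset_index()])
    # sort on the positional index; stable, so df1 comes first on ties
    .sort_index(kind='mergesort')
    # restore the original labels as the index
    .set_index('index')
)
print(interleaved_df)
```

The rows come out in positional order (a, A, d, D) while each row keeps its original label.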
Using toolz.interleave:
In [1024]: from toolz import interleave
In [1025]: pd.DataFrame(interleave([df1.values, df2.values]))
Out[1025]:
0 1 2
0 a b c
1 A B C
2 d e f
3 D E F
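If toolz is not installed, the `zip_longest` idea from the question can probably be rescued by interleaving integer row positions instead of index labels, which sidesteps the duplicate-axis error. This is only a sketch under that assumption; the extra row in df2 is hypothetical and there to show the unequal-length case:

```python
from itertools import chain, zip_longest

import pandas as pd

df1 = pd.DataFrame([['a', 'b', 'c'], ['d', 'e', 'f']])
# Hypothetical extra row, added only to show unequal lengths
df2 = pd.DataFrame([['A', 'B', 'C'], ['D', 'E', 'F'], ['G', 'H', 'I']])

combined = pd.concat([df1, df2])
n1 = len(df1)

# Row positions inside `combined`: df1 occupies 0..n1-1, df2 follows it
pairs = zip_longest(range(n1), range(n1, n1 + len(df2)))
# zip_longest pads the shorter side with None; drop the padding
order = [pos for pos in chain.from_iterable(pairs) if pos is not None]

# iloc selects by position, so duplicate index labels are not a problem
interleaved_df = combined.iloc[order].reset_index(drop=True)
print(interleaved_df)
```

Unlike building a new frame from `.values`, this keeps the column labels and dtypes of the concatenated frame.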