Transposing the data in a column every n rows in pandas

Mol*_*hao 5 python transpose reshape dataframe pandas

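For a research project, I need to process each person's information from a website into an Excel file. I copied everything I need from the website and pasted it into a single column of an Excel file, then loaded that file with pandas. However, I need each person's information laid out horizontally rather than vertically as it is now. For example, this is what I have right now: a single column of unorganized data.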

import pandas as pd

df = pd.read_csv("ior work.csv", encoding="ISO-8859-1")

Data:

0 Andrew
1 School of Music
2 Music: Sound of the wind
3 Dr. Seuss
4 Dr.Sass
5 Michelle
6 School of Theatrics
7 Music: Voice
8 Dr. A
9 Dr. B

I want to transpose every 5 rows so that the data is organized in the following format; the labels below are the column labels.

Name School Music Mentor1 Mentor2

What is the most efficient way to do this?

jez*_*ael 7

If no data is missing, you can use numpy.reshape:

import numpy as np
print (np.reshape(df.values,(2,5)))
[['Andrew' 'School of Music' 'Music: Sound of the wind' 'Dr. Seuss'
  'Dr.Sass']
 ['Michelle' 'School of Theatrics' 'Music: Voice' 'Dr. A' 'Dr. B']]

print (pd.DataFrame(np.reshape(df.values,(2,5)), 
                    columns=['Name','School','Music','Mentor1','Mentor2']))
       Name               School                     Music    Mentor1  Mentor2
0    Andrew      School of Music  Music: Sound of the wind  Dr. Seuss  Dr.Sass
1  Michelle  School of Theatrics              Music: Voice      Dr. A    Dr. B
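
For anyone who wants to try this without the original CSV, here is a minimal, self-contained sketch of the same reshape. It rebuilds the ten sample rows from the question in memory; the column name "raw" is an assumption made for illustration and does not come from the original file.

```python
import numpy as np
import pandas as pd

# Sample data copied from the question: one column, five rows per person.
raw = [
    "Andrew", "School of Music", "Music: Sound of the wind", "Dr. Seuss", "Dr.Sass",
    "Michelle", "School of Theatrics", "Music: Voice", "Dr. A", "Dr. B",
]
df = pd.DataFrame({"raw": raw})  # "raw" is a hypothetical column name

columns = ["Name", "School", "Music", "Mentor1", "Mentor2"]

# df.values has shape (10, 1); reshaping to (2, 5) fills each output row
# with five consecutive input rows (row-major order).
wide = pd.DataFrame(np.reshape(df.values, (2, 5)), columns=columns)
print(wide)
```

Because the reshape fills row by row, each group of five consecutive entries becomes one record, which is why it only works when every person has exactly five rows.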

A more general solution derives the number of rows by dividing the length of the array (shape[0]) by the number of columns:

print (pd.DataFrame(np.reshape(df.values,(df.shape[0] // 5,5)), 
                    columns=['Name','School','Music','Mentor1','Mentor2']))
       Name               School                     Music    Mentor1  Mentor2
0    Andrew      School of Music  Music: Sound of the wind  Dr. Seuss  Dr.Sass
1  Michelle  School of Theatrics              Music: Voice      Dr. A    Dr. B
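
One thing to watch on Python 3: the row count has to be an integer, so the division should be floor division (//), since np.reshape rejects a float dimension. The sketch below wraps the general approach in a small helper that also checks the "no missing data" condition up front. The helper name rows_to_records and the column name "raw" are made up for this example; they are not part of pandas or of the original answer.

```python
import numpy as np
import pandas as pd


def rows_to_records(df, columns):
    """Reshape a single-column frame into rows that are len(columns) wide.

    Hypothetical helper for illustration only.
    """
    n_cols = len(columns)
    # divmod gives the integer row count and any leftover rows in one step.
    n_rows, remainder = divmod(df.shape[0], n_cols)
    if remainder:
        raise ValueError(
            f"{df.shape[0]} rows cannot be split evenly into groups of {n_cols}"
        )
    return pd.DataFrame(np.reshape(df.values, (n_rows, n_cols)), columns=columns)


columns = ["Name", "School", "Music", "Mentor1", "Mentor2"]
df = pd.DataFrame({"raw": [
    "Andrew", "School of Music", "Music: Sound of the wind", "Dr. Seuss", "Dr.Sass",
    "Michelle", "School of Theatrics", "Music: Voice", "Dr. A", "Dr. B",
]})

wide = rows_to_records(df, columns)
print(wide)

# Dropping one row simulates a person with a missing field.
try:
    rows_to_records(df.iloc[:-1], columns)
    incomplete_ok = True
except ValueError as exc:
    incomplete_ok = False
    print("Refused to reshape:", exc)
```

Failing loudly here is a deliberate choice: if a single entry is missing, every later record would be shifted by one field, which is easy to overlook in the output.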

Thanks to piRSquared for another solution:

print (pd.DataFrame(df.values.reshape(-1, 5), 
                    columns=['Name','School','Music','Mentor1','Mentor2']))
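
To illustrate what the -1 does: it asks NumPy to infer that dimension from the total number of elements, so the row count never has to be computed by hand. A small sketch, assuming three made-up people with placeholder values (the "raw" column name and the generated strings are invented for this example); it uses to_numpy(), which returns the same array as .values here:

```python
import pandas as pd

columns = ["Name", "School", "Music", "Mentor1", "Mentor2"]

# Fifteen placeholder entries: five consecutive rows for each of three people.
df = pd.DataFrame(
    {"raw": [f"person{i}-{col}" for i in range(3) for col in columns]}
)

# -1 lets NumPy work out the number of rows: 15 elements / 5 columns = 3 rows.
wide = pd.DataFrame(df.to_numpy().reshape(-1, len(columns)), columns=columns)
print(wide.shape)
```

If the number of entries is not a multiple of 5, reshape(-1, 5) raises a ValueError instead of silently misaligning the records.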

  • pd.DataFrame(df.values.reshape(-1,5), columns=['Name','School','Music','Mentor1','Mentor2']) (3 upvotes)