I have a txt document with the following structure:
1:0.84722,0.52855;0.65268,0.24792;0.66525,0.46562
2:0.84722,0.52855;0.65231,0.24513;0.66482,0.46548
3:0.84722,0.52855;0.65197,0.24387;0.66467,0.46537
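The first number, the one followed by a colon, is an index, and I don't know how to indicate that when opening the file. In fact, I would like to drop it. The rest of the data is separated by commas and semicolons, and I want each number in its own column, regardless of whether the separator is a comma or a semicolon. How can I do that?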
Load the file with pd.read_csv as follows:
import pandas as pd

df = pd.read_csv(
    "data.csv",            # path to your file; a .txt extension works too
    sep="[,;:]",           # regular expression: split on comma, semicolon, or colon
    engine="python",       # needed to use a regular expression as the separator
    usecols=range(1, 7),   # keep columns 1 to 6, dropping column 0 (the index before the colon)
    header=None,           # the file has no header row
)
print(df)
Output
         1        2        3        4        5        6
0  0.84722  0.52855  0.65268  0.24792  0.66525  0.46562
1  0.84722  0.52855  0.65231  0.24513  0.66482  0.46548
2  0.84722  0.52855  0.65197  0.24387  0.66467  0.46537
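To see why a single regular expression handles all three delimiters, here is a small sketch that runs on the sample rows from the question held in memory, so no file is needed. The names `raw` and `tokens` are only illustrative, and `io.StringIO` stands in for the real file.

```python
import io
import re

import pandas as pd

# The three sample rows from the question, kept in memory instead of a file.
raw = (
    "1:0.84722,0.52855;0.65268,0.24792;0.66525,0.46562\n"
    "2:0.84722,0.52855;0.65231,0.24513;0.66482,0.46548\n"
    "3:0.84722,0.52855;0.65197,0.24387;0.66467,0.46537\n"
)

# The character class [,;:] matches any one of the three delimiters,
# so every line splits into 7 tokens: the index plus 6 values.
tokens = re.split("[,;:]", raw.splitlines()[0])

# pandas applies the same pattern; usecols skips token 0 (the index).
df = pd.read_csv(
    io.StringIO(raw),
    sep="[,;:]",
    engine="python",
    usecols=range(1, 7),
    header=None,
)
print(tokens)
print(df.shape)
```

Because the colon is just one more delimiter, the index ends up in column 0, which is why selecting columns 1 to 6 is enough to drop it.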
Note
After loading the file, I suggest saving it (using to_csv) as a proper CSV file.
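One possible way to do that save step is sketched below. It assumes you want neither the row index nor a header in the output; the file name `data_clean.csv` and the temporary directory are placeholders chosen for this example, not something from the question.

```python
import io
import os
import tempfile

import pandas as pd

# Sample rows from the question, parsed the same way as in the answer.
raw = (
    "1:0.84722,0.52855;0.65268,0.24792;0.66525,0.46562\n"
    "2:0.84722,0.52855;0.65231,0.24513;0.66482,0.46548\n"
    "3:0.84722,0.52855;0.65197,0.24387;0.66467,0.46537\n"
)
df = pd.read_csv(
    io.StringIO(raw), sep="[,;:]", engine="python",
    usecols=range(1, 7), header=None,
)

# Hypothetical output location; replace with your own path.
out_path = os.path.join(tempfile.mkdtemp(), "data_clean.csv")

# index=False and header=False write only the six value columns,
# separated by plain commas.
df.to_csv(out_path, index=False, header=False)

# The cleaned file now loads with the default separator and engine.
clean = pd.read_csv(out_path, header=None)
print(clean.shape)
```

After this, later loads no longer need the regular-expression separator or `engine="python"`.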