Pandas read_table: use the first column as the index

Question

I have a small problem. I have a txt file whose lines are table rows (here is row 1, for example):

id1-a1-b1-c1

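I want to load it into a dataframe with pandas, where the index is the id, the column names are 'A', 'B', 'C', and the values are the corresponding ai, bi, ci.

In the end I want the dataframe to look like this: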

    'A'   'B'  'C'
id1  a1    b1   c1
id2  a2    b2   c2
...   ...   ...  ...

The file is large, so I may want to read it in chunks, but let's assume I read it all at once:

import pandas as pd

with open('file.txt') as f:
    table = pd.read_table(f, sep='-', index_col=0, header=None, lineterminator='\n')

and then rename the columns:

table.columns = ['A','B','C']

My current output looks like this:

    'A'   'B'  'C'
0
id1  a1    b1   c1
id2  a2    b2   c2
...   ...   ...  ...

There is an extra row (the one containing 0) that I can't explain.

Thanks

Edit

When I try to add the argument

chunksize=20

and then run:

for chunk in table:
    print(chunk)

I get the following error:

pandas.parser.CParserError: Error tokenizing data. C error: Calling read(nbytes) on source failed. Try engine='python'.
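
One possible reading of this error, offered as an assumption rather than a confirmed diagnosis: with chunksize, read_table returns a lazy reader, so if the loop runs after the with block has already closed the file, the parser has nothing left to read from. A minimal sketch that keeps the loop inside the with block; the temporary file and its 50 generated rows are hypothetical stand-ins for file.txt:

```python
import os
import tempfile

import pandas as pd

# Hypothetical data written to a temporary file standing in for file.txt
rows = "".join(f"id{i}-a{i}-b{i}-c{i}\n" for i in range(1, 51))
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write(rows)
    path = tmp.name

chunks = []
with open(path) as f:
    reader = pd.read_table(f, sep="-", index_col=0, header=None, chunksize=20)
    # Iterate while the file handle is still open
    for chunk in reader:
        chunk.columns = ["A", "B", "C"]
        chunk.index.name = None
        chunks.append(chunk)

os.remove(path)  # clean up the temporary file

# Optional: stitch the chunks back into one dataframe
table = pd.concat(chunks)
print(len(chunks), table.shape)
```

Passing the file path to read_table instead of an open handle should also work, since pandas then opens the file itself.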

Answer 1

Bry*_*yan 14

If you know the column names before reading the file, pass them as a list using the names parameter of read_table:

with open('file.txt') as f:
    table = pd.read_table(f, sep='-', index_col=0, header=None, names=['A','B','C'],
                          lineterminator='\n')

Which outputs:

      A   B   C
id1  a1  b1  c1
id2  a2  b2  c2
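
As for the extra row in the question, it appears to be the label of the index rather than a data row. With header=None, pandas numbers the columns 0, 1, 2, 3, and the column used as the index keeps its label 0, which is printed on a line of its own. A minimal sketch, assuming a hypothetical two-line sample in place of file.txt, that shows the label and then clears it:

```python
import io

import pandas as pd

# Hypothetical sample standing in for the contents of file.txt
sample = "id1-a1-b1-c1\nid2-a2-b2-c2\n"

# No names given: the columns are numbered and the index keeps the label 0
table = pd.read_table(io.StringIO(sample), sep="-", index_col=0, header=None)
index_label = table.index.name
print(index_label)  # the label that shows up as the extra row

table.columns = ["A", "B", "C"]
table.index.name = None  # clear the label so no extra row is printed
print(table)
```

This is an alternative to passing names up front, for cases where the columns are renamed after loading.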
