Reading data - csv

Question

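I have some data in a .dfx file and I am trying to read it as a csv with pandas. But it contains some special characters that pandas cannot read. They are also the separators. I have attached one line.

When the file is printed, the "DC4" character is dropped, and the "SI" character is read as a space. I tried several encodings (utf-8, latin1, etc.), but without success. I have also attached the first line as printed, with the positions of the characters marked.

My code is very simple: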

import pandas

file_log = pandas.read_csv("file_log.DFX", header=None)

print(file_log)

I hope I was clear and that someone has an idea. Thanks in advance!

Edit:

Input. Link: drive.google.com/open?id=0BxMDhep-LHOIVGcybmsya2JVM28

Expected output:

88.4373 0 12.07.2014/17:05:22 38.0366  38.5179 1.3448 31.9839
30.0070 0 12.07.2014/17:14:27 38.0084  38.5091 0.0056 0.0033

Answer 1

Inspecting example.DFX in hex (with xxd) shows that the two separators are 0x14 and 0x0f, respectively.

Use the python engine to read a csv with multiple separators:
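If xxd is not at hand, a similar check can be sketched in Python. This is only an illustration: the helper name `control_byte_counts` is made up here, and the sample bytes are invented to imitate the kind of line described in the question, not taken from the real file.

```python
from collections import Counter


def control_byte_counts(data: bytes) -> Counter:
    """Count control bytes (below 0x20) other than tab, LF and CR."""
    ordinary = {0x09, 0x0A, 0x0D}
    return Counter(b for b in data if b < 0x20 and b not in ordinary)


# Invented sample line; the real file layout is an assumption.
sample = b"88.4373\x140\x0f12.07.2014/17:05:22\x1438.0366\n"

for byte, count in sorted(control_byte_counts(sample).items()):
    print(hex(byte), count)
```

For the real file, the bytes could be read with `open("example.DFX", "rb").read()` and passed to the same helper.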

import pandas

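sep1 = chr(0x14) # the character displayed as DC4
sep2 = chr(0x0f) # the character displayed as SI
# a separator longer than one character is treated as a regex, which needs the python engine
file_log = pandas.read_csv('example.DFX', header=None, sep='{}|{}'.format(sep1, sep2), engine='python')

print(file_log)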

You get:

         0  1                    2        3        4       5        6   7
0  88.4373  0  12.07.2014/17:05:22  38.0366  38.5179  1.3448  31.9839 NaN
1  30.0070  0  12.07.2014/17:14:27  38.0084  38.5091  0.0056   0.0033 NaN

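There seems to be an empty column at the end, probably because each line ends with a separator. But I'm sure you can fix that.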
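One possible way to drop that trailing column is `dropna(axis=1, how="all")`, which removes columns that contain only NaN. The sketch below uses invented in-memory data with a trailing 0x14 on each line. That trailing separator is an assumption about the real file, made because it would explain the empty column.

```python
import io

import pandas

# Invented data imitating the file; each line ends with a separator (assumption).
raw = (
    "88.4373\x140\x0f12.07.2014/17:05:22\x1438.0366\x14\n"
    "30.0070\x140\x0f12.07.2014/17:14:27\x1438.0084\x14\n"
)

# A character class matches either separator; a regex sep needs the python engine.
file_log = pandas.read_csv(io.StringIO(raw), header=None, sep="[\x14\x0f]", engine="python")

# Remove columns in which every value is NaN.
file_log = file_log.dropna(axis=1, how="all")

print(file_log)
```

Passing `usecols=range(7)` to `read_csv` might be an alternative when the number of real columns is known in advance.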