Python csv: newline character inside a field

Question

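I'm having trouble reading a thorn-delimited (þ) csv file that I think has a newline character in one of its fields. The newline forces the row onto two lines, so I can't read the value in the last field of that row. I've tried opening the file in universal newline mode, but I'm not sure what the best approach is.

This is how I'm trying to read the file in Python: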

import csv

# Python 2 code as posted; the thorn character (0xFE) is the field delimiter
csv.register_dialect('BB', delimiter='\xfe')
with open(file, 'rU') as file_in:
    log = csv.reader(file_in, dialect='BB')
    for row in log:
        print row

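This works for most of the file, but I think one row has a newline character in one of its fields, and I'm not sure how best to diagnose it. Here is a screenshot of how that row looks in Notepad; you can see that the newline forces the row to split across two lines.

Reading this row with csv.reader gives: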

['06-13-2015-10:13:41', '0', '', '', '', '', '', '', '', '', '', '', ' 142', '', '5', '7.0', '2', '', 'cmhkl966', 'amex_674', '1', '0.00', '', '', "' "]

That is, the row is cut off at the first apostrophe.

Answer 1

hir*_*ist 0

I simplified your problem a bit (hopefully I caught the cause of the problem):

import io
import csv

# The backslash after ''' avoids a leading empty line (and an empty first row).
file_in = io.StringIO('''\
aþbþ'hello
world'
''')

# quotechar="'" makes the reader treat the apostrophe-quoted text,
# including its newline, as a single field.
log = csv.reader(file_in, delimiter='\xfe', quotechar="'")
for row in log:
    print(row)

Output:

['a', 'b', 'hello\nworld']
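
To make the cause explicit, here is a minimal sketch that parses the same sample twice: once with a dialect like the one in the question (no quotechar, so the default `"` applies and apostrophes are ordinary characters), and once with `quotechar="'"` added. The name `BB_quoted` and the helper `read_rows` are made up for this illustration, and the sketch assumes the apostrophes in the real file really do act as quote characters.

```python
import csv
import io

DELIM = '\xfe'  # thorn (þ)

# Same shape as the simplified sample above: a newline inside an
# apostrophe-quoted field.
SAMPLE = DELIM.join(['a', 'b', "'hello\nworld'"]) + '\n'

# 'BB' mirrors the dialect from the question.
csv.register_dialect('BB', delimiter=DELIM)
# 'BB_quoted' (hypothetical name) is the same dialect plus quotechar.
csv.register_dialect('BB_quoted', delimiter=DELIM, quotechar="'")

def read_rows(text, dialect):
    """Parse text with the named dialect and return all rows as a list."""
    return list(csv.reader(io.StringIO(text), dialect=dialect))

rows_plain = read_rows(SAMPLE, 'BB')
rows_quoted = read_rows(SAMPLE, 'BB_quoted')

print(rows_plain)   # two rows: the record is split at the embedded newline
print(rows_quoted)  # one row: the newline stays inside the third field
```

With the plain dialect the record breaks in two at the embedded newline, which matches the truncated row shown in the question; with the quote character set, the newline is kept as part of the field.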

Update:

As requested in the comments, here is a version that reads from a .csv file. The contents of test.csv are:

aþbþ'hello
world'þc
dþeþ'hello
other
things'þf
gþhþiþj

And the Python code:

import csv
from pathlib import Path

HERE = Path(__file__).parent
DATA_PATH = HERE / '../data/test.csv'

# newline='' lets the csv module handle line endings itself (the 'rU' mode
# was removed in Python 3.11); encoding assumes test.csv is saved as UTF-8.
with DATA_PATH.open(newline='', encoding='utf-8') as file_in:
    log = csv.reader(file_in, delimiter='\xfe', quotechar="'")
    for row in log:
        print(row)

Its output:

['a', 'b', 'hello\nworld', 'c']
['d', 'e', 'hello\nother\nthings', 'f']
['g', 'h', 'i', 'j']
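
On the diagnosis part of the question, one possible approach (a sketch, not something from the original post) is to parse the file with quoting switched off and flag every physical line whose field count differs from what you expect. The helper name `find_odd_rows` and its `expected_fields` parameter are hypothetical, and you need to know how many columns a well-formed row has.

```python
import csv

DELIM = '\xfe'  # thorn (þ)

def find_odd_rows(lines, expected_fields, delimiter=DELIM):
    """Return (line_number, field_count, row) for each physical line whose
    field count differs from expected_fields.

    `lines` is any iterable of strings, e.g. an open text file.
    """
    # QUOTE_NONE disables quote handling, so every physical line is one row
    # and a record broken by an embedded newline shows up as short rows.
    reader = csv.reader(lines, delimiter=delimiter, quoting=csv.QUOTE_NONE)
    odd = []
    for row in reader:
        # Blank lines come back as [] and are flagged too.
        if len(row) != expected_fields:
            odd.append((reader.line_num, len(row), row))
    return odd

# Made-up sample: the second record is split across lines 2 and 3.
sample_lines = [
    DELIM.join(['a', 'b', 'c', 'd']) + '\n',
    DELIM.join(['e', 'f', "'hello"]) + '\n',
    DELIM.join(["world'", 'g']) + '\n',
    DELIM.join(['h', 'i', 'j', 'k']) + '\n',
]

odd_rows = find_odd_rows(sample_lines, expected_fields=4)
for line_number, count, row in odd_rows:
    print(line_number, count, row)
```

Consecutive flagged lines whose field counts add up to roughly one full record are a strong hint that a field contains a newline, and looking at where the first fragment ends shows which character, if any, is acting as the quote.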

Archived:	10 years, 6 months ago
Views:	2158
Last activity:	10 years, 6 months ago