csv writer puts all output in one column

mar*_*ins 0 python csv python-3.x

I parsed some txt files and obtained the following list:

price = ['S-1', '20040319', '\t\t\t\tDIGIRAD CORP', '\t\t0000707388', 'price to be between $and $per ', 'S-1', '20040408', '\t\t\t\tBUCYRUS INTERNATIONAL INC', '\t\t0000740761', 'S-1', '20041027', '\t\t\t\tBUCYRUS INTERNATIONAL INC', '\t\t0000740761', 'S-1', '20050630', '\t\t\t\tSEALY CORP', '\t\t0000748015', 'S-1', '20140512', '\t\t\t\tCITIZENS FINANCIAL GROUP INC/RI', '\t\t0000759944', 'initial public offering and no public market exists for our shares. We anticipate that the initial public offering price will be between $and', 'S-1', '20110523', '\t\t\t\tCeres, Inc.', '\t\t0000767884', '    aggregate capital expenditures will be between $0.3&#160;million', 'S-1', '20171023', '\t\t\t\tBLUEGREEN VACATIONS CORP', '\t\t0000778946', '        <div style="margin-top:14pt; text-align:justify; line-height:12pt;">This is the initial public offering of Bluegreen Vacations Corporation. We are offering &#8194;&#8194; shares of our common stock and the selling shareholder identified in this prospectus is offering &#8194;&#8194; shares of our common stock. We will not receive any of the proceeds from the sale of shares by the selling shareholder. We anticipate that the initial public offering price of our common stock will be between $&#8199;&#8199; and $&#8199;&#8199; per ', 'S-1', '20020813', '\t\t\t\tVISTACARE INC', '\t\t0000787030']

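The output I want is a csv file in which each row starts with an 'S-1' document (each one corresponds to a different company). So I built a second list that splits the list above into sublists, each starting at an 'S-1' element: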

import re

price2 = [s.strip('|').split('|') for s in re.split(r'(?=S-1)', '|'.join(price)) if s]
print(price2)
[['S-1', '20040319', '\t\t\t\tDIGIRAD CORP', '\t\t0000707388', 'price to be between $and $per '], ['S-1', '20040408', '\t\t\t\tBUCYRUS INTERNATIONAL INC', '\t\t0000740761'], ['S-1', '20041027', '\t\t\t\tBUCYRUS INTERNATIONAL INC', '\t\t0000740761'], ['S-1', '20050630', '\t\t\t\tSEALY CORP', '\t\t0000748015'], ['S-1', '20140512', '\t\t\t\tCITIZENS FINANCIAL GROUP INC/RI', '\t\t0000759944', 'initial public offering and no public market exists for our shares. We anticipate that the initial public offering price will be between $and'], ['S-1', '20110523', '\t\t\t\tCeres, Inc.', '\t\t0000767884', '    aggregate capital expenditures will be between $0.3&#160;million'], ['S-1', '20171023', '\t\t\t\tBLUEGREEN VACATIONS CORP', '\t\t0000778946', '        <div style="margin-top:14pt; text-align:justify; line-height:12pt;">This is the initial public offering of Bluegreen Vacations Corporation. We are offering &#8194;&#8194; shares of our common stock and the selling shareholder identified in this prospectus is offering &#8194;&#8194; shares of our common stock. We will not receive any of the proceeds from the sale of shares by the selling shareholder. We anticipate that the initial public offering price of our common stock will be between $&#8199;&#8199; and $&#8199;&#8199; per '], ['S-1', '20020813', '\t\t\t\tVISTACARE INC', '\t\t0000787030']]
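The join-and-split approach above works for this data, but it would also split wherever the text 'S-1' or a '|' character happens to appear inside a field. As a minimal sketch of an alternative, assuming a new row should start only when an element is exactly equal to 'S-1', the list can be grouped directly. The function name `group_by_marker` and the shortened sample data are hypothetical and not part of the original post.

```python
def group_by_marker(items, marker="S-1"):
    """Split a flat list into sublists, each starting at an element equal to marker."""
    groups = []
    for item in items:
        # Start a new sublist at every marker (or at the very first element)
        if item == marker or not groups:
            groups.append([item])
        else:
            groups[-1].append(item)
    return groups


# Shortened, hypothetical sample in the same shape as the price list
sample = [
    "S-1", "20040319", "\t\t\t\tDIGIRAD CORP", "\t\t0000707388", "price to be between $and $per ",
    "S-1", "20040408", "\t\t\t\tBUCYRUS INTERNATIONAL INC", "\t\t0000740761",
]
rows = group_by_marker(sample)
print(rows)
```

Because elements are compared whole, a field such as 'Form S-1 filing' would stay inside its row instead of starting a new one.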

Then I write it to a csv file:

import csv

# newline='' is what the csv module documentation recommends when opening the output file
with open('pricerange.csv', 'w', newline='') as out_file:
    wr = csv.writer(out_file)
    wr.writerow(["file_form", "filedate", "coname", "cik", "price_range"])  # header row
    wr.writerows(price2)

The output looks fine: each sublist is placed on a new row (i.e. every row starts with the 'S-1' element).

To clean the list up further, I also want to remove special characters (e.g. '&#8194'). So I created a new list, price3:

price3 = re.sub('<.*?>|&([a-z0-9]+|#[0-9]{1,6}|#x[0-9a-f]{1,6});', '', str(price2))  # remove special characters or html tags in original .txt files
print(price3)
[['S-1', '20040319', '\t\t\t\tDIGIRAD CORP', '\t\t0000707388', 'price to be between $and $per '], ['S-1', '20040408', '\t\t\t\tBUCYRUS INTERNATIONAL INC', '\t\t0000740761'], ['S-1', '20041027', '\t\t\t\tBUCYRUS INTERNATIONAL INC', '\t\t0000740761'], ['S-1', '20050630', '\t\t\t\tSEALY CORP', '\t\t0000748015'], ['S-1', '20140512', '\t\t\t\tCITIZENS FINANCIAL GROUP INC/RI', '\t\t0000759944', 'initial public offering and no public market exists for our shares. We anticipate that the initial public offering price will be between $and'], ['S-1', '20110523', '\t\t\t\tCeres, Inc.', '\t\t0000767884', '    aggregate capital expenditures will be between $0.3million'], ['S-1', '20171023', '\t\t\t\tBLUEGREEN VACATIONS CORP', '\t\t0000778946', '        This is the initial public offering of Bluegreen Vacations Corporation. We are offering  shares of our common stock and the selling shareholder identified in this prospectus is offering  shares of our common stock. We will not receive any of the proceeds from the sale of shares by the selling shareholder. We anticipate that the initial public offering price of our common stock will be between $ and $ per '], ['S-1', '20020813', '\t\t\t\tVISTACARE INC', '\t\t0000787030']]

To my surprise, when I use the same code to write price3 to the csv file, all the elements end up in the first column.

Any suggestions? I can't see where the mistake is... Many thanks.

Mas*_*fox 5

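There is no error. Depending on the regional settings, Excel uses ';' rather than ',' as its default delimiter, so in your example it puts all the values into the first column. To view the csv correctly, you either have to change Excel's delimiter setting from ';' to ',' or save the csv file with ';' as the delimiter, like this: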

import csv

with open('pricerange.csv', 'w') as out_file:
    wr = csv.writer(out_file, delimiter=";")  # semicolon-separated output
    wr.writerow(["file_form", "filedate", "coname", "cik", "price_range"])  # header row
    wr.writerows(price2)
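To see what the `delimiter` argument actually changes, here is a small sketch that writes the same rows with both delimiters into in-memory buffers rather than files. The helper name `to_csv_text` and the two sample rows are made up for illustration.

```python
import csv
import io


def to_csv_text(rows, delimiter=","):
    """Render rows as CSV text using the given delimiter."""
    buf = io.StringIO()
    writer = csv.writer(buf, delimiter=delimiter)
    writer.writerows(rows)
    return buf.getvalue()


# Hypothetical sample rows in the same shape as the header and price2
rows = [
    ["file_form", "filedate", "coname", "cik"],
    ["S-1", "20110523", "Ceres, Inc.", "0000767884"],
]

comma_text = to_csv_text(rows)                     # default ',' delimiter
semicolon_text = to_csv_text(rows, delimiter=";")  # what the answer suggests

print(comma_text)
print(semicolon_text)
```

With the comma delimiter the writer quotes the field 'Ceres, Inc.' because it contains the delimiter; with ';' no quoting is needed for it.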
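One further point may be worth checking. It is an observation about the question's code, not something the answer above states: `re.sub` returns a single string, so `price3` built from `str(price2)` is a `str` rather than a list of lists, and `writerows` would then iterate over it character by character. A minimal sketch, using the question's pattern on a shortened copy of `price2`, that cleans each field separately so the row structure is preserved (the name `TAG_OR_ENTITY` is hypothetical):

```python
import csv
import io
import re

# Same pattern as in the question: HTML tags or character entities
TAG_OR_ENTITY = re.compile(r'<.*?>|&([a-z0-9]+|#[0-9]{1,6}|#x[0-9a-f]{1,6});')

# Shortened sample taken from the question's price2
price2 = [
    ['S-1', '20110523', '\t\t\t\tCeres, Inc.', '\t\t0000767884',
     '    aggregate capital expenditures will be between $0.3&#160;million'],
    ['S-1', '20020813', '\t\t\t\tVISTACARE INC', '\t\t0000787030'],
]

# Applying re.sub to str(price2) yields one long string, not a list of rows
as_string = TAG_OR_ENTITY.sub('', str(price2))

# Cleaning field by field keeps the list-of-lists shape that writerows expects
price3 = [[TAG_OR_ENTITY.sub('', field) for field in row] for row in price2]

buf = io.StringIO()
csv.writer(buf).writerows(price3)

print(type(as_string).__name__, type(price3).__name__)
print(buf.getvalue())
```

If the cleaned list is written this way, the delimiter advice above still applies when the file is opened in Excel.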