How can I remove newline characters within a specific column of a CSV file?

Question


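I have a CSV file with more than 150 columns that uses the newline character as the record separator. The problem is that one of the columns contains embedded newlines, so I would like to remove them.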

Input:

001|Baker St.
London|3|4|7
002|Penny Lane
Liverpool|88|5|7


Output:

001|Baker St. London|3|4|7
002|Penny Lane Liverpool|88|5|7


Answer 1

Sté*_*las 7

With sed, you can merge the next line into the current one for as long as the current line does not contain 4 | characters:

<file sed -e :1 -e 's/|/|/4;t' -e 'N;s/\n/ /;b1'

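To see the loop at work, here is a minimal sketch that runs the same sed command on the sample input from the question. The function name join_records and the variable names are made up for illustration and are not part of the answer.

```shell
# Hypothetical wrapper around the sed one-liner shown above.
join_records() {
  # If the pattern space already holds a 4th "|", "t" ends the cycle and the
  # record is printed; otherwise "N" appends the next line, the embedded
  # newline becomes a space, and "b1" jumps back to the :1 label to retry.
  sed -e :1 -e 's/|/|/4;t' -e 'N;s/\n/ /;b1'
}

# Sample input copied from the question.
sample=$(printf '%s\n' '001|Baker St.' 'London|3|4|7' '002|Penny Lane' 'Liverpool|88|5|7')
result=$(printf '%s\n' "$sample" | join_records)
printf '%s\n' "$result"
```

On the sample data this should print the two joined records requested in the question.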

Some sed implementations support -i or -i '' to edit files in place (-i.back to save the original with a .back extension), so with those implementations you can do:

sed -i -e :1 -e 's/|/|/4;t' -e 'N;s/\n/ /;b1' ./*.csv


to edit all the non-hidden csv files in the current directory.
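
Where sed has no -i option, one possible workaround is to write the result to a temporary file and copy it back. This is only a sketch: the helper name fix_csv_files and the scratch directory are hypothetical, and it assumes the mktemp utility is available, which is common but not guaranteed everywhere.

```shell
# Hypothetical helper: rewrite each named file through a temporary copy.
fix_csv_files() {
  for f in "$@"; do
    tmp=$(mktemp) || return 1
    # Only overwrite the original if sed exited successfully.
    if sed -e :1 -e 's/|/|/4;t' -e 'N;s/\n/ /;b1' <"$f" >"$tmp"; then
      cat "$tmp" >"$f"   # copy the content back, keeping the original file's permissions
    fi
    rm -f "$tmp"
  done
}

# Demo in a scratch directory (illustrative paths only).
workdir=$(mktemp -d)
printf '%s\n' '001|Baker St.' 'London|3|4|7' >"$workdir/a.csv"
fix_csv_files "$workdir"/*.csv
cat "$workdir/a.csv"
```

Copying the content back with cat, rather than moving the temporary file over the original, keeps the original file's ownership and permissions, at the cost of not being an atomic replacement.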

The same with comments:

<file sed '
   :1
     s/|/|/4; # replace the 4th | with itself. Only useful when combined with
              # the next "t" command which branches off if the previous
              # substitution was successful
     t
     # we only reach this point if "t" above did not branch off, that is
     # if the pattern space does not contain 4 "|"s
     N; # append the next line to the pattern space
     s/\n/ /; # replace the newline with a space

   # and then loop again in case the pattern space still does not contain
   # 4 "|"s:
   b1'

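The 4 in the commands above matches the five-column sample. For a file with a different number of columns, such as the 150+ column file mentioned in the question, the count would presumably be the number of columns minus one. A sketch under that assumption, with a hypothetical helper name and a made-up three-column input:

```shell
# Hypothetical helper: $1 is the expected number of columns per record (>= 2).
join_records_n() {
  nsep=$(( $1 - 1 ))   # a record with N columns contains N-1 separators
  sed -e :1 -e "s/|/|/${nsep};t" -e 'N;s/\n/ /;b1'
}

# Made-up three-column data: the first record is split across two lines.
three_cols=$(printf '%s\n' 'a|b' 'c|d' 'e|f|g' | join_records_n 3)
printf '%s\n' "$three_cols"
```

Note that counting separators cannot detect a newline embedded in the last column, because the line already contains every separator at that point.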

Answer 2

Rom*_*est 3

Relying on the format of the first field (assuming each record should start with a number):

awk 'NR == 1{ printf "%s", $0; next }
     { printf "%s%s", (/^[0-9]+/? ORS : ""), $0 }
     END{ print "" }' file.csv


Output (note that the two parts of each record are joined without a space):

001|Baker St.London|3|4|7
002|Penny LaneLiverpool|88|5|7

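If the space shown in the question's expected output is wanted, a small variation on the same idea could emit a space instead of an empty string before continuation lines. This is a sketch rather than the answerer's code: the function name is hypothetical, and it additionally assumes that a new record starts with digits immediately followed by |, which is stricter than the original test.

```shell
# Hypothetical variant of the awk approach that joins with a space.
join_on_leading_number() {
  awk 'NR > 1 { printf "%s", (/^[0-9]+[|]/ ? ORS : " ") }  # new record: newline; continuation: space
       { printf "%s", $0 }                                 # print the line without a trailing newline
       END { if (NR) print "" }                            # terminate the last record
  '
}

joined=$(printf '%s\n' '001|Baker St.' 'London|3|4|7' '002|Penny Lane' 'Liverpool|88|5|7' | join_on_leading_number)
printf '%s\n' "$joined"
```

As with the original, this relies on the data: a continuation line that happens to start with digits followed by | would be treated as a new record.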
