How do I remove newlines within a specific column of a CSV file?

Vic*_*cky 5 sed awk text-processing

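I have a CSV file with more than 150 columns that uses the newline character as the record separator. The problem is that one of the columns itself contains newline characters, so some records are split across lines. I want to remove those embedded newlines.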

Input:

001|Baker St.
London|3|4|7
002|Penny Lane
Liverpool|88|5|7

Output:

001|Baker St. London|3|4|7
002|Penny Lane Liverpool|88|5|7

Sté*_*las 7

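With sed, you can keep joining the next line onto the current one for as long as the current line does not contain 4 | characters (the sample has 5 fields per record, hence 4 delimiters; adjust the count for the real file):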

<file sed -e :1 -e 's/|/|/4;t' -e 'N;s/\n/ /;b1'

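Some sed implementations have a -i or -i '' option to edit files in place (-i.back to save the original with a .back extension), so with those implementations you can do: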

sed -i -e :1 -e 's/|/|/4;t' -e 'N;s/\n/ /;b1' ./*.csv

to edit all the non-hidden csv files in the current directory.

The same command, with comments:

<file sed '
   :1
     s/|/|/4; # replace the 4th | with itself. Only useful when combined with
              # the next "t" command which branches off if the previous
              # substitution was successful
     t
     # we only reach this point if "t" above did not branch off, that is
     # if the pattern space does not contain 4 "|"s
     N; # append the next line to the pattern space
     s/\n/ /; # replace the newline with a space

   # and then loop again in case the pattern space still does not contain
   # 4 "|"s:
   b1'
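The sed loop above counts delimiters until a record is complete. The same idea can be sketched in awk with the expected field count passed as a parameter, which may be easier to adapt to a file with 150+ columns. This is only a sketch: the function name join_records is hypothetical, and it assumes, like the sed version, that the embedded newlines never occur in the last column (otherwise a record looks complete before its final fragment arrives).

```shell
# join_records N  (hypothetical helper, not from the original answer)
# Reads stdin and prints one record per line, joining physical lines with a
# space until the pending record has at least N "|"-separated fields.
join_records() {
  awk -v n="$1" '
    {
      # Append this physical line to the pending record.
      rec = (rec == "" ? $0 : rec " " $0)
      # split() returns the number of "|"-separated fields in the record.
      if (split(rec, parts, "|") >= n) {
        print rec
        rec = ""
      }
    }
    # Flush a trailing record that never reached N fields.
    END { if (rec != "") print rec }
  '
}

# Usage with the sample data from the question (5 fields per record):
printf '%s\n' '001|Baker St.' 'London|3|4|7' '002|Penny Lane' 'Liverpool|88|5|7' |
  join_records 5
# 001|Baker St. London|3|4|7
# 002|Penny Lane Liverpool|88|5|7
```

For the real file, the argument would be the actual number of columns rather than 5.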


Rom*_*est 3

Relying on the format of the first field (assuming each record should start with a number):

awk 'NR == 1{ printf "%s", $0; next }
     { printf "%s%s", (/^[0-9]+/? ORS : ""), $0 }
     END{ print "" }' file.csv

Output:

001|Baker St.London|3|4|7
002|Penny LaneLiverpool|88|5|7
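Note that the output above joins the two fragments with no separator, whereas the expected output in the question has a space between them. A possible variant that inserts the space is sketched below. The function name join_on_digit_start is hypothetical, and the approach still assumes that only the first line of a record starts with a digit: a continuation line such as "10 Downing St." would be mistaken for a new record.

```shell
# join_on_digit_start [FILE...]  (hypothetical helper)
# A line starting with a digit begins a new record; any other line is
# treated as a continuation and joined to the previous one with a space.
join_on_digit_start() {
  awk '
    # Before every line except the first, emit either a record separator
    # (new record) or a single space (continuation line).
    NR > 1 { printf "%s", (/^[0-9]/ ? ORS : " ") }
    { printf "%s", $0 }
    # Terminate the last record, unless the input was empty.
    END { if (NR) print "" }
  ' "$@"
}

# Usage with the sample data from the question:
printf '%s\n' '001|Baker St.' 'London|3|4|7' '002|Penny Lane' 'Liverpool|88|5|7' |
  join_on_digit_start
# 001|Baker St. London|3|4|7
# 002|Penny Lane Liverpool|88|5|7
```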