你如何使用gawk解析CSV文件?简单设置FS=","是不够的,因为带有逗号的引用字段将被视为多个字段.
使用的示例FS=","不起作用:
文件内容:
one,two,"three, four",five
"six, seven",eight,"nine"
Run Code Online (Sandbox Code Playgroud)
gawk脚本:
BEGIN { FS="," }
{
for (i=1; i<=NF; i++) printf "field #%d: %s\n", i, $(i)
printf "---------------------------\n"
}
Run Code Online (Sandbox Code Playgroud)
输出不好:
field #1: one
field #2: two
field #3: "three
field #4: four"
field #5: five
---------------------------
field #1: "six
field #2: seven"
field #3: eight
field #4: "nine"
---------------------------
Run Code Online (Sandbox Code Playgroud)
期望的输出:
field #1: one
field #2: two
field #3: "three, four"
field #4: five
---------------------------
field #1: "six, seven"
field …Run Code Online (Sandbox Code Playgroud) 这是我的csv快照:
alex 123f 1
harry fwef 2
alex sef 3
alex gsdf 4
alex wf35 6
harry sdfsdf 3
Run Code Online (Sandbox Code Playgroud)
我想得到这个数据的子集,其中第一列(哈里,亚力)中出现的任何东西至少为4.所以我希望得到的数据集是:
alex 123f 1
alex sef 3
alex gsdf 4
alex wf35 6
Run Code Online (Sandbox Code Playgroud)