I have been happily using gawk with FPAT. Here is the script I use for my examples:
#!/usr/bin/gawk -f
BEGIN {
    # A field is either a run of non-commas or a double-quoted string
    FPAT = "([^,]*)|(\"[^\"]+\")"
}
{
    for (i = 1; i <= NF; i++) {
        printf "Record #%s, field #%s: %s\n", NR, i, $i
    }
}
This works:
$ echo 'a,b,c,d' | ./test.awk
Record #1, field #1: a
Record #1, field #2: b
Record #1, field #3: c
Record #1, field #4: d
This works:
$ echo '"a","b",c,d' | ./test.awk
Record #1, field #1: "a"
Record #1, field #2: "b"
Record #1, field #3: c
Record #1, field #4: d
This works:
$ echo '"a","b",,d' | ./test.awk
Record #1, field #1: "a"
Record #1, field #2: "b"
Record #1, field #3:
Record #1, field #4: d
This works:
$ echo '"""a"": aaa","b",,d' | ./test.awk
Record #1, field #1: """a"": aaa"
Record #1, field #2: "b"
Record #1, field #3:
Record #1, field #4: d
This fails:
$ echo '"""a"": aaa,","b",,d' | ./test.awk
Record #1, field #1: """a"": aaa
Record #1, field #2: ","
Record #1, field #3: b"
Record #1, field #4:
Record #1, field #5: d
Expected output:
$ echo '"""a"": aaa,","b",,d' | ./test_that_would_be_working.awk
Record #1, field #1: """a"": aaa,"
Record #1, field #2: "b"
Record #1, field #3:
Record #1, field #4: d
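Is there a regex for FPAT that would make this work, or is this simply not supported in awk?
The pattern would be a " followed by anything except a single ". A regex character class only looks at one character at a time, so it cannot tell a lone " apart from the doubled "".
I think there may be a way to do it, but I am not good enough with regexes to make it work.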
Because awk's FPAT regexes do not support lookaround, you have to spell the pattern out explicitly. This one will do:
FPAT="[^,\"]*|\"([^\"]|\"\")*\""
Explanation:
[^,\"]*      # zero or more characters other than , and "
|            # OR
\"           # an opening "
([^\"]       # then zero or more of: any character other than "
|            #   OR
\"\"         #   an escaped quote ""
)*
\"           # ending with a closing "
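As a minimal sketch that is not part of the original answer, the quoted alternative can be checked on its own with match(), which is standard awk. The helper name quoted_span is hypothetical. It prints the first substring of each input line that the quoted branch matches.

```shell
# quoted_span: hypothetical helper, used only for this illustration
quoted_span() {
    awk '{
        # match() sets RSTART and RLENGTH to the leftmost-longest match
        if (match($0, /"([^"]|"")*"/))
            print substr($0, RSTART, RLENGTH)
    }'
}

printf '%s\n' '"""a"": aaa,","b",,d' | quoted_span
```

On the failing input this should print """a"": aaa," because the branch takes the doubled quotes and the embedded comma as part of a single field.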
Now test it:
$ cat tst.awk
BEGIN {
    # Unquoted field, or a quoted field in which "" is an escaped quote
    FPAT = "[^,\"]*|\"([^\"]|\"\")*\""
}
{
    for (i = 1; i <= NF; i++) {
        printf "Record #%s, field #%s: %s\n", NR, i, $i
    }
}
$ echo '"""a"": aaa,","b",,d' | awk -f tst.awk
Record #1, field #1: """a"": aaa,"
Record #1, field #2: "b"
Record #1, field #3:
Record #1, field #4: d