4 grep quoting regular-expression escape-characters
I have a JSON string inside JSON. It has been encoded multiple times, and I end up with many escaping backslashes: \\\".
A heavily shortened string looks like:
'[{"testId" : "12345", "message": "\\\"the status is pass\\\" comment \\\\\"this is some weird encoding\\\\\""}]'
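I am trying to grep and get the number of occurrences of the pattern \\\" and not \\\\\". How can I do that?
Any shell/python solution is fine. In Python, using the search string search_string = r"""\\\\\""" throws an unexpected EOF error.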
Sté*_*las 12
To find \\\" anywhere in a line:
grep -F '\\\"'
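That is, use -F for a fixed-string search instead of a regular-expression match (where backslash is special), and use strong quotes ('...'), inside which backslash is not special.
Without -F, you need to double the backslashes: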
grep '\\\\\\"'
Or use:
grep '\\\{3\}"'
grep -E '\\{3}"'
grep -E '[\]{3}"'
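For readers after the Python side of the question, the same doubling rule applies to regular expressions there. The sketch below is only an analogue: Python's re module is not POSIX BRE/ERE (grep's [\] bracket form, for instance, does not carry over), and the names BS, needle, undoubled, doubled, interval and escaped are illustrative.

```python
import re

BS = "\\"              # a single literal backslash
needle = BS * 3 + '"'  # the text to find: three backslashes, then a double quote

# Used as a regex without doubling, the needle means "one backslash, then a quote":
# the first two backslashes form an escaped backslash, the third escapes the quote.
undoubled = re.compile(needle)

# Doubling each backslash gives a regex that matches the needle itself.
doubled = re.compile(BS * 6 + '"')

# An interval expression says the same thing more compactly: \\{3}"
interval = re.compile(BS * 2 + '{3}"')

# re.escape() builds a pattern from a fixed string, much as grep -F
# spares you the escaping.
escaped = re.compile(re.escape(needle))

# The undoubled pattern matches a single backslash plus quote, not the needle.
print(bool(undoubled.fullmatch(BS + '"')))
print(bool(undoubled.fullmatch(needle)))
# The other three patterns all match the needle.
print(all(p.fullmatch(needle) for p in (doubled, interval, escaped)))
```

Building the patterns by repetition rather than typing them out keeps Python's own string-literal escaping from adding yet another layer.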
Inside double quotes, you need another level of backslashes, and the " itself must be escaped with a backslash:
#              1
#     1234567890123
grep "\\\\\\\\\\\\\""
Backslash is another shell quoting operator, so you can also quote those backslash and " characters with backslashes:
\g\r\e\p \\\\\\\\\\\\\"
I've even quoted the characters of grep above, though that's not necessary, as none of g, r, e, p are special to the shell (except in the Bourne shell if they appear in $IFS). The only character I've not quoted is the space character, as we do need its special meaning in the shell: separating arguments.
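One way to check that the three quoting styles above hand grep the same argument is to let Python's shlex module split the command lines. This is a sketch under an assumption: shlex.split in its default POSIX mode approximates the shell's quoting rules but is not a real shell, so treat it as a sanity check only. The variable names are illustrative, and the commands are built by repetition so that Python's own escaping does not get in the way.

```python
import shlex

BS = "\\"  # a single literal backslash

# The three command lines from above, as typed at the shell prompt.
single_quoted = "grep '" + BS * 6 + '"' + "'"    # strong quotes, 6 backslashes
double_quoted = 'grep "' + BS * 13 + '""'        # double quotes, 13 backslashes
backslash_only = "".join(BS + c for c in "grep") + " " + BS * 13 + '"'

# What grep should receive: the command name and a 6-backslash regex plus a quote.
expected = ["grep", BS * 6 + '"']

for cmd in (single_quoted, double_quoted, backslash_only):
    args = shlex.split(cmd)          # POSIX-style quote removal
    print(cmd, "->", args[1])
```

Each line should show the same argument after the arrow: six backslashes followed by a double quote, which grep then reads as a regular expression for three literal backslashes and a quote.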
To find \\\" provided it's not preceded by another backslash:
grep -e '^\\\\\\"' -e '[^\]\\\\\\"'
That is, look for \\\" at the beginning of the line, or following a character other than backslash.
This time we have to use a regular expression; a fixed-string search won't do.
grep returns the lines that match any of those expressions. You can also write it with one expression per line:
grep '^\\\\\\"
[^\]\\\\\\"'
Or with only one expression:
grep '^\(.*[^\]\)\{0,1\}\\\{3\}"' # BRE
grep -E '^(.*[^\])?\\{3}"' # ERE equivalent
grep -E '(^|[^\])\\{3}"'
With GNU grep built with PCRE support, you can use a negative look-behind assertion:
grep -P '(?<!\\)\\{3}"'
To get a count of the lines that match the pattern (that is, that have one or more occurrences of \\\"), you'd add the -c option to grep. If however you want the number of occurrences, you can use the GNU-specific -o option (though now also supported by a few other implementations) to print all the matches one per line, and then pipe to wc -l to get a line count:
grep -Po '(?<!\\)\\{3}"' | wc -l
Or standardly/POSIXly, use awk instead:
awk '{n+=gsub(/(^|[^\\])\\{3}"/,"")};END{print 0+n}'
(awk's gsub() substitutes and returns the number of substitutions.)
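Since the question also allowed Python, here is a minimal sketch of the same count using the look-behind pattern from the grep -P example. The names sample, PATTERN and count_unpreceded are illustrative. The sample is written as a single-quoted raw string, which sidesteps the error from the question: in r"""\\\\\""" the fifth backslash pairs with the first of the closing quotes, so the literal is never terminated.

```python
import re

# The shortened sample from the question; inside r'...' every backslash is literal.
sample = r'[{"testId" : "12345", "message": "\\\"the status is pass\\\" comment \\\\\"this is some weird encoding\\\\\""}]'

# Three backslashes and a double quote, not preceded by another backslash.
PATTERN = re.compile(r'(?<!\\)\\{3}"')

def count_unpreceded(text):
    """Count quotes preceded by exactly three backslashes."""
    return len(PATTERN.findall(text))

# A plain substring count, comparable to a fixed-string search.
fixed_count = sample.count("\\" * 3 + '"')

print(count_unpreceded(sample))  # 2
print(fixed_count)               # 4
```

The substring count is higher because it also finds the last three backslashes of each five-backslash run, which is what the look-behind excludes.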