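I know this has been asked before, but this case is slightly different: I need to remove all comments, while keeping any escaped # and any other # that does not start a comment (for example, one between single or double quotes).
Starting from the following text: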
test
# comment
comment on midline # comment
escaped hash "\# this is an escaped hash"
escaped hash "\\# this is not a comment"
not a comment "# this is not a comment - double apices"
not a comment '# this is not a comment - single apices'
this is a comment \\# this is a comment
this is not a comment \# this is not a comment
I would like to obtain:
test
comment on midline
escaped hash "\# this is an escaped hash"
escaped hash "\\# this is not a comment"
not a comment "# this is not a comment - double apices"
not a comment '# this is not a comment - single apices'
this is a comment \\
this is not a comment \# this is not a comment
I tried:
grep -o '^[^#]*' file
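but this also removes the escaped hashes.
Note: the text I am working with does contain escaped hashes (\#) but no double-escaped hashes (\\#), so it does not matter to me whether text after the latter is kept. I suppose it is cleaner to remove it, since in that case the hash is not actually escaped.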
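With sed you can delete the lines that start with # (preceded by zero or more blanks), and on the other lines remove everything from a # that does not follow a single backslash, but only if that # is not between quotes (see note 1):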
sed '/^[[:blank:]]*#/d
/["'\''].*#.*["'\'']/!{
s/\\\\#.*/\\\\/
s/\([^\]\)#.*/\1/
}' infile
1: This solution assumes one pair of quotes per line.
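If a line can contain more than one quoted span, the one-pair assumption in the note above no longer holds. One possible way around it is to scan each line character by character and track whether the current position is inside quotes. The following is only a minimal sketch, not part of the original answer, and it swaps sed for awk. The function name strip_comments and the variable sq are hypothetical. It assumes that a backslash escapes the next character everywhere (inside quotes too), that quotes never span lines, that an apostrophe in ordinary prose would be treated as an opening quote, and that blanks left in front of a removed comment should be trimmed.

```shell
# Hypothetical helper: reads text on stdin, writes it without comments.
strip_comments() {
  awk -v sq="'" '
  {
    out = ""      # text kept so far
    quote = ""    # quote character currently open, or empty
    found = 0     # set to 1 once a comment start is seen
    n = length($0)
    for (i = 1; i <= n; i++) {
      c = substr($0, i, 1)
      if (c == "\\") {
        # Assumption: a backslash escapes the next character,
        # so copy both verbatim and skip past them.
        out = out c substr($0, i + 1, 1)
        i++
        continue
      }
      if (quote != "") {
        if (c == quote) quote = ""    # closing quote
      } else if (c == "\"" || c == sq) {
        quote = c                     # opening quote
      } else if (c == "#") {
        found = 1                     # unquoted, unescaped hash
        break
      }
      out = out c
    }
    if (found) {
      sub(/[ \t]+$/, "", out)         # trim blanks before the comment
      if (out == "") next             # the line held only a comment
    }
    print out
  }'
}
```

It could be called as strip_comments < infile. On a line such as a "x # y" b "z" # real, where the sed script above leaves the trailing comment in place, this sketch would keep a "x # y" b "z" and drop the rest.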