Extracting a specific pattern from a line using sed, awk, or perl

Gil*_*Gil 7 perl awk grep sed nawk

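Can I use sed to extract a pattern that is enclosed by specific delimiters, if it exists in a line?

Suppose I have a file with the following lines:

There are many who dare not kill themselves for [/fear/] of what the neighbors will say.

Advice is what we ask for when we already know the /* answer */ but wish we didn’t.

In both cases, I have to scan the line for the first occurrence of the opening pattern, i.e. "[/" or "/*" respectively, and store the text that follows until the closing pattern, i.e. "/]" or "*/" respectively.

In short, I need "fear" and "answer". If possible, can this be extended to multiple lines, in the sense that the closing pattern appears on a different line from the opening one?

Any help in the form of suggestions or algorithms is welcome. Thanks in advance for your replies.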

TLP*_*TLP 4

    use strict;
    use warnings;

    while (<DATA>) {
        while (m#/(\*?)(.*?)\1/#g) {
            print "$2\n";
        }
    }


    __DATA__
    There are many who dare not kill themselves for [/fear/] of what the neighbors will say.
    Advice is what we ask for when we already know the /* answer */ but wish we didn’t.

As a one-liner:

    perl -nlwe 'while (m#/(\*?)(.*?)\1/#g) { print $2 }' input.txt

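The inner while loop iterates over all matches on the line, thanks to the /g modifier. The backreference \1 ensures that we only match the same kind of opening and closing tag.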


If you need to match blocks that span multiple lines, you need to slurp the input:

    use strict;
    use warnings;

    $/ = undef;
    while (<DATA>) {
        while (m#/(\*?)(.*?)\1/#sg) {
            print "$2\n";
        }
    }

    __DATA__
        There are many who dare not kill themselves for [/fear/] of what the neighbors will say. /* foofer */
        Advice is what we ask for when we already know the /* answer */ but wish we didn’t.
    foo bar /
    baz
    baaz / fooz

As a one-liner:

    perl -0777 -nwe 'while (m#/(\*?)(.*?)\1/#sg) { print "$2\n" }' input.txt

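The switch -0777 and the assignment $/ = undef both cause file slurping, meaning that the entire file is read into a single scalar. I also added the /s modifier to allow the wildcard "." to match newlines.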


Explanation of the regex: m#/(\*?)(.*?)\1/#sg

    m#              # a simple m//, but with # as delimiter instead of slash
        /(\*?)      # slash followed by optional *
            (.*?)   # shortest possible string of wildcard characters
        \1/         # backref to optional *, followed by slash
    #sg             # s modifier to make . match \n, and g modifier

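The "magic" here is that the backreference requires an asterisk before the closing slash only if an asterisk was found after the opening slash.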


  • Will it match across multiple lines? (2 upvotes)