Extract the unique strings from each line containing <string>

0 grep sed awk text-processing

Here is a sample block of text from a file:


Now is the time for all blah:1; to come to the aid
Now is the time for all blah:1; to come to the aid
Now is the time for all blah:1; to come to the aid
Now is the time for all blah:10; to come to the aid
Go to your happy place  blah:100; to come to the aid
Go to your happy place  blah:4321; to come to the aid
Go to your happy place  blah:4321; to come to the aid
Now is the time for all blah:4321; to come to the aid
Now is the time for all blah:9876; to come to the aid
Now is the time for all blah:108636; to come to the aid
Now is the time for all blah:1194996; to come to the aid

Question: how do I extract all the unique numbers from the lines that contain "is"?

I tried grep -o -P -u '(?<=blah:).*(?=;)' but it doesn't like the semicolon.

gle*_*man 5

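You're looking for the \K directive, which forgets everything that has just been matched, so only the text matched after it is printed.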

grep -oP 'is the.*?blah:\K\d+'

Then pipe the output through sort -u to drop the duplicates.
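
Putting the two steps together, here is a minimal sketch run against the sample lines from the question. It assumes GNU grep built with PCRE support (needed for -P); the temporary file and the variable names tmpfile and result are only there for illustration.

```shell
# Write the sample lines from the question to a temporary file (illustrative only).
tmpfile=$(mktemp)
cat > "$tmpfile" <<'EOF'
Now is the time for all blah:1; to come to the aid
Now is the time for all blah:1; to come to the aid
Now is the time for all blah:1; to come to the aid
Now is the time for all blah:10; to come to the aid
Go to your happy place  blah:100; to come to the aid
Go to your happy place  blah:4321; to come to the aid
Go to your happy place  blah:4321; to come to the aid
Now is the time for all blah:4321; to come to the aid
Now is the time for all blah:9876; to come to the aid
Now is the time for all blah:108636; to come to the aid
Now is the time for all blah:1194996; to come to the aid
EOF

# -o prints only the matched part; -P enables Perl-compatible regexes (GNU grep).
# "is the.*?blah:" must match first, then \K discards it, leaving only the digits.
# sort -u sorts the numbers as text and keeps one copy of each.
result=$(grep -oP 'is the.*?blah:\K\d+' "$tmpfile" | sort -u)
printf '%s\n' "$result"

rm -f "$tmpfile"
```

The "Go to your happy place" lines are skipped because they never match "is the", so 100 does not appear in the output. Since sort -u compares the values as text, add -n (sort -un) if you want them in numeric order.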