Can grep show only the words that match a search pattern?

Question

Is there a way to make grep output only the "words" from files that match the search expression?

If I want to find all instances of, say, "th" in a number of files, I can do:

grep "th" *

But the output will look something like this (bold is mine):

some-text-file : the cat sat on the mat  
some-other-text-file : the quick brown fox  
yet-another-text-file : i hope this explains it thoroughly

What I want it to output, using the same search, is:

the
the
the
this
thoroughly

Is this possible with grep? Or with some other combination of tools?

Answer 1

Dan*_*ood 860

Try grep -o:

grep -oh "\w*th\w*" *

Edit: updated to match Phil's comment.

From the docs:

-h, --no-filename
    Suppress the prefixing of file names on output. This is the default
    when there is only  one  file  (or only standard input) to search.
-o, --only-matching
    Print  only  the matched (non-empty) parts of a matching line,
    with each such part on a separate output line.
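
A minimal sketch of what `-h` changes, assuming GNU grep (for `-o` and `\w`). The scratch directory from `mktemp` and the variable names are assumptions for illustration; the three files just reproduce the sample lines from the question.

```shell
# Recreate the question's three sample files in a scratch directory.
dir=$(mktemp -d)
printf 'the cat sat on the mat\n' > "$dir/some-text-file"
printf 'the quick brown fox\n' > "$dir/some-other-text-file"
printf 'i hope this explains it thoroughly\n' > "$dir/yet-another-text-file"

cd "$dir" || exit 1

# -o prints each match on its own line; -h drops the "filename:" prefix.
words_only=$(grep -oh '\w*th\w*' *)

# Without -h, every match is prefixed with the file it came from.
with_names=$(grep -o '\w*th\w*' *)

printf '%s\n' "$words_only"
printf '%s\n' "$with_names"
```

With a single file argument the prefix is off by default, so `-h` only matters once the glob expands to more than one file.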

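@user181548, the grep -o option only works with GNU grep. So if you are not using GNU grep, it may not work for you. (8 upvotes)
I needed an explanation of what `"\w*th\w*" *` means, so I figured I would post one. `\w` is [_[:alnum:]], so it basically matches any "word" that contains 'th' (since `\w` does not include whitespace). The * after the quoted part is a file glob (i.e. it matches all files in this directory). (6 upvotes)
@ABB It depends on whether you want the names of the matching files displayed. I'm not sure under what conditions they are shown, but I do know that when I used grep across multiple directories it showed the full file path of every matching file, whereas with -h it showed only the matched words, with no indication of which file they came from. So, to match the original question, I think it is necessary in some cases. (5 upvotes)
`\w` is generally not portable to `grep -E`; for proper portability, use the POSIX character class name `[[:alnum:]]` instead (or `[_[:alnum:]]` if you really want the underscore too; or try `grep -P` if your platform has that). (3 upvotes)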
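
To make the portability point concrete, here is a small sketch comparing two POSIX bracket expressions. The input line and variable names are made up for illustration, and it assumes a grep that supports `-o`.

```shell
line='my_thing the oth3r'

# [_[:alnum:]] mirrors \w: letters, digits and underscore.
alnum_words=$(printf '%s\n' "$line" | grep -o '[_[:alnum:]]*th[_[:alnum:]]*')

# [[:alpha:]] is letters only, so matches stop at '_' and at digits.
alpha_words=$(printf '%s\n' "$line" | grep -o '[[:alpha:]]*th[[:alpha:]]*')

printf '%s\n' "$alnum_words"   # my_thing, the, oth3r (one per line)
printf '%s\n' "$alpha_words"   # thing, the, oth (one per line)
```

Which class is right depends on what should count as a "word" in your data; for plain prose the two usually give the same result.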

Answer 2

Pic*_*tor 79

Cross-distribution safe answer (including Windows MinGW?)

grep -h "[[:alpha:]]*th[[:alpha:]]*" 'filename' | tr ' ' '\n' | grep -h "[[:alpha:]]*th[[:alpha:]]*"

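If you are using an older version of grep (such as 2.4.2) that does not include the -o option, use the command above. Otherwise use the simpler, easier-to-maintain version below.

Cross-distribution safe answer for Linux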

grep -oh "[[:alpha:]]*th[[:alpha:]]*" 'filename'

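To summarize: -oh outputs the regular expression matches from the file contents (and not the filename), just as you would expect a regular expression to work in vim/etc... Which word or regular expression you search for is up to you, as long as you stick with POSIX and not perl syntax (see below).

More from the grep manual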

-o      Print each match, but only the match, not the entire line.
-h      Never print filename headers (i.e. filenames) with output lines.
-w      The expression is searched for as a word (as if surrounded by
         `[[:<:]]' and `[[:>:]]';

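Why the original answer does not work for everyone

Support for \w varies from platform to platform, as it is an extended "perl" syntax. As such, grep installations that are limited to POSIX character classes use [[:alpha:]] rather than the perl-style \w (strictly, \w also covers digits and the underscore, i.e. [_[:alnum:]]). See the Wikipedia page on regular expressions for more information.

Ultimately, the POSIX answer above will be much more reliable regardless of the platform (being the original) for grep.

As for a grep without -o support: the first grep outputs the relevant lines, tr splits the spaces into newlines, and the final grep keeps only the matching lines.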
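
The three stages can be sketched one at a time like this. The sample text is taken from the question; the variable names are assumptions added so each intermediate result can be inspected.

```shell
text='the cat sat on the mat
the quick brown fox
i hope this explains it thoroughly'

# Step 1: keep only the lines that contain "th" at all.
lines=$(printf '%s\n' "$text" | grep 'th')

# Step 2: put every space-separated token on its own line.
tokens=$(printf '%s\n' "$lines" | tr ' ' '\n')

# Step 3: filter again, now with one token per line.
words=$(printf '%s\n' "$tokens" | grep 'th')

printf '%s\n' "$words"
```

Unlike `-o`, this prints the whole token, so punctuation attached to a word stays attached. If the input may contain tabs or runs of spaces, `tr -s '[:space:]' '\n'` is probably a safer split than `tr ' ' '\n'`.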

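(PS: I know that most platforms will have been patched for \w by now.... but there are always those that lag behind)

Credit for the "-o" workaround goes to @AdamRosenfield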

Answer 3

Ada*_*eld 42

You could translate spaces to newlines and then grep, e.g.:

cat * | tr ' ' '\n' | grep th

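No need for cat: `tr ' ' '\n' < file | grep th`. Slow for big files. (17 upvotes)
@ghostdog74 If the slow part is `tr`, he can run `grep` first, so that `tr` only operates on the matching lines: `grep th filename | tr ' ' '\n' | grep th` (3 upvotes)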

Answer 4

gho*_*g74 37

Just awk, no need for a combination of tools.

# awk '{for(i=1;i<=NF;i++){if($i~/^th/){print $i}}}' file
the
the
the
this
thoroughly
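
Note that the pattern above is anchored with `^`, so it only prints fields that start with "th". A small sketch of the difference, using a made-up input line (the line and variable names are assumptions for illustration):

```shell
line='the moth bathes'

# Anchored: only fields that start with "th", as in the answer above.
starts=$(printf '%s\n' "$line" |
  awk '{ for (i = 1; i <= NF; i++) if ($i ~ /^th/) print $i }')

# Unanchored: any field that contains "th" anywhere.
contains=$(printf '%s\n' "$line" |
  awk '{ for (i = 1; i <= NF; i++) if ($i ~ /th/) print $i }')

printf '%s\n' "$starts"     # the
printf '%s\n' "$contains"   # the, moth, bathes (one per line)
```

The unanchored form is the closer equivalent of `grep -o '\w*th\w*'` when the words are separated by whitespace.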

@AjeetGanga, that is the name (8 upvotes)
Yup: grep is simply the wrong tool for this job. (3 upvotes)

Answer 5

Abh*_*sad 33

It's simpler than you think. Try this:

egrep -wo 'th.[a-z]*' filename.txt #### (Case Sensitive)

egrep -iwo 'th.[a-z]*' filename.txt  ### (Case Insensitive)

Where,

 egrep: grep with extended regular expressions.
 w    : Match only whole words instead of substrings.
 o    : Display only the matched pattern instead of the whole line.
 i    : Ignore case.
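
A short sketch of what `-w` adds on top of `-o`, assuming GNU grep. It uses `grep -E`, which is equivalent to `egrep` (recent GNU grep releases print a warning for `egrep`), and a simplified pattern `th[a-z]*` without the `.` from the answer; the input line is made up for illustration.

```shell
line='the cat sat on the mat with a bathrobe'

# Without -w, "th" is also found inside longer words.
anywhere=$(printf '%s\n' "$line" | grep -Eo 'th[a-z]*')

# With -w, a match only counts if it is a whole word.
whole=$(printf '%s\n' "$line" | grep -Ewo 'th[a-z]*')

printf '%s\n' "$anywhere"   # the, the, th, throbe (one per line)
printf '%s\n' "$whole"      # the, the (one per line)
```

So `-w` is the piece that restricts the output to words beginning with "th", which is a narrower result than the "any word containing th" pattern in the first answer.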

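This does not seem to add anything over the existing answers from more than 4 years ago. (3 upvotes)
@tripleee I found my approach better and simpler, so I posted it. (3 upvotes)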

Answer 6

小智 10

grep with only-matching output and perl regular expressions:

grep -o -P 'th.*? ' filename

What about displaying only the matched group? (3 upvotes)
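
One possible way to get close to "only the group", sketched under the assumption that your grep is GNU grep built with PCRE support (`-P`). `\K` tells the regex engine to drop everything matched so far from the reported match. The sample line is borrowed from the ack answer in this thread; the variable names are illustrative.

```shell
line='bug: 1, id: 5, time: 12/27/2010'

# \K discards "id: " from the match, so only the digits are printed.
id=$(printf '%s\n' "$line" | grep -oP 'id: \K[0-9]+')

# \b and \w avoid the trailing space that 'th.*? ' would include.
words=$(printf '%s\n' 'the cat sat on the mat' | grep -oP '\bth\w*')

printf '%s\n' "$id"      # 5
printf '%s\n' "$words"   # the, the (one per line)
```

The second command is also a hedged alternative to the `'th.*? '` pattern, which needs a space after the word and therefore cannot match a word at the end of a line.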

Answer 7

小智 8

cat *-text-file | grep -Eio "th[a-z]+"

Perhaps see also [Useless use of cat?](/q/11710552) (3 upvotes)
Or just `grep -Eio "th[a-z]+" filename` (2 upvotes)
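
A quick sketch of the cat-free form with several files, assuming a grep that supports `-o`. The scratch directory, file names and contents are made up for illustration; `-h` is added because grep prefixes file names once it is given more than one file.

```shell
dir=$(mktemp -d)
printf 'The cat sat on the mat\n' > "$dir/a-text-file"
printf 'I hope THIS explains it\n' > "$dir/b-text-file"

# grep reads the files itself; -h keeps file names out of the output.
direct=$(grep -Eioh 'th[a-z]+' "$dir"/*-text-file)

# Same result through cat, at the cost of an extra process.
piped=$(cat "$dir"/*-text-file | grep -Eio 'th[a-z]+')

printf '%s\n' "$direct"   # The, the, THIS (one per line)
```

With `-i` the bracket expression `[a-z]` also matches upper-case letters, which is why `THIS` is printed in full.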

Answer 8

Bea*_*eau 8

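I wasn't happy with awk's hard-to-remember syntax, but I liked the idea of using a single utility to do this.

It looks like ack (or ack-grep if you use Ubuntu) can do this easily: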

# ack-grep -ho "\bth.*?\b" *

the
the
the
this
thoroughly

If you omit the -h flag, you get:

# ack-grep -o "\bth.*?\b" *

some-other-text-file
1:the

some-text-file
1:the
the

yet-another-text-file
1:this
thoroughly

As a bonus, you can use the --output flag to do this for more complex searches, with just about the easiest syntax I've found:

# echo "bug: 1, id: 5, time: 12/27/2010" > test-file
# ack-grep -ho "bug: (\d*), id: (\d*), time: (.*)" --output '$1, $2, $3' test-file

1, 5, 12/27/2010
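
If ack is not installed, roughly the same reshaping can be sketched with sed instead. This is a swapped-in technique, not what ack does internally, and it assumes a sed that accepts `-E` for extended regular expressions (GNU and BSD sed both do). `[0-9]` stands in for `\d`, which sed does not support.

```shell
line='bug: 1, id: 5, time: 12/27/2010'

# -n plus the trailing p prints only lines where the substitution matched;
# \1, \2 and \3 refer to the three parenthesised groups.
fields=$(printf '%s\n' "$line" |
  sed -n -E 's/^bug: ([0-9]*), id: ([0-9]*), time: (.*)$/\1, \2, \3/p')

printf '%s\n' "$fields"   # 1, 5, 12/27/2010
```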

Asked: 16 years, 4 months ago
Viewed: 635325 times
Last active: 6 years, 8 months ago