How can I use a bash script to find lines that are in one file but not in another?

Question

Imagine file 1:

#include "first.h"
#include "second.h"
#include "third.h"

// more code here
...

Imagine file 2:

#include "fifth.h"
#include "second.h"
#include "eigth.h"

// more code here
...

I want to get the headers that are included in file 2 but not in file 1, and only those lines. So the diff of file 1 and file 2 should produce:

#include "fifth.h"
#include "eigth.h"

I know how to do this in Perl/Python/Ruby, but I would like to accomplish it without using a different programming language.

Answer 1

gle*_*man 25

Here is a one-liner, but it does not preserve the order:

comm -13 <(grep '#include' file1 | sort) <(grep '#include' file2 | sort)

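For shells that lack process substitution, the same comparison can be sketched with temporary files. The function name `includes_only_in_second` is made up for this illustration, and `LC_ALL=C` is an assumption added so that `sort` and `comm` agree on the collation order.

```shell
# Hypothetical helper: print #include lines that occur in $2 but not in $1.
# The output is sorted, so the original order is lost, as with the one-liner.
includes_only_in_second() (
  tmpdir=$(mktemp -d) || exit 1
  # comm expects both inputs to be sorted in the same collation order
  grep '#include' "$1" | LC_ALL=C sort > "$tmpdir/a"
  grep '#include' "$2" | LC_ALL=C sort > "$tmpdir/b"
  # -1 drops lines unique to the first input, -3 drops lines common to both
  LC_ALL=C comm -13 "$tmpdir/a" "$tmpdir/b"
  rm -rf "$tmpdir"
)
```

Called as `includes_only_in_second file1 file2`, it should print the two headers that appear only in file2.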

If you need to preserve the order:

awk '
  !/#include/ {next} 
  FILENAME == ARGV[1] {include[$2]=1; next} 
  !($2 in include)
' file1 file2

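As a sketch of how the awk version might be wrapped for reuse: the function name `includes_in_order` is hypothetical, and the `$1 != "#include"` test is a swapped-in, stricter filter than the original `/#include/` match, so that a comment merely mentioning `#include` is skipped. Because the lookup key is the second field, differences in spacing between `#include` and the header name do not matter.

```shell
# Hypothetical helper: print #include lines of $2 whose header is not
# included by $1, in the order they appear in $2.
includes_in_order() {
  awk '
    $1 != "#include"    { next }                # skip everything but include directives
    FILENAME == ARGV[1] { seen[$2] = 1; next }  # first file: remember each header name
    !($2 in seen)                               # second file: print headers not seen
  ' "$1" "$2"
}
```

Called as `includes_in_order file1 file2`, it should print the new headers in file2's own order.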

Answer 2

Fra*_*itt 9

If using a temporary file is acceptable, try this:

grep include file1.h > /tmp/x && grep -f /tmp/x -v file2.h | grep include

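This

extracts all includes from file1.h and writes them to the file /tmp/x
uses this file to get all lines of file2.h that are not contained in this list
extracts all includes from the remainder of file2.h

However, it may not correctly handle differences in whitespace and the like.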

Edit: to prevent false positives, use a different pattern for the last grep (thanks to jw013 for mentioning this):

grep include file1.h > /tmp/x && grep -f /tmp/x -v file2.h | grep "^#include"

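One thing to keep in mind is that `grep -f` treats each line of the pattern file as a regular expression, so the `.` in `first.h` matches any character and a line is dropped if it merely contains a pattern. A minimal sketch of the same three steps with fixed-string, whole-line matching and a private temporary file follows; the function name `includes_missing_from_first` is invented for this example, and `mktemp` is assumed to be available.

```shell
# Hypothetical helper: the same three steps, using a private temporary file.
includes_missing_from_first() (
  patterns=$(mktemp) || exit 1
  # 1. Collect the #include lines of the first file.
  grep '^#include' "$1" > "$patterns"
  # 2. Drop lines of the second file that exactly equal one of them
  #    (-F: fixed strings, -x: whole-line match).
  # 3. Keep only the #include lines of what remains.
  grep -v -x -F -f "$patterns" "$2" | grep '^#include'
  rm -f "$patterns"
)
```

Called as `includes_missing_from_first file1.h file2.h`. Whitespace differences are still not handled, since lines must match exactly.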

Answer 3

tri*_*eee 8

This variant requires an fgrep with the -f option. GNU grep (i.e. any Linux system, and then some) should work fine.

# Find occurrences of '#include' in file1.h
fgrep '#include' file1.h |
# Remove any identical lines from file2.h
fgrep -vxf - file2.h |
# Result is all lines not present in file1.h.  Out of those, extract #includes
fgrep '#include'

This requires no sorting and no explicit temporary files. In theory, fgrep -f could use a temporary file behind the scenes, but I believe GNU fgrep does not.
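Recent GNU grep releases warn that `fgrep` is obsolescent and suggest `grep -F` instead, so the same pipeline can be sketched as below. The function name `includes_new_in_second` is hypothetical, and the sketch assumes a grep that accepts `-f -` to read patterns from standard input, as GNU grep does.

```shell
# Hypothetical helper: the fgrep pipeline rewritten with grep -F.
includes_new_in_second() {
  # Find occurrences of '#include' in the first file
  grep -F '#include' "$1" |
    # Remove any identical lines from the second file (patterns come from stdin)
    grep -F -v -x -f - "$2" |
    # Out of the remaining lines, keep the #includes
    grep -F '#include'
}
```

Called as `includes_new_in_second file1.h file2.h`.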

Answer 4

小智 6

If the goal does not have to be accomplished with bash alone (i.e., using external programs is acceptable), then use combine from moreutils:

combine file1 not file2 > lines_in_file1_not_in_file2

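Note that the command above lists lines of file1 that are missing from file2, which is the opposite direction from what the question asks for. A rough sketch with the operands swapped and the result narrowed to `#include` lines follows. The function name `includes_via_combine` is made up, and the `grep` branch is only an assumed stand-in for machines where moreutils is not installed.

```shell
# Hypothetical helper: #include lines present in $2 but absent from $1.
includes_via_combine() {
  if command -v combine >/dev/null 2>&1; then
    # moreutils is installed: operands swapped relative to the command above
    combine "$2" not "$1"
  else
    # Assumed stand-in when combine is missing: exact whole-line filtering
    grep -v -x -F -f "$1" "$2"
  fi | grep '^#include'
}
```

Called as `includes_via_combine file1 file2`.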

Archived: 14 years, 6 months ago
Views: 10241
Last activity: 11 years ago