Comparing generated executables for equivalence

Question


Luc*_*ano 8 comparison executable shared-objects binary-reproducibility

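I need to compare 2 executables and/or shared objects, compiled using the same compiler/flags, and verify that they have not changed. We work in a regulated environment, so for testing purposes it would be really useful to isolate exactly which parts of the executable have changed.

Using MD5 sums/hashes doesn't work, because the headers contain information about the file.

Does anyone know of a program or method to verify that 2 files are executionally the same even if they were built at different times?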

Answer 1

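An interesting question. I have a similar problem on Linux. Intrusion detection systems like OSSEC or tripwire may generate false positives if the hash sum of an executable suddenly changes. The cause may be nothing worse than the Linux "prelink" program patching the executable for faster startup.

To compare two binaries (in ELF format), one can use the "readelf" executable and then "diff" to compare the outputs. I'm sure there are more refined solutions, but without further ado, here is a poor man's comparator in Perl: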

#!/usr/bin/perl -w

$exe = $ARGV[0];

if (!$exe) {
   die "Please give name of executable\n"
}
if (! -f $exe) {
   die "Executable $exe not found or not a file\n";
}
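# Accept shared objects too; 'file' also reports some position-independent executables as "shared object"
if (! (`file '$exe'` =~ /\bELF\b.*?\b(?:executable|shared object)\b/)) {
   die "file command says '$exe' is not an ELF executable or shared object\n";
}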

# Identify sections in ELF

@lines = pipeIt("readelf --wide --section-headers '$exe'");

@sections = ();

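for my $line (@lines) {
   if ($line =~ /^\s*\[\s*(\d+)\s*\]\s+(\S+)/) {
      my $secnum = $1;
      my $secnam = $2;
      # Section 0 is the unnamed NULL section; the regex would capture its type as the name
      next if $secnum == 0;
      print "Found section $secnum named $secnam\n";
      push @sections, $secnam;
   }
}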

# Dump file header

@lines = pipeIt("readelf --file-header --wide '$exe'");
print @lines;

# Dump everything readelf reports with --all (headers, symbols, relocations, ...)

@lines = pipeIt("readelf --all --wide '$exe'");
print @lines;

# Dump individual sections as hexdump

for my $section (@sections) {
   @lines = pipeIt("readelf --hex-dump='$section' --wide '$exe'");
   print @lines;
}

sub pipeIt {
   my($cmd) = @_;
   my $fh;
   open ($fh,"$cmd |") or die "Could not open pipe from command '$cmd': $!\n";
   my @lines = <$fh>;
   close $fh or die "Could not close pipe to command '$cmd': $!\n";
   return @lines;
}


For example, you can now run on machine 1:

./checkexe.pl /usr/bin/curl > curl_machine1


And on machine 2:

./checkexe.pl /usr/bin/curl > curl_machine2


After copy-pasting, SFTP-ing or NFS-ing (you don't use FTP, do you?) the two output files into the same file tree, compare them:

diff --side-by-side --width=200 curl_machine1 curl_machine2 | less


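In my case, differences exist in the sections ".gnu.conflict", ".gnu.liblist", ".got.plt" and ".dynbss". That might be OK for a "prelink" intervention, but a difference in the code section, ".text", would be a bad sign.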
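To avoid eyeballing a 200-column diff, the section-level verdict could be automated. The following is only a rough sketch, not part of the original script: it assumes the two dump files were produced by checkexe.pl, that readelf introduces each hex dump with a line of the form "Hex dump of section '<name>':", and that the four sections listed above are the only ones you are willing to treat as prelink noise. The function names and the PRELINK_SECTIONS variable are made up for this illustration.

```shell
#!/bin/sh
# Sections seen to differ after prelink in the case described above.
# Treating them as harmless is an assumption; adjust the list for your setup.
PRELINK_SECTIONS=".gnu.conflict .gnu.liblist .got.plt .dynbss"

# tag_sections FILE: print every hex-dump line prefixed with its section name.
tag_sections() {
    awk '
        # Header line: the section name sits between the single quotes (\047).
        /^Hex dump of section / { split($0, q, "\047"); name = q[2]; next }
        # Messages such as "Section ... has no data to dump." end the current dump.
        /^Section /             { name = ""; next }
        name != "" && NF        { print name "\t" $0 }
    ' "$1"
}

# changed_sections A B: print the names of sections whose hex dumps differ.
changed_sections() {
    { tag_sections "$1" | sort -u; tag_sections "$2" | sort -u; } |
        sort | uniq -u | cut -f1 | sort -u
}

# classify_sections: read section names on stdin and label each one.
classify_sections() {
    while IFS= read -r sec; do
        case " $PRELINK_SECTIONS " in
            *" $sec "*) echo "expected: $sec" ;;
            *)          echo "SUSPECT: $sec" ;;
        esac
    done
}
```

With the dumps from above this would be run as: changed_sections curl_machine1 curl_machine2 | classify_sections. Any line starting with "SUSPECT:" (for example .text) deserves a closer look in the full diff.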

Answer 2

Luc*_*ano 1

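To follow up, here's what I ended up with:

Instead of comparing the final executables and shared objects, we compared the .o files produced before linking. We assumed that the link process is reproducible enough for this to be acceptable.

This works in some of our cases, where we have two builds and made a small change that shouldn't affect the final code (running a code pretty-printer), but it doesn't help us if we don't have the intermediate build output.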
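For anyone wanting to try the same thing, here is a minimal sketch of such a comparison. It is not the tooling we actually used: it assumes both builds keep their object files under matching relative paths in two directory trees, and that a plain byte-for-byte comparison is acceptable (objects that embed build paths or timestamps, for example through debug info, would show up as different). The function name compare_objects is made up for this example.

```shell
#!/bin/sh
# compare_objects DIR_A DIR_B
# Walk every *.o file under DIR_A and compare it byte for byte with the file
# at the same relative path under DIR_B. Files that exist only in DIR_B are
# not reported; run it a second time with the arguments swapped to catch those.
compare_objects() {
    dir_a=$1
    dir_b=$2
    (cd "$dir_a" && find . -name '*.o' -type f | sort) |
    while IFS= read -r rel; do
        if [ ! -f "$dir_b/$rel" ]; then
            echo "MISSING   $rel"
        elif ! cmp -s "$dir_a/$rel" "$dir_b/$rel"; then
            echo "DIFFERENT $rel"
        fi
    done
}
```

Empty output means every object file in the first tree has an identical counterpart in the second, e.g. compare_objects build_before build_after (directory names hypothetical).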

Archived:	15 years, 9 months ago
Views:	5348
Last activity:	7 years, 7 months ago