How do I recursively compare two directories and check whether one contains the other?

ken*_*enn 2 command-line bash

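I have two directories that have files in common. I want to know whether one directory contains the same files as the other. I found a script online, but I would like to improve it so that it works recursively.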

  #!/bin/bash

  # cmp_dir - program to compare two directories

  # Check for required arguments
  if [ $# -ne 2 ]; then
      echo "usage: $0 directory_1 directory_2" 1>&2
      exit 1
  fi

  # Make sure both arguments are directories
  if [ ! -d $1 ]; then
      echo "$1 is not a directory!" 1>&2
      exit 1
  fi

  if [ ! -d $2 ]; then
      echo "$2 is not a directory!" 1>&2
      exit 1
  fi

  # Process each file in directory_1, comparing it to directory_2
  missing=0
  for filename in $1/*; do
      fn=$(basename "$filename")
      if [ -f "$filename" ]; then
          if [ ! -f "$2/$fn" ]; then
              echo "$fn is missing from $2"
              missing=$((missing + 1))
          fi
      fi
  done
  echo "$missing files missing"

Can anyone suggest an algorithm?

Joh*_*024 7

#!/bin/bash

# cmp_dir - program to compare two directories

# Check for required arguments
if [ $# -ne 2 ]; then
  echo "usage: $0 directory_1 directory_2" 1>&2
  exit 1
fi

# Make sure both arguments are directories
if [ ! -d "$1" ]; then
  echo "$1 is not a directory!" 1>&2
  exit 1
fi

if [ ! -d "$2" ]; then
  echo "$2 is not a directory!" 1>&2
  exit 1
fi

# Process each file in directory_1, comparing it to directory_2
missing=0
while IFS= read -r -d $'\0' filename
do
  # Strip the leading $1 prefix; quoting keeps glob characters in $1 literal
  fn=${filename#"$1"}
  if [ ! -f "$2/$fn" ]; then
      echo "$fn is missing from $2"
      missing=$((missing + 1))
  fi
done < <(find "$1" -type f -print0)

echo "$missing files missing"

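Note that I have added double quotes around $1 and $2 in several places above to protect them from shell expansion. Without the double quotes, directory names that contain spaces or other difficult characters would cause errors.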
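To see why the quotes matter, here is a minimal sketch. The directory name "my dir" and the helper count_args are made up for illustration; they are not part of the script above.

```bash
#!/bin/bash
# Hypothetical directory name containing a space (nothing is created on disk).
dir="my dir"

# Hypothetical helper: prints how many arguments it received.
count_args() { echo "$#"; }

unquoted=$(count_args $dir)     # word splitting turns the name into two arguments
quoted=$(count_args "$dir")     # the quotes keep it as a single argument

echo "unquoted: $unquoted argument(s), quoted: $quoted argument(s)"
```

With the unquoted form, a test such as [ ! -d $1 ] would receive two words instead of one directory name, which is the kind of error the quotes prevent.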

The key loop now reads:

while IFS= read -r -d $'\0' filename
do
  # Strip the leading $1 prefix; quoting keeps glob characters in $1 literal
  fn=${filename#"$1"}
  if [ ! -f "$2/$fn" ]; then
      echo "$fn is missing from $2"
      missing=$((missing + 1))
  fi
done < <(find "$1" -type f -print0)

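This uses find to descend recursively into directory $1 and list the regular files it contains. The construct while IFS= read -r -d $'\0' filename; do .... done < <(find "$1" -type f -print0) is safe for all file names, including names that contain spaces or newlines, because find separates the names with NUL characters and read splits on NUL only.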
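As a rough illustration of that safety claim, the sketch below builds a throwaway tree (the paths under the mktemp directory are invented for this demo) and counts files twice: once with the NUL-delimited loop and once with a plain line-by-line read.

```bash
#!/bin/bash
# Hypothetical scratch tree used only for this demo.
tmp=$(mktemp -d)
mkdir -p "$tmp/src/sub dir"
touch "$tmp/src/plain.txt" "$tmp/src/sub dir/with space.txt"
touch "$tmp/src/sub dir/"$'line\nbreak.txt'   # file name with an embedded newline

# NUL-delimited read: each file arrives as exactly one value of $filename.
safe=0
while IFS= read -r -d $'\0' filename; do
  safe=$((safe + 1))
done < <(find "$tmp/src" -type f -print0)

# Line-delimited read for comparison: the embedded newline splits one name in two.
naive=0
while IFS= read -r filename; do
  naive=$((naive + 1))
done < <(find "$tmp/src" -type f)

echo "files seen with -print0: $safe, with plain lines: $naive"
rm -rf "$tmp"
```

The first count matches the three files that were created; the second comes out higher because the name with the newline is read as two separate lines.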

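basename is no longer used because we are now looking at files in subdirectories and need to keep the subdirectory part of each path. The line that called basename is therefore replaced by a ${filename#...} parameter expansion, which simply removes the leading directory prefix $1 from filename and leaves the path relative to $1.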
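The difference between the two approaches can be sketched with fixed strings. The values of dir1 and filename below are hypothetical stand-ins for $1 and for one path printed by find.

```bash
#!/bin/bash
# Hypothetical values standing in for $1 and one path printed by find.
dir1="photos"
filename="photos/2020/trip/img.jpg"

with_basename=$(basename "$filename")   # img.jpg: the subdirectories are lost
fn=${filename#"$dir1"}                  # /2020/trip/img.jpg: the relative path is kept

echo "basename:     $with_basename"
echo "prefix strip: $fn"

# Quoting the pattern makes glob characters in the directory name literal.
dir1="data[1]"
filename="data[1]/a.txt"
unquoted=${filename#$dir1}      # [1] is read as a character class, so nothing is stripped
quoted=${filename#"$dir1"}      # the prefix is matched literally

echo "unquoted pattern: $unquoted"
echo "quoted pattern:   $quoted"
```

The leading slash left in fn is harmless here, since a path such as "$2//2020/trip/img.jpg" resolves to the same file as one with a single slash.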

Question 2

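Suppose instead that we match files by name, regardless of directory. In other words, if the first directory contains a file a/b/c/some.txt, we consider it present in the second directory as long as a file named some.txt exists in any subdirectory of the second directory. To do that, replace the loop above with: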

while IFS= read -r -d $'\0' filename
do
  fn=$(basename "$filename")
  if ! find "$2" -name "$fn" | grep -q . ; then
      echo "$fn is missing from $2"
      missing=$((missing + 1))
  fi
done < <(find "$1" -type f -print0)
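The loop above runs find on the second directory once for every file in the first. For large trees, one possible variation (not part of the original answer) is to collect the base names of the second directory once and look them up in an associative array. This is a sketch under assumptions: count_missing_by_name is a hypothetical helper, it needs bash 4 or later, it only considers regular files in the second directory, and it compares names literally rather than as -name glob patterns.

```bash
#!/bin/bash
# Hypothetical helper, not from the original script. Requires bash 4+.
# Prints "missing" messages on stderr and the number of missing files on stdout.
count_missing_by_name() {
  local dir1=$1 dir2=$2 filename key missing=0
  local -A seen=()

  # Pass 1: record the base name of every regular file under dir2.
  while IFS= read -r -d $'\0' filename; do
    key=${filename##*/}
    seen[$key]=1
  done < <(find "$dir2" -type f -print0)

  # Pass 2: report files under dir1 whose base name was never recorded.
  while IFS= read -r -d $'\0' filename; do
    key=${filename##*/}
    if [[ -z ${seen[$key]+x} ]]; then
      echo "$key is missing from $dir2" 1>&2
      missing=$((missing + 1))
    fi
  done < <(find "$dir1" -type f -print0)

  echo "$missing"
}

# Usage on a throwaway tree (paths invented for this demo).
tmp=$(mktemp -d)
mkdir -p "$tmp/one/a/b" "$tmp/two/x"
touch "$tmp/one/a/b/some.txt" "$tmp/one/only here.txt" "$tmp/two/x/some.txt"
result=$(count_missing_by_name "$tmp/one" "$tmp/two" 2>/dev/null)
echo "$result files missing"
rm -rf "$tmp"
```

In the demo tree, some.txt is found under two/x even though it sits in one/a/b in the first directory, so only "only here.txt" is counted as missing.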