I have four files with 10 lines each; how do I get the following output?

Dra*_*ana -2 scripting perl shell-script

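I have 4 files. I need to check whether all of them have the same number of lines.

If the line counts differ, I need to detect that and print something like: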

#file1 - 10 lines, file2 - 9 lines, file3 - 10 lines, file4 - 10 lines
Line are miss matched
Number of lines 10 = 9 = 10 = 10

If they are equal, I want to merge the files line by line, as follows:

Files:

#file1
10 
12
11

#file2
Arun
kamal
babu

#file3
300
200
400

#file4
spot1
spot4
spot5

Output:

Set1
10
Arun
300
spot1

Set2
12
kamal
200
spot4

Set3
11
babu
400
spot5

My code:

#

id_name=`cat file2`
echo $id_name

id_list=`cat file1`
echo $id_list

#

id_count=`cat file3`
echo $id_count

id_spot=`cat spot_list`
echo $id_spot


SS=`cat id_list | wc -l`
DS=`cat id_name | wc -l`
SF=`cat id_count | wc -l`
DF=`cat id_spot | wc -l`

if [ $SS == $DS == $SF == $DF ] then

   echo " Line are matched"
   echo " Total line $SS"


   for i j in $id_list $id_name
   do
      for a b in $id_count $id_spot
      do
         k = 1
         echo " Set$k"
         $i
         $j
         $a
         $b
      done
   done

else

   echo " Line are Miss matched"
   echo " Total line $SS  = $DS = $SF = $DF"

fi

Pes*_*The 7

With a really straightforward approach:

#!/usr/bin/env bash

SS=$(wc -l < file1)
DS=$(wc -l < file2)
SF=$(wc -l < file3)
DF=$(wc -l < file4)


if [[ $SS -eq $DS && $DS -eq $SF && $SF -eq $DF ]]; then 
   echo "Lines are matched"
   echo "Total number of lines: $SS"

   num=1
   while (( num <= SS )); do
      echo "Set$num"
      tail -n +$num file1 | head -n 1
      tail -n +$num file2 | head -n 1
      tail -n +$num file3 | head -n 1
      tail -n +$num file4 | head -n 1

      ((num++))
      echo
   done

else
   echo "Line are miss matched"
   echo "Number of lines $SS = $DS = $SF = $DF"
fi

It is not very efficient, as it calls tail (and head) 4*number_of_lines times, and each tail call reads its file again from the start, but it is straightforward.
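If you want to keep a plain shell loop but avoid re-reading the files, one option is to open each file once and read the four of them in lockstep. The following is only a sketch under that assumption: merge_sets is a made-up helper name, and it expects the line counts to have been checked already (the loop simply stops at the shortest file).

```shell
#!/usr/bin/env bash
# merge_sets is a hypothetical helper name, not part of the answer above.
# It assumes the four files were already checked to have equal line counts.
merge_sets() {
   local a b c d num=1
   # Read one line from each file per iteration: file 1 arrives on stdin,
   # files 2-4 on descriptors 3-5 (each opened once, after "done").
   while IFS= read -r a &&
         IFS= read -r b <&3 &&
         IFS= read -r c <&4 &&
         IFS= read -r d <&5
   do
      printf 'Set%d\n%s\n%s\n%s\n%s\n\n' "$num" "$a" "$b" "$c" "$d"
      num=$((num + 1))
   done < "$1" 3< "$2" 4< "$3" 5< "$4"
}

# Usage: merge_sets file1 file2 file3 file4
```

Each file is opened a single time, so no external process is started per line. A shell read loop is still usually slower than awk or paste on large inputs.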


Another approach is to replace the while loop with awk:

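awk '{
   printf("\nSet%s\n", NR)
   print
   # getline returns 1 on success, 0 at end of file and -1 on error,
   # so test for > 0 instead of relying on a bare truth value
   if ((getline < "file2") > 0)
      print
   if ((getline < "file3") > 0)
      print
   if ((getline < "file4") > 0)
      print
}' file1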

To join files line by line, the paste command is very useful. You can use this instead of the while loop:

paste -d$'\n' file1 file2 file3 file4

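Or maybe a little less obvious (the -s flag, available in GNU and BSD sort, keeps lines with equal numbers in their input order; without it, sort breaks ties by comparing whole lines, which can reorder the four lines of a set):

{ cat -n file1 ; cat -n file2 ; cat -n file3; cat -n file4; } | sort -s -n | cut -f2-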

That will output the lines but with no formatting (no Set1, Set2, newlines, ...), so you have to format it afterwards with awk, for example:

awk '{ 
   if ((NR-1)%4 == 0) 
      printf("\nSet%s\n", (NR+3)/4) 
   print 
}' < <(paste -d$'\n' file1 file2 file3 file4)

Some final notes:

  • Do not use uppercase variables as they could collide with environment and internal shell variables
  • Do not use echo "$var" | cmd or cat file | cmd when you can redirect input: cmd <<< "$var" or cmd < file
  • You can have only one variable name in a for loop: for i in ... is valid, whereas for i j in ... is not
  • It is better to use [[ ]] instead of [ ] for testing, see this answer
  • There are a lot of ways to do this
  • It's up to you which approach you choose to use but be aware of the efficiency differences:

Results of time, tested on files with 10000 lines:

#first approach
real    0m45.387s
user    0m5.904s
sys     0m3.836s

#second approach - significantly faster
real    0m0.086s
user    0m0.024s
sys     0m0.040s

#third approach - very close to second approach
real    0m0.074s
user    0m0.016s
sys     0m0.036s
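To try the comparison yourself, something like the following could generate four test files of equal length. This is only a sketch: make_test_files is a hypothetical helper, the file contents are arbitrary, and the timings you get will depend on your machine.

```shell
#!/usr/bin/env bash
# make_test_files is a hypothetical helper; the file contents are arbitrary.
# Arguments: target directory, number of lines per file (default 10000).
make_test_files() {
   local dir="$1" lines="${2:-10000}" n
   for n in 1 2 3 4; do
      # seq prints 1..lines; awk prefixes each number with the file index
      seq "$lines" | awk -v n="$n" '{ print "f" n "-" $0 }' > "$dir/file$n"
   done
}

# Example (run inside a scratch directory):
#   make_test_files . 10000
#   time paste -d'\n' file1 file2 file3 file4 > /dev/null
```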


gle*_*man 5

You can figure out how to check the number of lines in each file (hint: wc).

To get the output in sets:
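For readers who want a starting point for that check, here is one possible sketch. The same_line_count name is made up for illustration, and the messages simply mirror the wording asked for in the question.

```shell
#!/usr/bin/env bash
# same_line_count is a hypothetical helper name.
# Returns 0 when every file given has the same number of lines; otherwise
# prints the counts in the question's format and returns 1.
same_line_count() {
   local first="" count file all="" status=0
   for file in "$@"; do
      [ -r "$file" ] || return 2      # unreadable or missing file
      count=$(wc -l < "$file")
      count=${count##* }              # drop padding that some wc versions add
      if [ -z "$first" ]; then
         first=$count
      fi
      [ "$count" -eq "$first" ] || status=1
      all="${all:+$all = }$count"     # builds "10 = 9 = 10 = 10"
   done
   if [ "$status" -ne 0 ]; then
      echo "Line are miss matched"
      echo "Number of lines $all"
   fi
   return "$status"
}

# Usage: same_line_count file1 file2 file3 file4 && echo "Lines are matched"
```

Reading through a redirect (wc -l < file) keeps the file name out of wc's output, so only the number needs to be compared.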

paste File{1,2,3,4} | awk -F'\t' -v OFS='\n' '{$1=$1; print "Set"NR, $0, ""}'

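The $1=$1 assignment forces awk to rebuild the record, which converts the input field separators into the output field separator.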
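A small experiment makes the effect visible. The sample input here is invented for illustration; the only difference between the two commands is the assignment.

```shell
#!/usr/bin/env bash
# Without an assignment, awk prints the record unchanged, tabs included.
printf 'a\tb\tc\n' | awk -F'\t' -v OFS='\n' '{ print }'

# Assigning to any field makes awk rebuild $0, joining the fields with OFS,
# so a, b and c now come out on separate lines.
printf 'a\tb\tc\n' | awk -F'\t' -v OFS='\n' '{ $1 = $1; print }'
```

The same rebuild happens for an assignment to any field, not only $1; assigning a field to itself is just the usual way to trigger it without changing the data.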