Sorting lines in a text file between patterns

Lfm*_*fm_ 5 python awk sed

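I am trying to sort the lines between patterns in Bash or Python. I want to sort the lines numerically on the second field, using "," as the delimiter.

Given the following text input file: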

Sample1
T1,64,0.65  MEDIUM
T2,60,0.45  LOW
T3,301,0.68  MEDIUM
T4,65,0.75  HIGH
T5,59,0.72  MEDIUM
T6,51,0.82  HIGH
Sample2
T1,153,0.77  HIGH
T2,152,0.61  MEDIUM
T3,154,0.67  MEDIUM
T4,283,0.66  MEDIUM
T5,161,0.65  MEDIUM
Sample3
T1,147,0.71  MEDIUM
T2,154,0.63  MEDIUM
T3,45,0.63  MEDIUM
T4,259,0.77  HIGH

I expect this output:

Sample1
T6,51,0.82  HIGH
T5,59,0.72  MEDIUM
T2,60,0.45  LOW
T1,64,0.65  MEDIUM
T4,65,0.75  HIGH
T3,301,0.68  MEDIUM
Sample2
T2,152,0.61  MEDIUM
T1,153,0.77  HIGH
T3,154,0.67  MEDIUM
T5,161,0.65  MEDIUM
T4,283,0.66  MEDIUM
Sample3
T3,45,0.63  MEDIUM
T1,147,0.71  MEDIUM
T2,154,0.63  MEDIUM
T4,259,0.77  HIGH

I tried to adapt this suggestion by glenn jackman from another post, but as far as I have tested, it only works for 2 patterns:

> gawk -v cmd="sort -k2" -v p=1 '
>     /^PATTERN2/ {          # when we see the 2nd marker:
>         close(cmd, "to");  # close the write end so that sort can finish
>         while ((cmd |& getline line) > 0) print line
>         p=1
>     }
>     p  {print}             # if p is true, print the line
>     !p {print |& cmd}      # if p is false, send the line to `sort`
>     /^PATTERN1/ {p=0}      # when we see the first marker, turn off printing
> ' FILE

kva*_*our 3

You can do this with GNU awk as follows:

$ awk 'BEGIN{PROCINFO["sorted_in"]="@val_num_asc"; FS=","}
       /PATTERN/{
         for(i in a) print i
         delete a
         print; next
       }
       { a[$0]=$2 }
       END{ for(i in a) print i }' file

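With PROCINFO["sorted_in"]="@val_num_asc", we tell GNU awk to traverse the array in ascending numeric order of the element values. The idea is to build an array whose keys are the full lines and whose values are the second field. We do not use the second field as the key because it may contain duplicates. It can still be used as the key, however, if lines that share a value are appended to the same element and the array is traversed in numeric order of its indices ("@ind_num_asc") instead of its values:

$ awk 'BEGIN{PROCINFO["sorted_in"]="@ind_num_asc"; FS=","}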
       /PATTERN/{
         for(i in a) print a[i]
         delete a
         print; next
       }
       ($2 in a){ a[$2]=a[$2] ORS $0; next }
       { a[$2] = $0 }
       END{ for(i in a) print a[i] }' file
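
Since the question also asks about Python, here is a minimal sketch of the same buffer-and-flush idea. It is not taken from either post above. It assumes that header lines start with "Sample" (as in the example input) and that every other non-blank line has an integer in its second comma-separated field. The name `sort_blocks` and its parameters are made up for illustration.

```python
def sort_blocks(lines, header_prefix="Sample", sep=",", field=1):
    """Yield the input lines with each block between headers sorted.

    Assumption: every non-header, non-blank line has an integer in
    position `field` (0-based) when split on `sep`.
    """
    def key(line):
        return int(line.split(sep)[field])

    block = []
    for line in lines:
        if line.startswith(header_prefix):
            # A new header ends the previous block: flush it sorted.
            yield from sorted(block, key=key)
            block = []
            yield line
        elif line.strip():
            # Buffer data lines until the next header (blank lines are dropped).
            block.append(line)
    # Flush the last block, which has no header after it.
    yield from sorted(block, key=key)
```

To process a file, something like `sys.stdout.writelines(sort_blocks(open("file")))` should work, since the lines keep their trailing newlines. Python's `sorted` is stable, so lines with the same second field keep their input order.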