How can I use awk to sort a list by a number surrounded by two common delimiters?

Question

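For example, below is a sample of the file. I want to sort all lines by the Corr value, where the delimiter before the number is "=" and the delimiter after the number is "at".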

PrecipNH0to90vsNetNH0to90 Corr = -0.5073 at Net Leading Precip by -1 Months Time Lag
PrecipNH0to90vsNetSH0to90 Corr = -0.6498 at Net Leading Precip by 2 Months Time Lag
PrecipNH0to90vsNetHemDif0to90 Corr = 0.66939 at Net Leading Precip by 9 Months Time Lag
PrecipNH0to90vsNetGlobal0to90 Corr = -0.66036 at Net Leading Precip by 0 Months Time Lag
PrecipNH0to90vsNetAsymIndex0to90 Corr = 0.65726 at Net Leading Precip by 0 Months Time Lag
PrecipNH0to90vsNetNH0to14 Corr = -0.46212 at Net Leading Precip by -2 Months Time Lag
PrecipNH0to90vsNetSH0to14 Corr = -0.70731 at Net Leading Precip by 4 Months Time Lag
PrecipNH0to90vsNetHemDif0to14 Corr = 0.70494 at Net Leading Precip by 8 Months Time Lag
PrecipNH0to90vsNetGlobal0to14 Corr = -0.66121 at Net Leading Precip by 0 Months Time Lag
PrecipNH0to90vsNetAsymIndex0to14 Corr = 0.64884 at Net Leading Precip by 8 Months Time Lag
PrecipNH0to90vsNetNH14to30 Corr = 0.46232 at Net Leading Precip by 10 Months Time Lag
PrecipNH0to90vsNetSH14to30 Corr = -0.80044 at Net Leading Precip by 2 Months Time Lag
PrecipNH0to90vsNetHemDif14to30 Corr = 0.74188 at Net Leading Precip by 9 Months Time Lag
PrecipNH0to90vsNetGlobal14to30 Corr = -0.62494 at Net Leading Precip by 2 Months Time Lag
PrecipNH0to90vsNetAsymIndex14to30 Corr = 0.46709 at Net Leading Precip by 5 Months Time Lag
PrecipNH0to90vsNetNH30to49 Corr = 0.49765 at Net Leading Precip by 10 Months Time Lag
PrecipNH0to90vsNetSH30to49 Corr = 0.21001 at Net Leading Precip by 10 Months Time Lag

I know the file could be organized more neatly when I print it out from Matlab, but I am still curious about this as a general case.

Answer 1

Joh*_*n C 5

Try this:

sort -nk4,4 <filename>

Or, if you really love awk:

awk '{print $4}' <filename> | sort -n

`sort -nk4,4` = sort numerically (`n`) on the 4th field only (`k4,4`).

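`awk '{print $4}'` = print only the 4th field. Awk splits fields on whitespace by default. Note that this pipeline outputs only the sorted numbers, not the full lines.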
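
For the general case the question asks about, where the number is identified by the text around it rather than by its field position, one possible approach is decorate-sort-undecorate: awk extracts the number, `sort` orders by it, and `cut` removes the helper column. This is only a sketch. It assumes the delimiters are exactly " = " and " at ", that the lines contain no tab characters, and `sort_by_corr` is a made-up helper name.

```shell
# Three rows copied from the question's sample file.
sample='PrecipNH0to90vsNetNH0to90 Corr = -0.5073 at Net Leading Precip by -1 Months Time Lag
PrecipNH0to90vsNetHemDif0to90 Corr = 0.66939 at Net Leading Precip by 9 Months Time Lag
PrecipNH0to90vsNetSH14to30 Corr = -0.80044 at Net Leading Precip by 2 Months Time Lag'

# sort_by_corr (hypothetical name) reads the files given as arguments, or stdin.
# 1. awk:  take the text between " = " and " at ", prepend it and a tab to the line.
# 2. sort: order numerically on that first tab-separated column only;
#          LC_ALL=C keeps "." as the decimal point regardless of locale.
# 3. cut:  drop the helper column so the original line is printed unchanged.
sort_by_corr() {
  awk -F' = ' '{ split($2, part, " at "); printf "%s\t%s\n", part[1], $0 }' "$@" |
    LC_ALL=C sort -t "$(printf '\t')" -k1,1n |
    cut -f2-
}

printf '%s\n' "$sample" | sort_by_corr
```

On a real file the call would look like `sort_by_corr <filename>`. Leaving the sorting to `sort` keeps the awk part down to the extraction step.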

Finally, just for fun, I made a version that uses only awk and implements its own bubble sort. :-) It could probably be a bit cleaner, but it works.

#!/usr/bin/awk -f
# Script to sort a data file based on column 4
{
  # Read every line into an array
  line[NR]  = $0
  # Also save the sort column so we don't have to split the line again repeatedly.
  # Adding 0 forces a numeric value, so the comparison below is never a string comparison.
  value[NR] = $4 + 0
}
END { # sort it with bubble sort
  do {
    haschanged = 0
    for(i=1; i < NR; i++) {
      if ( value[i] > value[i+1] ) {
        # Swap adjacent lines and values.
        t = line[i]
        line[i] = line[i+1]
        line[i+1] = t
        t = value[i]
        value[i] = value[i+1]
        value[i+1] = t
        haschanged = 1
      }
    }
  } while ( haschanged == 1 )
  # Print out the result.
  for(i=1; i <= NR; i++) {
    print line[i]
  }
}

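`sort -k4` does not sort by the 4th field (only): it sorts from the 4th field through the end of the line. Because of `-n` (numeric sort), in this case the key _happens_ to be limited to the 4th field, since the rest of the line is only parsed for as long as the string is recognized as a _number_. In general, to truly sort by field 4 only, use `sort -k4,4`. With GNU `sort`, you can compare `sort --debug -k4` with `sort --debug -k4,4` to see this behavior. `sort` is a strange beast. (3 upvotes)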
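
The difference between `-k4` and `-k4,4` is easier to see with a non-numeric key. The sketch below uses made-up rows in which field 4 ties and field 5 differs. It also assumes a `sort` that supports `-s` (stable sort, available in GNU and BSD `sort` but not required by POSIX); without it, the whole-line tie-break would hide the effect.

```shell
# Made-up rows: two of them tie on field 4 ("b") and differ in field 5.
rows='x x x b 2
x x x a 9
x x x b 1'

# Key is field 4 only; with -s the two "b" rows keep their input order (2, then 1).
only_field4=$(printf '%s\n' "$rows" | LC_ALL=C sort -s -k4,4)

# Key runs from field 4 to the end of the line, so field 5 reorders the "b" rows (1, then 2).
to_line_end=$(printf '%s\n' "$rows" | LC_ALL=C sort -s -k4)

printf '%s\n' "-k4,4:" "$only_field4" "-k4:" "$to_line_end"
```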
