Counting and indexing the characters in a column

Hud*_*Hud 2 indexing bash awk grep

I have a file, system.xyz, that contains several columns:

          43
  Built with Packmol
  O           37.536208       36.873149        9.514500
  C           37.768292       35.784076       10.014380
  N           37.749829       34.667899        9.235406
  C           38.014779       33.336113        9.750827
  C           37.921777       32.283049        8.635104
  C           38.203826       30.885654        9.187454


and I need to change them to:

@atom:o1 @mol: $atom:O 0 37.536208 36.873149 9.514500
@atom:c1 @mol: $atom:C 0 37.768292 35.784076 10.014380
@atom:n1 @mol: $atom:N 0 37.749829 34.667899 9.235406
@atom:c2 @mol: $atom:C 0 38.014779 33.336113 9.750827
@atom:c3 @mol: $atom:C 0 37.921777 32.283049 8.635104
@atom:c4 @mol: $atom:C 0 38.203826 30.885654 9.187454

I got partway there with:

grep -A43 Built system.xyz | awk '{print "@atom:"tolower($1), "@mol: $atom:"$1,"0",$2,$3,$4}'

which prints:

@atom:built @mol: $atom:Built 0 with Packmol 
@atom:o @mol: $atom:O 0 37.536208 36.873149 9.514500
@atom:c @mol: $atom:C 0 37.768292 35.784076 10.014380
@atom:n @mol: $atom:N 0 37.749829 34.667899 9.235406
@atom:c @mol: $atom:C 0 38.014779 33.336113 9.750827
@atom:c @mol: $atom:C 0 37.921777 32.283049 8.635104
@atom:c @mol: $atom:C 0 38.203826 30.885654 9.187454

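But I still have to type the index for each character in the first column by hand. Is there a way to count and index the characters in the first column?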

gle*_*man 5

Try this:

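awk '
    # Output prefix: lowercased symbol plus its running index, then the original symbol.
    BEGIN {fmt = "@atom:%s%d @mol: $atom:%s 0"}
    # count[] holds one counter per symbol; assigning $1 rebuilds the line with single spaces.
    {$1 = sprintf(fmt, tolower($1), ++count[tolower($1)], $1)}
    # A bare 1 is an always-true pattern, so every (modified) line is printed.
    1
'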
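Note that the command above processes every line it is given, so fed the whole file it would also rewrite the two XYZ header lines (the atom count and the comment line). Here is a minimal self-contained sketch of one way to skip them. It assumes the atom records are exactly the lines after the first two and that each has four fields. The temporary file, the shortened six-atom sample, and the `result` variable are made up for illustration and are not part of the original answer.

```shell
#!/bin/sh
# Shortened copy of the system.xyz from the question, written to a temporary file.
xyz=$(mktemp)
cat > "$xyz" <<'EOF'
          6
  Built with Packmol
  O           37.536208       36.873149        9.514500
  C           37.768292       35.784076       10.014380
  N           37.749829       34.667899        9.235406
  C           38.014779       33.336113        9.750827
  C           37.921777       32.283049        8.635104
  C           38.203826       30.885654        9.187454
EOF

# NR > 2 skips the atom-count line and the comment line (assumption: two header lines).
# NF == 4 keeps only "symbol x y z" records.
# count[] holds a separate running counter for each lowercased symbol.
result=$(awk '
    BEGIN { fmt = "@atom:%s%d @mol: $atom:%s 0" }
    NR > 2 && NF == 4 {
        key = tolower($1)
        $1 = sprintf(fmt, key, ++count[key], $1)
        print
    }
' "$xyz")

printf '%s\n' "$result"
rm -f "$xyz"
```

With the real file you would drop the heredoc and pass system.xyz to awk directly, which also makes the grep step unnecessary.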