Parsing a file with text-processing tools

hei*_*man 5 text-processing

I have a file that looks like this:

1140.271257 0.002288454025 0.002763420728 0.004142512599 0 0 0 0 0 0 0 0 0 0 0 
1479.704769 0.00146621631 0.003190634646 0.003672029231 0 0 0 0 0 0 0 0 0 0 0 
1663.276205 0.003379552854 0.04643209167 0.0539399155 0 0 0 0 0 0 0 0 0 0 0 0 

Can I use some text-processing tool to split it into two files, like this:

1:

1140.271257 0.002288454025 0.002763420728 0.004142512599
1479.704769 0.00146621631 0.003190634646 0.003672029231
1663.276205 0.003379552854 0.04643209167 0.0539399155

2:

0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0 

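That is, take only the leading numbers that are not 0 and put the remaining numbers into another file. It would be nice if the output files could be named after the original file name with x1 and x2 or so added.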

A.B*_*.B. 6

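Use awk. The command below checks every entry in each line and writes it to one of two files, out1 and out2 in my example. For every line of the input file, a newline is also written to each output file.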

awk '{for(i=1;i<=NF;i++) {if($i!=0) {printf "%s ",$i > "out1"} else {printf "%s ",$i > "out2"}; if (i==NF) {printf "\n" > "out1"; printf "\n" > "out2"} }}' foo

Example

The input file

cat foo

1140.271257 0.002288454025 0.002763420728 0.004142512599 0 0 0 0 0 0 0 0 0 0 0 
1479.704769 0.00146621631 0.003190634646 0.003672029231 0 0 0 0 0 0 0 0 0 0 0 
1663.276205 0.003379552854 0.04643209167 0.0539399155 0 0 0 0 0 0 0 0 0 0 0 0

The command

awk '{for(i=1;i<=NF;i++) {if($i!=0) {printf "%s ",$i > "out1"} else {printf "%s ",$i > "out2"}; if (i==NF) {printf "\n" > "out1"; printf "\n" > "out2"} }}' foo

The output files

cat out1

1140.271257 0.002288454025 0.002763420728 0.004142512599 
1479.704769 0.00146621631 0.003190634646 0.003672029231 
1663.276205 0.003379552854 0.04643209167 0.0539399155 

cat out2

0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 
0 0 0 0 0 0 0 0 0 0 0 0
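The question also asks for output names derived from the original file name. A minimal sketch of one way to do that with awk's built-in FILENAME variable follows. The `_x1`/`_x2` suffixes and the shortened sample file `foo.txt` are assumptions for illustration, not part of the answer above. Unlike the command above, this variant builds each output line first, so it writes no trailing space.

```shell
# Shortened sample input, same shape as the data in the question.
cat > foo.txt <<'EOF'
1140.271257 0.002288454025 0.002763420728 0.004142512599 0 0 0
1479.704769 0.00146621631 0.003190634646 0.003672029231 0 0 0
EOF

awk '
FNR == 1 {                        # first line of each input file
    if (out1 != "") { close(out1); close(out2) }
    out1 = FILENAME "_x1"         # hypothetical naming scheme
    out2 = FILENAME "_x2"
}
{
    nz = z = ""                   # non-zero fields / zero fields of this line
    for (i = 1; i <= NF; i++) {
        if ($i != 0) nz = (nz == "" ? $i : nz " " $i)
        else         z  = (z  == "" ? $i : z  " " $i)
    }
    print nz > out1
    print z  > out2
}' foo.txt
```

Run against `foo.txt`, this should produce `foo.txt_x1` and `foo.txt_x2`. Several input files can be passed at once, and each gets its own pair of output files.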


kos*_*kos 3

You can indeed use a text-processing tool for this, but if the goal is simply to separate the first 4 fields from the ones that follow, cut is enough:

 cut -d ' ' -f 1-4 infile > outfile1
 cut -d ' ' -f 5- infile > outfile2
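To get the x1/x2 naming the question asks for, the two cut calls can be combined with shell parameter expansion. This is only a sketch: the input name `data.txt`, the shortened sample data, and the `_x1`/`_x2` naming pattern are assumptions. It also assumes the file name has an extension and that fields are separated by exactly one space.

```shell
# Hypothetical input file with shortened sample data.
infile=data.txt
printf '%s\n' \
  '1140.27 0.0022 0.0027 0.0041 0 0 0' \
  '1479.70 0.0014 0.0031 0.0036 0 0 0' > "$infile"

base=${infile%.*}    # name without the extension: data
ext=${infile##*.}    # the extension alone: txt

# First 4 fields go to data_x1.txt, the rest to data_x2.txt.
cut -d ' ' -f 1-4 "$infile" > "${base}_x1.${ext}"
cut -d ' ' -f 5-  "$infile" > "${base}_x2.${ext}"
```

Because cut splits at a fixed field position, this only works when every line has exactly 4 non-zero leading values, as in the sample.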


Way*_*Yux 2

I would suggest using perl for this. Save your input as input.txt and run the following command:

cat input.txt | perl -ane 'foreach(@F){   # -a splits each line into the array @F; loop over its fields
  chomp; # remove a trailing newline, if any
  if($_ == 0){   # print the element to STDOUT if it is "0"
    print $_," "
  }
  else{     # print the element to STDERR if it is not "0"
    print STDERR $_," "
  }
};
print "\n"; print STDERR "\n";   # add a newline at the end of each output line
' > x2.txt 2> x1.txt    # redirect STDOUT to x2.txt and STDERR to x1.txt

Here it is as a one-liner for copy and paste:

cat input.txt | perl -ane 'foreach(@F){chomp;if($_ == 0){print $_," "}else{print STDERR $_," "}};print "\n"; print STDERR "\n";' > x2.txt 2> x1.txt