I need to remove the first column from a file like this:
165 1 chr22 42090593 0 1 chr22 42090609 1 42 42
166 1 chr22 42090593 0 1 chr22 42090654 1 42 42
167 1 chr22 42090595 0 1 chr22 42090633 1 42 42
168 0 chr22 42090612 0 1 chr22 42090656 1 42 42
169 0 chr22 42090614 0 0 chr22 42090617 1 40 42
170 0 chr22 42090647 0 1 chr22 42090749 1 42 42
171 1 chr22 42090684 0 1 chr22 42090692 1 42 42
172 1 chr22 42090733 0 1 chr22 42090743 1 42 42
173 1 chr22 42090733 0 1 chr22 42090775 1 42 42
174 1 chr22 42090733 0 1 chr22 42090789 1 42 42
175 1 chr22 42090757 0 1 chr22 42090787 1 42 24
176 0 chr22 42090778 0 0 chr22 42090790 1 42 42
177 0 chr22 42090800 0 0 chr22 42090802 1 42 42
178 0 chr22 42090803 0 0 chr22 42090806 1 42 42
The command
awk '{$1=""; print $0}'
removes the first column correctly, but changes the formatting like this:
1 chr22 51178322 0 0 chr22 51178659 1 42 42
0 chr22 51178661 0 0 chr22 51178663 1 42 42
0 chr22 51178667 0 1 chr22 51178790 1 42 23
1 chr22 51178755 0 0 chr22 51178764 1 42 42
0 chr22 51178808 0 1 chr22 51178871 1 42 42
1 chr22 51178869 0 1 chr22 51178895 1 42 42
1 chr22 51178881 0 1 chr22 51178893 1 42 42
1 chr22 51178881 0 1 chr22 51178895 1 42 42
1 chr22 51179213 0 1 chr22 51179213 1 42 42
1 chr22 51180087 0 1 chr22 51180093 1 42 42
1 chr22 51180134 0 0 chr22 51181889 1 42 42
0 chr22 51186192 0 0 chr22 51186192 1 42 42
0 chr22 51186192 0 0 chr22 51186192 1 42 42
Any ideas?
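There are two problems with your approach. First, this looks like a tab-separated file, and you haven't told awk to use tabs. Second, when you set a field to "" in awk, you are not deleting the field, only emptying it. The empty field is still printed along with the separator that follows it, which is why every output line starts with an extra space. Assigning to a field also makes awk rebuild the line using the output field separator, a single space by default, so the tabs are replaced by spaces.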
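The effect can be reproduced on a single line. This is a minimal sketch: the sample line is a shortened copy of the first row in the question, assumed to be tab-separated, and the variable names `line` and `emptied` are only for illustration.

```shell
# One tab-separated sample line (shortened first row of the question's data).
line=$(printf '165\t1\tchr22\t42090593')

# Emptying $1 makes awk rebuild the record with OFS (a single space by default):
# the tabs are replaced by spaces and the empty first field leaves a leading space.
emptied=$(printf '%s\n' "$line" | awk '{$1=""; print $0}')

# The brackets make the leading space visible.
printf '[%s]\n' "$emptied"
# prints: [ 1 chr22 42090593]
```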
So, if you want to do this in awk, you need something like this (assuming the leading spaces in your example are not actually part of your file):
$ awk -F"\t" 'BEGIN{OFS="\t"}{for(i=2;i<NF;i++){printf "%s%s",$i,OFS} print $NF}' file
1 chr22 42090593 0 1 chr22 42090609 1 42 42
1 chr22 42090593 0 1 chr22 42090654 1 42 42
1 chr22 42090595 0 1 chr22 42090633 1 42 42
0 chr22 42090612 0 1 chr22 42090656 1 42 42
0 chr22 42090614 0 0 chr22 42090617 1 40 42
0 chr22 42090647 0 1 chr22 42090749 1 42 42
1 chr22 42090684 0 1 chr22 42090692 1 42 42
1 chr22 42090733 0 1 chr22 42090743 1 42 42
1 chr22 42090733 0 1 chr22 42090775 1 42 42
1 chr22 42090733 0 1 chr22 42090789 1 42 42
1 chr22 42090757 0 1 chr22 42090787 1 42 24
0 chr22 42090778 0 0 chr22 42090790 1 42 42
0 chr22 42090800 0 0 chr22 42090802 1 42 42
0 chr22 42090803 0 0 chr22 42090806 1 42 42
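For readability, the same awk loop can be spread over several lines with comments. This is only a sketch of the command above: the two sample rows are made up for illustration and assumed to be tab-separated, and `sample` and `result` are hypothetical variable names.

```shell
# Two hypothetical tab-separated rows standing in for the real file.
sample=$(printf '165\t1\tchr22\t42090593\n166\t1\tchr22\t42090595')

result=$(printf '%s\n' "$sample" | awk '
BEGIN { FS = OFS = "\t" }        # read and write tab-separated fields
{
    for (i = 2; i < NF; i++)     # fields 2 .. NF-1, each followed by a tab
        printf "%s%s", $i, OFS
    print $NF                    # last field, followed by a newline
}')

printf '%s\n' "$result"
```

Note that a line containing a single field is printed unchanged, because `$NF` is then the first field.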
But other tools, such as cut, which has already been mentioned, are simpler here. If your file is tab-separated, you can do:
$ cut -f2- file
1 chr22 42090593 0 1 chr22 42090609 1 42 42
1 chr22 42090593 0 1 chr22 42090654 1 42 42
1 chr22 42090595 0 1 chr22 42090633 1 42 42
0 chr22 42090612 0 1 chr22 42090656 1 42 42
0 chr22 42090614 0 0 chr22 42090617 1 40 42
0 chr22 42090647 0 1 chr22 42090749 1 42 42
1 chr22 42090684 0 1 chr22 42090692 1 42 42
1 chr22 42090733 0 1 chr22 42090743 1 42 42
1 chr22 42090733 0 1 chr22 42090775 1 42 42
1 chr22 42090733 0 1 chr22 42090789 1 42 42
1 chr22 42090757 0 1 chr22 42090787 1 42 24
0 chr22 42090778 0 0 chr22 42090790 1 42 42
0 chr22 42090800 0 0 chr22 42090802 1 42 42
0 chr22 42090803 0 0 chr22 42090806 1 42 42
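If the file turns out to be separated by single spaces rather than tabs, cut can still be used by naming the delimiter with `-d`. A minimal sketch, assuming exactly one space between fields and no leading space; the sample line and variable names are illustrative only.

```shell
# Hypothetical space-separated line (exactly one space between fields).
line='165 1 chr22 42090593'

# -d' ' sets the delimiter to a space; -f2- keeps field 2 through the last.
trimmed=$(printf '%s\n' "$line" | cut -d' ' -f2-)

printf '%s\n' "$trimmed"
# prints: 1 chr22 42090593
```

cut treats every single delimiter character as a field boundary, so runs of spaces or a leading space would produce empty fields; in that case the regex-based alternatives are a better fit.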
Some other alternatives:
$ grep -oP '^\s*\S+\s*\K.*' file
1 chr22 42090593 0 1 chr22 42090609 1 42 42
1 chr22 42090593 0 1 chr22 42090654 1 42 42
1 chr22 42090595 0 1 chr22 42090633 1 42 42
0 chr22 42090612 0 1 chr22 42090656 1 42 42
0 chr22 42090614 0 0 chr22 42090617 1 40 42
0 chr22 42090647 0 1 chr22 42090749 1 42 42
1 chr22 42090684 0 1 chr22 42090692 1 42 42
1 chr22 42090733 0 1 chr22 42090743 1 42 42
1 chr22 42090733 0 1 chr22 42090775 1 42 42
1 chr22 42090733 0 1 chr22 42090789 1 42 42
1 chr22 42090757 0 1 chr22 42090787 1 42 24
0 chr22 42090778 0 0 chr22 42090790 1 42 42
0 chr22 42090800 0 0 chr22 42090802 1 42 42
0 chr22 42090803 0 0 chr22 42090806 1 42 42
Or
$ perl -pe 's/^\s*\S+\s*//' file
1 chr22 42090593 0 1 chr22 42090609 1 42 42
1 chr22 42090593 0 1 chr22 42090654 1 42 42
1 chr22 42090595 0 1 chr22 42090633 1 42 42
0 chr22 42090612 0 1 chr22 42090656 1 42 42
0 chr22 42090614 0 0 chr22 42090617 1 40 42
0 chr22 42090647 0 1 chr22 42090749 1 42 42
1 chr22 42090684 0 1 chr22 42090692 1 42 42
1 chr22 42090733 0 1 chr22 42090743 1 42 42
1 chr22 42090733 0 1 chr22 42090775 1 42 42
1 chr22 42090733 0 1 chr22 42090789 1 42 42
1 chr22 42090757 0 1 chr22 42090787 1 42 24
0 chr22 42090778 0 0 chr22 42090790 1 42 42
0 chr22 42090800 0 0 chr22 42090802 1 42 42
0 chr22 42090803 0 0 chr22 42090806 1 42 42
Or
$ perl -F'\t' -lane 'print join "\t",@F[1..$#F]' file
1 chr22 42090593 0 1 chr22 42090609 1 42 42
1 chr22 42090593 0 1 chr22 42090654 1 42 42
1 chr22 42090595 0 1 chr22 42090633 1 42 42
0 chr22 42090612 0 1 chr22 42090656 1 42 42
0 chr22 42090614 0 0 chr22 42090617 1 40 42
0 chr22 42090647 0 1 chr22 42090749 1 42 42
1 chr22 42090684 0 1 chr22 42090692 1 42 42
1 chr22 42090733 0 1 chr22 42090743 1 42 42
1 chr22 42090733 0 1 chr22 42090775 1 42 42
1 chr22 42090733 0 1 chr22 42090789 1 42 42
1 chr22 42090757 0 1 chr22 42090787 1 42 24
0 chr22 42090778 0 0 chr22 42090790 1 42 42
0 chr22 42090800 0 0 chr22 42090802 1 42 42
0 chr22 42090803 0 0 chr22 42090806 1 42 42
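The `s/^\s*\S+\s*//` substitution used with perl above can also be written for sed. This is a sketch rather than part of the original answer: it swaps `\s` and `\S` for POSIX character classes and assumes a sed that accepts `-E` for extended regular expressions (GNU and BSD sed both do). The sample line is illustrative.

```shell
# One tab-separated sample line (shortened first row of the question's data).
line=$(printf '165\t1\tchr22\t42090593')

# Remove optional leading whitespace, the first run of non-whitespace,
# and the whitespace that follows it; the rest of the line is untouched.
stripped=$(printf '%s\n' "$line" |
    sed -E 's/^[[:space:]]*[^[:space:]]+[[:space:]]*//')

printf '%s\n' "$stripped"
```

Because only the start of the line is edited, the tabs between the remaining fields are preserved.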