How do I delete a column from a file without changing its format?

dg7*_*g72 1 sed awk text-processing regular-expression

I need to delete the first column from a file like this:

165 1   chr22   42090593    0   1   chr22   42090609    1   42  42
166 1   chr22   42090593    0   1   chr22   42090654    1   42  42
167 1   chr22   42090595    0   1   chr22   42090633    1   42  42
168 0   chr22   42090612    0   1   chr22   42090656    1   42  42
169 0   chr22   42090614    0   0   chr22   42090617    1   40  42
170 0   chr22   42090647    0   1   chr22   42090749    1   42  42
171 1   chr22   42090684    0   1   chr22   42090692    1   42  42
172 1   chr22   42090733    0   1   chr22   42090743    1   42  42
173 1   chr22   42090733    0   1   chr22   42090775    1   42  42
174 1   chr22   42090733    0   1   chr22   42090789    1   42  42
175 1   chr22   42090757    0   1   chr22   42090787    1   42  24
176 0   chr22   42090778    0   0   chr22   42090790    1   42  42
177 0   chr22   42090800    0   0   chr22   42090802    1   42  42
178 0   chr22   42090803    0   0   chr22   42090806    1   42  42

The command

awk '{$1=""; print $0}'

correctly removes the first column, but changes the format like this:

1 chr22 51178322 0 0 chr22 51178659 1 42 42
0 chr22 51178661 0 0 chr22 51178663 1 42 42
0 chr22 51178667 0 1 chr22 51178790 1 42 23
1 chr22 51178755 0 0 chr22 51178764 1 42 42
0 chr22 51178808 0 1 chr22 51178871 1 42 42
1 chr22 51178869 0 1 chr22 51178895 1 42 42
1 chr22 51178881 0 1 chr22 51178893 1 42 42
1 chr22 51178881 0 1 chr22 51178895 1 42 42
1 chr22 51179213 0 1 chr22 51179213 1 42 42
1 chr22 51180087 0 1 chr22 51180093 1 42 42
1 chr22 51180134 0 0 chr22 51181889 1 42 42
0 chr22 51186192 0 0 chr22 51186192 1 42 42
0 chr22 51186192 0 0 chr22 51186192 1 42 42

Any ideas?

ter*_*don 5

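There are two problems with your approach. First, this looks like a tab-separated file, and you haven't told awk to use tabs, so when awk rebuilds the line after the assignment it joins the fields with its default output separator, a single space. Second, when you set a field to "" in awk, you are not deleting the field, you are only emptying it. The empty field is still printed, which is why there is an extra space at the start of every output line.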
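Both effects can be seen on a single line. The following is a minimal sketch, assuming a hypothetical three-column tab-separated sample line rather than the real file:

```shell
# Hypothetical three-column, tab-separated sample line (not from the original file).
line=$(printf '165\t1\tchr22')

# Assigning to $1 makes awk rebuild $0 with OFS (a single space by default),
# and the emptied field still occupies its slot, hence the leading space.
emptied=$(printf '%s\n' "$line" | awk '{$1=""; print $0}')

# Setting FS and OFS to a tab keeps the tabs, but the empty first field is
# still there, so the line now starts with a tab.
still_there=$(printf '%s\n' "$line" | awk 'BEGIN{FS=OFS="\t"} {$1=""; print $0}')

# Brackets make the leading whitespace visible.
printf '[%s]\n[%s]\n' "$emptied" "$still_there"
```

So changing the separator alone is not enough: the first field has to be skipped rather than emptied.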

So, if you want to do this in awk, you need something like this (assuming the leading spaces in your example are not actually part of your file):

$ awk -F"\t" 'BEGIN{OFS="\t"}{for(i=2;i<NF;i++){printf "%s%s",$i,OFS} print $NF}' file 
1   chr22   42090593    0   1   chr22   42090609    1   42  42
1   chr22   42090593    0   1   chr22   42090654    1   42  42
1   chr22   42090595    0   1   chr22   42090633    1   42  42
0   chr22   42090612    0   1   chr22   42090656    1   42  42
0   chr22   42090614    0   0   chr22   42090617    1   40  42
0   chr22   42090647    0   1   chr22   42090749    1   42  42
1   chr22   42090684    0   1   chr22   42090692    1   42  42
1   chr22   42090733    0   1   chr22   42090743    1   42  42
1   chr22   42090733    0   1   chr22   42090775    1   42  42
1   chr22   42090733    0   1   chr22   42090789    1   42  42
1   chr22   42090757    0   1   chr22   42090787    1   42  24
0   chr22   42090778    0   0   chr22   42090790    1   42  42
0   chr22   42090800    0   0   chr22   42090802    1   42  42
0   chr22   42090803    0   0   chr22   42090806    1   42  42
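A shorter awk variant is sketched below. This is not the command from the answer; it is an alternative that edits the whole line with sub() instead of touching individual fields, so awk never rebuilds the line with OFS. The sample line is hypothetical, and it assumes the first field is terminated by a tab.

```shell
# Hypothetical tab-separated sample line (not from the original file).
line=$(printf '165\t1\tchr22\t42090593')

# sub() with no target operates on $0: remove everything up to and including
# the first tab. The trailing 1 is a condition that is always true, so the
# (modified) line is printed.
result=$(printf '%s\n' "$line" | awk '{sub(/^[^\t]*\t/, "")} 1')

printf '[%s]\n' "$result"
```

Lines that contain no tab are printed unchanged, since the pattern does not match them.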

But other tools, such as cut, which has already been mentioned, are simpler here. If your file is tab-separated, you can do:

$ cut -f2- file 
1   chr22   42090593    0   1   chr22   42090609    1   42  42
1   chr22   42090593    0   1   chr22   42090654    1   42  42
1   chr22   42090595    0   1   chr22   42090633    1   42  42
0   chr22   42090612    0   1   chr22   42090656    1   42  42
0   chr22   42090614    0   0   chr22   42090617    1   40  42
0   chr22   42090647    0   1   chr22   42090749    1   42  42
1   chr22   42090684    0   1   chr22   42090692    1   42  42
1   chr22   42090733    0   1   chr22   42090743    1   42  42
1   chr22   42090733    0   1   chr22   42090775    1   42  42
1   chr22   42090733    0   1   chr22   42090789    1   42  42
1   chr22   42090757    0   1   chr22   42090787    1   42  24
0   chr22   42090778    0   0   chr22   42090790    1   42  42
0   chr22   42090800    0   0   chr22   42090802    1   42  42
0   chr22   42090803    0   0   chr22   42090806    1   42  42
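The "if your file is tab-separated" condition matters, because cut splits on tabs by default and passes lines without any delimiter through untouched. A small sketch of that behavior, using hypothetical sample lines:

```shell
# Hypothetical sample lines, not taken from the original file.
tabbed=$(printf '165\t1\tchr22')
spaced='165 1   chr22'

# Tab-separated input: the first field is dropped and the remaining tabs are kept.
from_tabs=$(printf '%s\n' "$tabbed" | cut -f2-)

# Space-aligned input: there is no tab to split on, so the line comes out unchanged.
from_spaces=$(printf '%s\n' "$spaced" | cut -f2-)

# With -s, lines that contain no delimiter are suppressed instead.
suppressed=$(printf '%s\n' "$spaced" | cut -s -f2-)

printf '[%s]\n[%s]\n[%s]\n' "$from_tabs" "$from_spaces" "$suppressed"
```

If cut -f2- appears to do nothing, the file is probably aligned with spaces rather than tabs.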

Some other alternatives:

$ grep -oP '^\s*\S+\s*\K.*' file 
1   chr22   42090593    0   1   chr22   42090609    1   42  42
1   chr22   42090593    0   1   chr22   42090654    1   42  42
1   chr22   42090595    0   1   chr22   42090633    1   42  42
0   chr22   42090612    0   1   chr22   42090656    1   42  42
0   chr22   42090614    0   0   chr22   42090617    1   40  42
0   chr22   42090647    0   1   chr22   42090749    1   42  42
1   chr22   42090684    0   1   chr22   42090692    1   42  42
1   chr22   42090733    0   1   chr22   42090743    1   42  42
1   chr22   42090733    0   1   chr22   42090775    1   42  42
1   chr22   42090733    0   1   chr22   42090789    1   42  42
1   chr22   42090757    0   1   chr22   42090787    1   42  24
0   chr22   42090778    0   0   chr22   42090790    1   42  42
0   chr22   42090800    0   0   chr22   42090802    1   42  42
0   chr22   42090803    0   0   chr22   42090806    1   42  42

Or

$ perl -pe 's/^\s*\S+\s*//' file 
1   chr22   42090593    0   1   chr22   42090609    1   42  42
1   chr22   42090593    0   1   chr22   42090654    1   42  42
1   chr22   42090595    0   1   chr22   42090633    1   42  42
0   chr22   42090612    0   1   chr22   42090656    1   42  42
0   chr22   42090614    0   0   chr22   42090617    1   40  42
0   chr22   42090647    0   1   chr22   42090749    1   42  42
1   chr22   42090684    0   1   chr22   42090692    1   42  42
1   chr22   42090733    0   1   chr22   42090743    1   42  42
1   chr22   42090733    0   1   chr22   42090775    1   42  42
1   chr22   42090733    0   1   chr22   42090789    1   42  42
1   chr22   42090757    0   1   chr22   42090787    1   42  24
0   chr22   42090778    0   0   chr22   42090790    1   42  42
0   chr22   42090800    0   0   chr22   42090802    1   42  42
0   chr22   42090803    0   0   chr22   42090806    1   42  42
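Since the question is also tagged sed, the same substitution can be sketched with POSIX character classes. This is an adaptation of the perl expression above, not part of the original answer; the strip_first function name and the sample lines are hypothetical.

```shell
# Hypothetical sample lines, not taken from the original file.
tabbed=$(printf '165\t1\tchr22\t42090593')
spaced='165 1   chr22   42090593'

# Strip optional leading whitespace, the first run of non-whitespace, and the
# whitespace that follows it. The rest of the line is left untouched.
strip_first() {
  sed 's/^[[:space:]]*[^[:space:]][^[:space:]]*[[:space:]]*//'
}

from_tabs=$(printf '%s\n' "$tabbed" | strip_first)
from_spaces=$(printf '%s\n' "$spaced" | strip_first)

printf '[%s]\n[%s]\n' "$from_tabs" "$from_spaces"
```

Because the pattern matches any whitespace, this works whether the columns are separated by tabs or by runs of spaces, and the separators between the remaining columns are preserved.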

Or

$ perl -F'\t' -lane 'print join "\t",@F[1..$#F]' file 
1   chr22   42090593    0   1   chr22   42090609    1   42  42
1   chr22   42090593    0   1   chr22   42090654    1   42  42
1   chr22   42090595    0   1   chr22   42090633    1   42  42
0   chr22   42090612    0   1   chr22   42090656    1   42  42
0   chr22   42090614    0   0   chr22   42090617    1   40  42
0   chr22   42090647    0   1   chr22   42090749    1   42  42
1   chr22   42090684    0   1   chr22   42090692    1   42  42
1   chr22   42090733    0   1   chr22   42090743    1   42  42
1   chr22   42090733    0   1   chr22   42090775    1   42  42
1   chr22   42090733    0   1   chr22   42090789    1   42  42
1   chr22   42090757    0   1   chr22   42090787    1   42  24
0   chr22   42090778    0   0   chr22   42090790    1   42  42
0   chr22   42090800    0   0   chr22   42090802    1   42  42
0   chr22   42090803    0   0   chr22   42090806    1   42  42