How do I replace content in multiple files?

Men*_*del 4 command-line bash perl sed text-processing

I have multiple files with content like this:

File 1

NC_12548  og789 |nd784  -2 -54 -6

NC_12548  og789 |nd784  -2 -54 -6

NC_12548  og789 |nd784  -2 -54 -6

File2

NC_54456  og789 |nd784  -5 -56 -6

NC_98123  og859 |nd784  -5 -84 -5

NC_689.1  og456 |nd784  -5 -54 +8

File3

NC_54456  og789 |nd784  -5 -56 -6

NC_98123  og859 |nd784  -5 -84 -5

NC_689.1  og456 |nd784  -5 -54 +8

I want to keep only the first two columns (e.g. NC_12345 og855) and discard the rest. How can I do this?

Ser*_*nyy 8

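With awk you can use | as the column separator and print the first column, which then holds everything before the |, i.e. the two columns you want: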

awk -F '|' '{print $1}' file1.txt file2.txt file3.txt

The output of all files will be concatenated. If you need to save the output in separate files, consider using a shell for loop with awk:

# assuming they're all in the same directory, hence the `*` glob
for fname in ./file*.txt ; do
    # write the result to a new file named after the input, with .new appended;
    # > does the actual redirection
    awk -F '|' '{print $1}' "$fname" > "$fname".new
done

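Writing the new output to separate .new files may be necessary if you want to keep the originals. Otherwise, we can use sed -i to edit the files in place. Run the command without -i first to test it: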

# use file*.txt if they're all in the current directory
# the two commands give the same result on the sample data; one of them is enough
sed -i 's/|.*$//' file1.txt file2.txt file3.txt
sed -i 's/\(^.*\)|.*/\1/g' file1.txt file2.txt file3.txt
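The two sed expressions agree on the sample data, but they are not identical in general: the first deletes from the first | onward, while the greedy group in the second keeps everything up to the last |. The following is a rough sketch of that difference using Python's re module as a stand-in for sed. The function names and the two-pipe test line are made up for illustration and do not come from the question.

```python
import re

def strip_from_first_pipe(line: str) -> str:
    # Mirrors s/|.*$//: delete from the first '|' to the end of the line.
    return re.sub(r'\|.*$', '', line)

def strip_from_last_pipe(line: str) -> str:
    # Mirrors s/\(^.*\)|.*/\1/: the greedy group extends to the last '|'.
    return re.sub(r'^(.*)\|.*', r'\1', line)

sample = "NC_12548  og789 |nd784  -2 -54 -6"
print(repr(strip_from_first_pipe(sample)))
print(repr(strip_from_last_pipe(sample)))

# Hypothetical line with two pipes, only to show where the results diverge.
two_pipes = "a |b |c"
print(repr(strip_from_first_pipe(two_pipes)))
print(repr(strip_from_last_pipe(two_pipes)))
```

With a single | per line, as in the files shown in the question, either expression should do.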

Another option is to use Python:

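#!/usr/bin/env python3
import sys

# sys.argv[0] is the script itself, so skip it and process only the file arguments
for fname in sys.argv[1:]:
    with open(fname) as fd_read, open(fname + '.new', 'w') as fd_write:
        for line in fd_read:
            # keep the text before the first '|'; strip the newline first so that
            # lines without a '|' do not end up with two newlines
            fd_write.write(line.rstrip('\n').split('|')[0] + '\n')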

This script is meant to be run as ./script.py file1.txt file2.txt file3.txt and writes its output to new files with a .new extension.
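Splitting on | keeps the whitespace that precedes the pipe, so each output line ends with a trailing space. If that matters, one possible variant is to split on whitespace and keep the first two fields. This is a minimal sketch, assuming the columns are always separated by whitespace and that the second column is never glued directly to the |. The helper name first_two_columns is made up for illustration.

```python
def first_two_columns(line: str) -> str:
    # Split on any run of whitespace and keep only the first two fields.
    # Assumption: the columns are whitespace-separated, as in the sample data.
    fields = line.split()
    return " ".join(fields[:2])

rows = [
    "NC_12548  og789 |nd784  -2 -54 -6",
    "NC_689.1  og456 |nd784  -5 -54 +8",
]
for row in rows:
    print(first_two_columns(row))
```

Note that this joins the two columns with a single space rather than preserving the original spacing.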