How to split a text file into multiple files based on patterns at the beginning of lines?

Dyl*_*ney 2 shell awk text-processing macos

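I have some text files that I would like to split into separate files based on arbitrary "tags" I put at the beginning of lines.

Example text file: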

I CELEBRATE myself, and sing myself,  
And what I assume you shall assume, 
For every atom belonging to me as good belongs to you.

#here I loafe and invite my soul, 
#here I lean and loafe at my ease observing a spear of summer grass.

#there My tongue, every atom of my blood, form'd from this soil, this air,
#there Born here of parents born here from parents the same, and their parents the same, 
#here I, now thirty-seven years old in perfect health begin, 
#here Hoping to cease not till death.

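In this example, I want to remove every line that starts with #here and append it to a file named here.txt, append every line that starts with #there to a file named there.txt, and leave all untagged lines in the original file. (Ideally, the #here and #there tags would be stripped in the process.)

I think this solution using awk might help, but I'm such a Unix noob that I don't know how to adapt it to my problem: How to split a file by using keyword boundary

Any suggestions on how to proceed?

PS: I'm using the command line on OS X.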

ste*_*ver 5

Your case is simpler than the linked one: you only need to look at each line (or "record" in awk parlance) and decide where to send it. So:

awk '/^#here/{print > "here.txt"; next} /^#there/{print > "there.txt"; next} {print}' input.txt
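The question also asks, ideally, for the tags themselves to be dropped. The command above writes the tagged lines unchanged, so here is a minimal sketch of one way to strip them, assuming each tag is followed by at least one space or tab. The scratch directory and the shortened sample are only there to make the example safe to run as-is; they are not part of the original answer.

```shell
# Work in a scratch directory so nothing real gets overwritten.
cd "$(mktemp -d)" || exit 1

# A shortened copy of the sample from the question.
cat > input.txt <<'EOF'
I CELEBRATE myself, and sing myself,
#here I loafe and invite my soul,
#there My tongue, every atom of my blood,
#here Hoping to cease not till death.
EOF

# sub() removes the tag and the whitespace after it, and returns 1 when it
# matched, so it can also serve as the pattern that selects the line.
awk '
  sub(/^#here[ \t]+/, "")  { print > "here.txt";  next }
  sub(/^#there[ \t]+/, "") { print > "there.txt"; next }
  { print }
' input.txt
```

Note that `>` inside awk truncates each output file the first time it is opened in a run and then keeps writing to it; if here.txt and there.txt should accumulate lines across several runs, `>>` can be used instead.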

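The remaining lines are printed to standard output; portably, you can redirect them to a third file (rest.txt, for example) and then rename it to the original file's name. If you have GNU awk (version 4.1.0 or later), you can use the inplace module to modify the original file directly: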

gawk -i inplace '/^#here/{print > "here.txt"; next} /^#there/{print > "there.txt"; next} {print}' input.txt
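For an awk without the inplace module (the stock awk on macOS is not GNU awk), the portable redirect-and-rename route described above might look like the following. This is a sketch under the same assumptions as the answer: rest.txt is the temporary name suggested there, and the scratch directory and sample input are added only so the example can be run safely.

```shell
# Work in a scratch directory so nothing real gets overwritten.
cd "$(mktemp -d)" || exit 1

# A shortened copy of the sample from the question.
cat > input.txt <<'EOF'
I CELEBRATE myself, and sing myself,
#here I loafe and invite my soul,
#there My tongue, every atom of my blood,
And what I assume you shall assume,
EOF

# Untagged lines go to the temporary file rest.txt; the rename only
# happens if awk exits successfully, so a failure leaves input.txt intact.
awk '
  /^#here/  { print > "here.txt";  next }
  /^#there/ { print > "there.txt"; next }
  { print }
' input.txt > rest.txt && mv rest.txt input.txt
```

Chaining the rename with `&&` is a small safety choice: if awk fails partway through, the original file is not replaced by a partial rest.txt.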