How can I split a text file into multiple files based on a pattern at the start of each line?

Question

Dyl*_*ney 2 shell awk text-processing macos

I have some text files that I want to split into separate files according to arbitrary "tags" that I put at the start of lines.

Example text file:

I CELEBRATE myself, and sing myself,  
And what I assume you shall assume, 
For every atom belonging to me as good belongs to you.

#here I loafe and invite my soul, 
#here I lean and loafe at my ease observing a spear of summer grass.

#there My tongue, every atom of my blood, form'd from this soil, this air,
#there Born here of parents born here from parents the same, and their parents the same, 
#here I, now thirty-seven years old in perfect health begin, 
#here Hoping to cease not till death.


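In this example, I want to remove every line that starts with #here and append it to a file named here.txt, append every line that starts with #there to a file named there.txt, and leave all untagged lines in the original file. (Ideally, the #here and #there tags would be removed in the process.)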

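I think this awk-based solution might help, but I'm a Unix noob and don't know how to adapt it to my problem: How to split a file by using keyword boundary

Any suggestions on how to proceed?

PS: I'm using the command line on OS X.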

Answer 1

ste*_*ver 5

Your case is simpler than the linked one: you only need to look at each line (or "record", in awk parlance) and decide where to send it. So:

awk '/^#here/{print > "here.txt"; next} /^#there/{print > "there.txt"; next} {print}' input.txt

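The command above keeps the #here and #there tags in the output files. Since the question asks, ideally, for the tags to be removed, here is a minimal sketch of one way to do that, assuming each tag is followed by spaces or tabs. It uses awk's sub() as the pattern, so the tag is deleted and the line is routed in one step. The scratch directory, the shortened sample input, and the rest.txt name are only there to make the example self-contained.

```shell
# Work in a scratch directory so no existing file is overwritten.
cd "$(mktemp -d)" || exit 1

# Shortened sample input based on the question's example.
cat > input.txt <<'EOF'
I CELEBRATE myself, and sing myself,
#here I loafe and invite my soul,
#there My tongue, every atom of my blood, form'd from this soil, this air,
#here Hoping to cease not till death.
EOF

# sub() removes the tag plus any blanks after it and returns the number of
# substitutions made (1 or 0), so each action runs only on lines that had the tag.
awk '
  sub(/^#here[ \t]*/, "")  { print > "here.txt";  next }
  sub(/^#there[ \t]*/, "") { print > "there.txt"; next }
  { print }
' input.txt > rest.txt
```

Note that within a single awk run, `>` truncates the output file the first time it is opened and then keeps appending to it. If here.txt and there.txt should accumulate lines across several runs, `>>` can be used in place of `>`.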

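The remaining lines are printed to standard output; portably, you can redirect that to a third file (rest.txt, for example) and then rename it to the original file's name. If you have GNU awk, you can use its inplace extension to modify the original file directly: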

gawk -i inplace '/^#here/{print > "here.txt"; next} /^#there/{print > "there.txt"; next} {print}' input.txt

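For systems without GNU awk (the awk shipped with OS X, for instance), the portable route described above, redirecting the untagged lines to a third file and then renaming it, might look like the following sketch. The scratch directory and the shortened sample input are assumptions made only so the example can be run as-is; rest.txt is the temporary name suggested in the answer.

```shell
# Work in a scratch directory so no existing file is overwritten.
cd "$(mktemp -d)" || exit 1

# Shortened sample input based on the question's example.
cat > input.txt <<'EOF'
I CELEBRATE myself, and sing myself,
#here I loafe and invite my soul,
#there Born here of parents born here from parents the same, and their parents the same,
#here Hoping to cease not till death.
EOF

# Untagged lines go to rest.txt. The rename runs only if awk exits
# successfully, so a failed run leaves input.txt untouched.
awk '/^#here/{print > "here.txt"; next} /^#there/{print > "there.txt"; next} {print}' input.txt > rest.txt &&
  mv rest.txt input.txt
```

Writing to a separate file first matters: redirecting straight back to input.txt would make the shell truncate the file before awk reads it.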
