Split a single line into multiple lines using timestamps in Unix

Question

Split a single line into multiple lines using timestamps in Unix

Given an input line whose field separators follow no fixed pattern, as shown below.

x="15:23:46 Let's do this 15:23:47 It's easy: to do   for    you 15:23:48 You will ## have solution soon   0"


I am trying to break it onto separate lines based on the timestamp pattern, so the expected output is as follows.

15:23:46 Let's do this
15:23:47 It's easy: to do for you
15:23:48 You will have solution soon
0


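Note that there is a 0 at the end of the line, and it should also be printed on its own line. I need to use it as the return status in the rest of my code.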

I was able to get the result when all the timestamps are different, but when some of them are identical, I get unexpected output.

x=" 15:23:46 Let's do this 15:23:46 It's easy: to do   for    you 15:23:48 You will ## have solution soon   0"

Note that we now have two identical timestamps. This is where I am stuck. The expected output should be:

15:23:46 Let's do this
15:23:46 It's easy: to do for you
15:23:48 You will have solution soon
0


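The logic I used is to collect all the timestamps in an array, then iterate over the timestamps and grep for the data belonging to each one. The logic that worked for me when the timestamps are unique is as follows.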

15:23:46 Let's do this
15:23:47 It's easy: to do for you
15:23:48 You will have solution soon
0


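Can someone help me with the case where the same timestamp appears in some or many places in a line, or point me to a place where a similar problem has already been solved?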

Answer 1

Ed *_*ton 5

Using GNU awk for multi-char RS and RT:

$ awk -v RS='([0-9]{2}(:[0-9]{2}){2})|(0\n$)' 'NR>1{print pRT $0} {pRT=RT} END{printf "%s", RT}' <<<"$x"
15:23:46 Let's do this
15:23:47 It's easy: to do   for    you
15:23:48 You will ## have solution soon
0


Or if your shell does not have the <<< operator:

$ echo "$x" | awk -v RS='([0-9]{2}(:[0-9]{2}){2})|(0\n$)' 'NR>1{print pRT $0} {pRT=RT} END{printf "%s", RT}'
15:23:46 Let's do this
15:23:47 It's easy: to do   for    you
15:23:48 You will ## have solution soon
0
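
If GNU awk is not available at all, the same splitting can be approximated with POSIX awk's match() and substr(). This is only a sketch, not part of the command above: split_ts is a hypothetical helper name, it assumes the status is the last whitespace-separated token on the line, and unlike the GNU awk command it already strips trailing blanks from each output line.

```shell
# split_ts: hypothetical helper, reads one line on stdin and prints
# one output line per HH:MM:SS timestamp, then the trailing status.
split_ts() {
    awk '{
        ts = "[0-9][0-9]:[0-9][0-9]:[0-9][0-9]"
        line = $0
        sub(/[ \t]+$/, "", line)

        # Assumption: the final blank-separated token is the status.
        status = ""
        if (match(line, /[ \t][^ \t]+$/)) {
            status = substr(line, RSTART + 1)
            line = substr(line, 1, RSTART - 1)
        }

        # Consume the line one timestamp at a time, so repeated
        # timestamps are handled by position rather than by value.
        while (match(line, ts)) {
            stamp = substr(line, RSTART, RLENGTH)
            rest = substr(line, RSTART + RLENGTH)
            if (match(rest, ts)) {
                body = substr(rest, 1, RSTART - 1)
                line = substr(rest, RSTART)
            } else {
                body = rest
                line = ""
            }
            sub(/[ \t]+$/, "", body)
            print stamp body
        }
        if (status != "") print status
    }'
}

printf '%s\n' "$x" | split_ts
```

Because the loop walks the line left to right instead of searching for each timestamp by value, two identical timestamps do not interfere with each other.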


If you want to strip trailing whitespace from the output lines, just change print pRT $0 to print pRT gensub(/\s+$/,"",1,$0)
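
Since the question wants the trailing 0 as a return status, one way to pick it off the split output is sketched below. The variable names out, status and messages are illustrative only, and the sample text in out simply stands in for whatever the awk command printed.

```shell
# out holds the already-split text (sample data for illustration).
out="15:23:46 Let's do this
15:23:46 It's easy: to do for you
15:23:48 You will have solution soon
0"

# The last line is the status; everything before it is the messages.
status=$(printf '%s\n' "$out" | tail -n 1)
messages=$(printf '%s\n' "$out" | sed '$d')

printf '%s\n' "$messages"

# Only accept the value as a status if it is purely numeric.
case $status in
    ''|*[!0-9]*) echo "unexpected status: $status" >&2 ;;
    *) echo "status=$status" ;;
esac
```

Checking that the status is numeric guards against a line that ends in ordinary text rather than a status code.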
