Extract tagged strings from a text file using Bash

Question

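I have files in the following style. They are parameterized configuration files: depending on the environment, the values between the # characters are replaced with real values from a database.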

ABC=#PARAMETER_1#:#PARAMETER_2#
SOMETHING_ELSE=#PARAMETER_1#
SOMETHING_NEW=#PARAMETER_2##PARAMETER_3#


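I want to extract the values between the hash/pound (#) characters from these files so that I can easily identify the required parameters. There is no standard column width or anything similar; the only rule is that anything between two # characters is replaced by a value from the database.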

This is the ideal cleaned, deduplicated output:

PARAMETER_1
PARAMETER_2
PARAMETER_3


I have seen this question, but the key difference is that in my case there can be any number of variables on a given line.

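I have tagged this question with Bash, but it does not have to be Bash; it could be perl or something else, as long as it runs from the command line on Unix.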

Answer 1

man*_*ork 5

As a first idea, awk:

awk -vRS='#[^#]+#' 'RT{gsub(/#/,"",RT);p[RT]=1}END{for(i in p)print i}' the_file


But this choice may depend on what else you have to do.

Explanation, as requested in the comments.

awk -vRS='#[^#]+#' '    # use /#[^#]+#/ as record separator
RT {                    # record terminator not empty?
  gsub(/#/,"",RT)       # remove the # parameter delimiter markup
  p[RT]=1               # store it as key in array p
}
END {                   # end of input?
  for (i in p) print i  # loop through array p and print each key
}' the_file
The essential part is the use of the RT (record terminator) built-in variable:

RT The record terminator. Gawk sets RT to the input text that matched the character or regular expression specified by RS.
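
Since RT is set by gawk, the command above is not expected to work in other awk implementations such as mawk or BSD awk. The following is only a sketch of one possible alternative, not part of the original answer: it uses the standard match() function with RSTART and RLENGTH to walk through each line. The function name extract_params and the sample file are hypothetical and exist only for illustration.

```shell
#!/bin/sh
# extract_params is a hypothetical helper name, not from the original answer.
# It prints each distinct #...# parameter name once, in order of first appearance.
extract_params() {
  awk '{
    line = $0
    # match() sets RSTART and RLENGTH to the leftmost #name# on the line
    while (match(line, /#[^#]+#/)) {
      # strip the leading and trailing # from the matched text
      name = substr(line, RSTART + 1, RLENGTH - 2)
      if (!(name in seen)) {
        seen[name] = 1
        print name
      }
      # continue scanning after the closing # of this match
      line = substr(line, RSTART + RLENGTH)
    }
  }' "$@"
}

# Sample input taken from the question, written to a temporary file.
sample=$(mktemp)
printf '%s\n' \
  'ABC=#PARAMETER_1#:#PARAMETER_2#' \
  'SOMETHING_ELSE=#PARAMETER_1#' \
  'SOMETHING_NEW=#PARAMETER_2##PARAMETER_3#' > "$sample"

extract_params "$sample"
rm -f "$sample"
```

Unlike the `for (i in p)` loop, whose iteration order is unspecified, this sketch prints names in the order they first appear; pipe the output through `sort` if sorted output is preferred.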
