What is a practical way to list every character used in a file? (Bash) (Regex)

Question

How can I turn this:

Johnny's penguin, (Tuxie), likes the following foods: French fries, and beef.

into this:

 abcdefghiklnoprstuwxyFJT',():.

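(These are all the characters used in the input.)

Note that the lowercase characters "jmqvz" do not appear in the input sentence, so they are not output.

The order doesn't really matter, but lowercase, then uppercase, then special characters would be preferred.

I'm sure I need sed/awk/etc. for this, but after extensive searching I haven't found anything similar.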

Answer 1

mur*_*uru 14

You can use a combination of sed and sort:

$ echo "Johnny's penguin, (Tuxie), likes the following foods: French fries, and beef." | 
>  sed 's/./&\n/g' | LC_COLLATE=C sort -u | tr -d '\n'
 '(),.:FJTabcdefghiklnoprstuwxy

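sort orders its input lexicographically, so see man 7 ascii for the order in which the characters will come out.

Explanation:

sed 's/./&\n/g' - appends a newline after every character, because sort (usually) sorts line by line
LC_COLLATE=C - sets the collation order to C, so characters sort by byte value (see "What does LC_ALL=C do?")
sort -u - sorts the input and prints only unique entries
tr -d '\n' - deletes all the extra newlines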
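Since the question is about a file rather than an echoed string, the same pipeline can be wrapped in a small function. This is only a sketch: the name unique_chars is made up for illustration, it assumes GNU sed (which understands \n in the replacement text), and it uses LC_ALL=C instead of LC_COLLATE=C because LC_ALL, when set in the environment, would otherwise override LC_COLLATE.

```shell
# unique_chars: hypothetical helper, not part of any standard tool.
# Prints each distinct character of its input once, in C-locale (byte) order.
# Reads the files given as arguments, or standard input if there are none.
# Assumes GNU sed, which accepts \n in the replacement part of s///.
unique_chars() {
    sed 's/./&\n/g' "$@" |   # put every character on its own line
        LC_ALL=C sort -u |   # sort by byte value and drop duplicates
        tr -d '\n'           # join everything back into one line
    echo                     # finish the output with a single newline
}

printf '%s\n' "Johnny's penguin, (Tuxie), likes the following foods: French fries, and beef." | unique_chars
# prints:  '(),.:FJTabcdefghiklnoprstuwxy   (the first character is a space)
```

Calling it with one or more file names instead of a pipe should give the combined set of characters across those files.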

If you only want to keep the visible characters:

$ echo "Johnny's penguin, (Tuxie), likes the following foods: French fries, and beef." | 
> tr -cd '[[:graph:]]' | sed 's/./&\n/g' | LC_COLLATE=C sort -u | tr -d '\n'

tr -cd '[[:graph:]]' deletes everything except visible characters, so spaces and newlines are dropped as well.

Answer 2

Nyk*_*kin 9

You can print each character of the file on its own line with fold -w1, then sort the output and remove duplicates with sort -u (or sort | uniq):

$ cat test 
Johnny's penguin, (Tuxie), likes the following foods: French fries, and beef.
$ fold -w1 test | sort -u

,
:
.
'
(
)
a
b
c
d
e
f
F
g
h
i
J
k
l
n
o
p
r
s
t
T
u
w
x
y

You can then join the result back into a single line, for example with paste -sd "" -:

$ fold -w1 test | sort -u | paste -sd "" -
 ,:.'()abcdefFghiJklnoprstTuwxy

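Neither ordering above is quite the one the question prefers (lowercase, then uppercase, then special characters). One possible way to get it, as a rough sketch for ASCII input, is to collect the distinct characters once and then print them in three passes. The function name ordered_chars and the three grep patterns are illustrative choices, not something taken from the answers.

```shell
# ordered_chars: hypothetical helper for illustration only.
# Prints the distinct characters of standard input as lowercase letters,
# then uppercase letters, then everything else. Assumes ASCII input.
ordered_chars() {
    # One distinct character per line, sorted by byte value.
    chars=$(fold -w1 | LC_ALL=C sort -u)
    # Emit the three groups in the preferred order; grep -x keeps only
    # lines that consist of exactly one character from the given class.
    for pattern in '[a-z]' '[A-Z]' '[^a-zA-Z]'; do
        printf '%s\n' "$chars" | LC_ALL=C grep -x "$pattern" | tr -d '\n'
    done
    echo
}

printf '%s\n' "Johnny's penguin, (Tuxie), likes the following foods: French fries, and beef." | ordered_chars
# prints: abcdefghiklnoprstuwxyFJT '(),.:
```

For a file, redirect it into the function, e.g. ordered_chars < test.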