Why does uniq -c output spaces instead of \t?

Question

I'm running uniq -c on some text files. Its output looks like this:

123(space)first word(tab)other things
  2(space)second word(tab)other things

....

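I need to extract the counts (123 and 2 above), but I can't figure out how: if I split the line on spaces, I get something like ['123', 'first', 'word(tab)other', 'things']. Why doesn't uniq separate the count with a tab?

And how can I extract the counts in the shell? (I ended up extracting them with Python, WTF)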

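Update: Sorry, I didn't describe my problem properly. I don't want to sum the counts. I just want to replace the (space) after the count with a (tab), without touching the spaces inside the words, because I still need that data afterwards. Like this: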

123(tab)first word(tab)other things
  2(tab)second word(tab)other things

Answer 1

Try this:

uniq -c | sed -r 's/^( *[^ ]+) +/\1\t/'
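
If your sed is not GNU sed, the -r flag and the \t in the replacement may not be understood. Here is a rough sketch of the same substitution that uses -E and a literal tab held in a shell variable instead. The three input lines are made up for illustration, and the width of the padding in front of the count depends on your uniq implementation.

```shell
# Hold a literal tab so the sed replacement does not rely on \t being understood.
tab=$(printf '\t')

# Made-up sample input: "a b<TAB>x y" twice, then "c d<TAB>z" once.
result=$(printf '%s\n' "a b${tab}x y" "a b${tab}x y" "c d${tab}z" |
  uniq -c |
  sed -E "s/^( *[^ ]+) +/\1${tab}/")

# Each line is now: padding, count, tab, then the original line untouched.
printf '%s\n' "$result"
```

The leading padding before the count is kept, exactly as in the one-liner above; only the space between the count and the text becomes a tab.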

Answer 2

Try:

uniq -c text.file | sed -e 's/ *//' -e 's/ /\t/'

This removes the spaces before the line count, then replaces only the first remaining space with a tab.
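
Once the count is followed by a tab, pulling out just the counts is a matter of cutting the first tab-separated field. A minimal sketch, assuming made-up sample input and using a literal tab variable in place of \t in case your sed does not expand it:

```shell
# Literal tab, used both in the sample data and in the sed replacement.
tab=$(printf '\t')

# Made-up sample input: "a b<TAB>x y" twice, then "c d<TAB>z" once.
counts=$(printf '%s\n' "a b${tab}x y" "a b${tab}x y" "c d${tab}z" |
  uniq -c |
  sed -e 's/^ *//' -e "s/ /${tab}/" |
  cut -f1)

# cut splits on tabs by default, so field 1 is the count.
printf '%s\n' "$counts"   # prints 2, then 1
```

The spaces inside "a b" and "c d" are left alone, because the second expression only replaces the first space on each line.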

To replace all spaces with tabs, use tr:

uniq -c text.file | tr ' ' '\t'

To replace each run of consecutive spaces with a single tab, use -s:

uniq -c text.file | tr -s ' ' '\t'
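
Note that tr cannot tell the space after the count from the spaces inside the words, so it probably isn't what the updated question wants. A quick sketch on one line shaped like the question's example (the line is typed by hand here, not real uniq output):

```shell
# One hand-written line: two leading spaces, count, space, text with a tab in it.
squeezed=$(printf '  2 second word\tother things\n' | tr -s ' ' '\t')

# Every space became a tab, and runs of tabs were squeezed to one:
# <TAB>2<TAB>second<TAB>word<TAB>other<TAB>things
printf '%s\n' "$squeezed"
```

"second word" and "other things" are now split across separate fields, and the line starts with a tab left over from the padding.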