BASH: Count identical lines

Sea*_*ton 4 bash awk sed duplicates line-count

I have a file that contains:

VoicemailButtonTest
VoicemailButtonTest
VoicemailButtonTest
VoicemailButtonTest
VoicemailButtonTest
VoiceMailConfig60CharsTest
VoicemailDefaultTypeTest
VoiceMailIconSelectableTest
VoiceMailIconSelectableTest
VoiceMailIconSelectableTest
VoiceMailIconSelectableTest
VoiceMailIconSelectableTest
VoicemailSettingsFromMessageModeScreenTest
VoicemailSettingsFromMessageModeScreenTest
VoicemailSettingsTest
VoicemailSettingsTest
VoicemailSettingsTest
VoicemailSettingsTest
VoicemailSettingsTest
VoicemailSettingsTest
VoicemailSettingsTest

How can I replace the repeated lines with a count:

VoicemailButtonTest (5)
VoiceMailConfig60CharsTest (1)
VoicemailDefaultTypeTest (1)
VoiceMailIconSelectableTest (5)
VoicemailSettingsFromMessageModeScreenTest (2)
VoicemailSettingsTest (7)

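I want to put these pairs into an associative array. I tried using "read" inside a "while" statement, but the array is lost afterwards. Here is my attempt: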

unset line
tests=$(cat file.log)
echo "$tests" | 
    while read l; do 
        if [ "$l" == "${line}" ]; then
            let cnt++;
        else
            echo "${line} (${cnt})"
            line=${l}
            cnt=1
        fi
        export run_suites
    done

小智 9

Assuming the output format does not have to match this exactly:

VoicemailButtonTest (5)
VoiceMailConfig60CharsTest (1)
VoicemailDefaultTypeTest (1)
VoiceMailIconSelectableTest (5)
VoicemailSettingsFromMessageModeScreenTest (2)
VoicemailSettingsTest (7)

you can use:

sort <input_file> | uniq -c
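
If the `uniq -c` route is preferred but the "name (count)" layout is still wanted, the count column can be moved to the end with a little awk. This is only a minimal sketch: it assumes the log file is passed as the first argument, and `count_lines` is a hypothetical helper name, not something from the question.

```shell
# Hypothetical helper: print each distinct line of a file as "line (count)".
count_lines() {
    # uniq -c prefixes every distinct line with its number of occurrences.
    # awk saves that number, strips the prefix, and appends it in parentheses.
    sort "$1" | uniq -c | awk '{ n = $1; sub(/^ *[0-9]+ /, ""); print $0 " (" n ")" }'
}
```

Called as `count_lines file.log`, this prints the lines in sorted order. Since `uniq -c` only merges adjacent lines, a file that is already grouped like the sample could skip `sort` and keep its original order.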

If you need the output to match what you posted exactly, you can use:

awk '{duplicates[$1]++} END{for (ind in duplicates) {print ind,"("duplicates[ind]")"}}' <input_file>
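
One caveat with the awk command above: `for (ind in duplicates)` visits keys in an unspecified order, and `$1` keeps only the first whitespace-separated field. If first-seen order and whole lines matter, a variant along these lines might help. It is a sketch under those assumptions, and `count_in_order` is a hypothetical name.

```shell
# Hypothetical helper: count whole lines and print them in first-seen order.
count_in_order() {
    awk '
        # Remember each distinct line the first time it appears.
        !($0 in count) { order[++n] = $0 }
        # Count every occurrence, keyed by the whole line.
        { count[$0]++ }
        # Print in the order the lines were first seen.
        END { for (i = 1; i <= n; i++) print order[i] " (" count[order[i]] ")" }
    ' "$1"
}
```

Unlike `sort | uniq -c`, this also counts repeats that are not adjacent in the file, without changing the order of the output.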

Edit: this was posted after anubhava's answer, but I am leaving it up because it adds the sort command (unless people suggest I delete it).
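
On the array getting lost in the question's attempt: by default bash runs each part of a pipeline in a subshell, so anything assigned inside `echo ... | while read ...` is gone once the loop ends. Redirecting the file into the loop keeps it in the current shell. The following is a sketch only, assuming bash 4 or newer (for `declare -A`); the names `counts`, `order`, `count_into_array` and `print_counts` are made up for illustration.

```shell
#!/usr/bin/env bash
# Assumes bash 4+; associative arrays are not available in plain sh.
declare -A counts=()   # line -> number of occurrences
order=()               # distinct lines in first-seen order

count_into_array() {
    local line
    # Redirect the file into the loop instead of piping, so the loop
    # runs in the current shell and the arrays survive after "done".
    while IFS= read -r line || [[ -n $line ]]; do
        # Skip empty lines: bash rejects an empty associative-array key.
        [[ -z $line ]] && continue
        if [[ -z ${counts[$line]+x} ]]; then
            order+=("$line")
        fi
        counts[$line]=$(( ${counts[$line]:-0} + 1 ))
    done < "$1"
}

print_counts() {
    local key
    for key in "${order[@]}"; do
        printf '%s (%d)\n' "$key" "${counts[$key]}"
    done
}
```

After `count_into_array file.log`, the pairs stay available in `counts` for later use, and `print_counts` prints them in the "name (count)" format.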