Sea*_*ton 4 bash awk sed duplicates line-count
I have a file that contains:
VoicemailButtonTest
VoicemailButtonTest
VoicemailButtonTest
VoicemailButtonTest
VoicemailButtonTest
VoiceMailConfig60CharsTest
VoicemailDefaultTypeTest
VoiceMailIconSelectableTest
VoiceMailIconSelectableTest
VoiceMailIconSelectableTest
VoiceMailIconSelectableTest
VoiceMailIconSelectableTest
VoicemailSettingsFromMessageModeScreenTest
VoicemailSettingsFromMessageModeScreenTest
VoicemailSettingsTest
VoicemailSettingsTest
VoicemailSettingsTest
VoicemailSettingsTest
VoicemailSettingsTest
VoicemailSettingsTest
VoicemailSettingsTest
How can I replace the duplicate lines with a count, like this:
VoicemailButtonTest (5)
VoiceMailConfig60CharsTest (1)
VoicemailDefaultTypeTest (1)
VoiceMailIconSelectableTest (5)
VoicemailSettingsFromMessageModeScreenTest (2)
VoicemailSettingsTest (7)
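I want to put these name/count pairs into an associative array. I tried using `read` inside a `while` loop, but the array is lost once the loop finishes. Here is my attempt: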
unset line
tests=$(cat file.log)
echo "$tests" |
while read l; do
    if [ "$l" == "${line}" ]; then
        let cnt++;
    else
        echo "${line} (${cnt})"
        line=${l}
        cnt=1
    fi
    export run_suites
done
小智 9
Assuming the output does not have to match this format exactly:
VoicemailButtonTest (5)
VoiceMailConfig60CharsTest (1)
VoicemailDefaultTypeTest (1)
VoiceMailIconSelectableTest (5)
VoicemailSettingsFromMessageModeScreenTest (2)
VoicemailSettingsTest (7)
you can use:
sort <input_file> | uniq -c
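For reference, `uniq -c` only collapses adjacent duplicates (hence the `sort`), and it prints the count before the line, e.g. `5 VoicemailButtonTest`. If the `name (count)` layout is wanted while staying with `sort | uniq -c`, something like the sketch below might do. `count_lines` is a made-up helper name, and the `awk` step assumes the lines contain no whitespace, as in the sample data.

```shell
# Sketch: reshape `uniq -c` output ("   5 name") into "name (5)".
# count_lines is a hypothetical helper, not part of the original answer.
count_lines() {
    # LC_ALL=C gives a byte-wise, locale-independent sort order.
    # $1 of the awk record is the count, $2 the (whitespace-free) line.
    LC_ALL=C sort "$1" | uniq -c | awk '{ print $2 " (" $1 ")" }'
}

# Example usage with a small throwaway sample file
sample=$(mktemp)
printf '%s\n' VoicemailButtonTest VoicemailButtonTest VoiceMailConfig60CharsTest > "$sample"
count_lines "$sample"
rm -f "$sample"
```

Note that the result comes out in sorted order, not in the order the names first appear in the file.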
If you need the output to match exactly what you posted, you can use:
awk '{duplicates[$1]++} END{for (ind in duplicates) {print ind,"("duplicates[ind]")"}}' <input_file>
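One caveat with the one-liner above: `for (ind in duplicates)` walks the array in an unspecified order, so the lines may not come out in the order shown in the question. A possible variant that keeps first-appearance order is sketched here. The function name `count_in_order` and the arrays `order` and `count` are illustrative choices, and it keys on the whole line (`$0`) instead of `$1`, which is an assumption on my part.

```shell
# Sketch: count duplicate lines and print them in first-seen order.
# count_in_order is a hypothetical helper name.
count_in_order() {
    awk '
        !($0 in count) { order[++n] = $0 }   # first time this line is seen: remember its position
        { count[$0]++ }                      # tally every occurrence
        END {
            for (i = 1; i <= n; i++)
                print order[i] " (" count[order[i]] ")"
        }
    ' "$1"
}

# Example usage with a small throwaway sample file
sample=$(mktemp)
printf '%s\n' VoicemailSettingsTest VoicemailButtonTest VoicemailSettingsTest > "$sample"
count_in_order "$sample"
rm -f "$sample"
```

Unlike the `sort | uniq -c` route, this does not need the duplicates to be adjacent.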
Edit: this was posted after anubhava's answer, but I'm leaving it up because it adds the `sort` command (unless people suggest I delete it).
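As for why the values get lost in the question's loop: in bash, `echo "$tests" | while read ...` runs the `while` loop in a subshell, so anything assigned inside it (plain variables or an associative array) is gone when the pipeline ends. Feeding the loop with a redirection keeps it in the current shell. Below is a rough sketch of the question's loop rewritten that way. `count_runs` and `groups` are names I made up, and like the original it assumes duplicates are adjacent. The same `done < file` pattern should also keep a `declare -A` array (bash 4+) populated after the loop.

```shell
# Sketch: count runs of identical adjacent lines without piping into the loop.
# count_runs and groups are hypothetical names used for illustration.
count_runs() {
    prev=
    cnt=0
    groups=0
    while IFS= read -r l; do
        if [ "$cnt" -gt 0 ] && [ "$l" = "$prev" ]; then
            cnt=$((cnt + 1))
        else
            # Print the finished group, but nothing before the first line is read
            [ "$cnt" -gt 0 ] && printf '%s (%s)\n' "$prev" "$cnt"
            prev=$l
            cnt=1
            groups=$((groups + 1))
        fi
    done < "$1"    # redirection, not a pipe: the loop runs in the current shell
    # Flush the last group, which the loop body itself never prints
    [ "$cnt" -gt 0 ] && printf '%s (%s)\n' "$prev" "$cnt"
    return 0
}

# Example usage with a small throwaway sample file
sample=$(mktemp)
printf '%s\n' VoicemailButtonTest VoicemailButtonTest VoicemailDefaultTypeTest > "$sample"
count_runs "$sample"
echo "groups seen: $groups"    # still set here, because no subshell was involved
rm -f "$sample"
```

Compared with the original attempt, this also avoids printing an empty ` ()` line at the start and does not drop the final group.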