1 unix bash shell awk duplicates
This is an excerpt of the jobs in my TWS database. My blocks start with:
/^ES2BVE1011 # EM5341CAI000 (jobname)
and end with:
/^ RECOVERY (can be STOP or CONTINUE)
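I have duplicate blocks and want to keep only the first occurrence of each to minimize load time. A block counts as a duplicate only if all of its lines are identical, because the same job name can appear in blocks whose other lines differ: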
ES2BVE1011 # EM5341CAI000
SCRIPTNAME "/s2ipgm/scripts/current/em5341cai000.sh -scai -eexp"
STREAMLOGON us2icai
DESCRIPTION "balance sheet errors"
UNIX TASKTYPE
SUCCOUTPUTCOND CONDSUCC "(RC = 0)"
RECOVERY STOP
ES2BVE1011 # ED5237CAI001
SCRIPTNAME "/s2ipgm/scripts/current/ed5237com001.sh -scai -eexp"
STREAMLOGON us2icai
DESCRIPTION "bb / ir account list"
UNIX TASKTYPE
SUCCOUTPUTCOND CONDSUCC "(RC = 0)"
RECOVERY STOP
ES2BVE1011 # CA4305CAI000
SCRIPTNAME "/s2ipgm/scripts/current/ea4305com000.sh -scai -ecpt"
STREAMLOGON us2icai
DESCRIPTION "list op. Fid."
UNIX TASKTYPE
SUCCOUTPUTCOND CONDSUCC "(RC = 0)"
RECOVERY STOP
ES2BVE1011 # CM4622CAI000
SCRIPTNAME "/s2ipgm/scripts/current/em4622com000.sh -scai -ecpt"
STREAMLOGON us2icai
DESCRIPTION "list of debits covered / not c"
UNIX TASKTYPE
SUCCOUTPUTCOND CONDSUCC "(RC = 0)"
RECOVERY STOP
ES2BVE1011 # ED5237CAI001
SCRIPTNAME "/s2ipgm/scripts/current/ed5237com001.sh -scai -eexp"
STREAMLOGON us2icai
DESCRIPTION "bb / ir account list"
UNIX TASKTYPE
SUCCOUTPUTCOND CONDSUCC "(RC = 0)"
RECOVERY STOP
ES2BVE1011 # CJ5326CAI000
SCRIPTNAME "/s2ipgm/scripts/current/ej5326cai000.sh -scai -ecpt"
STREAMLOGON us2icai
DESCRIPTION "daily report"
UNIX TASKTYPE
SUCCOUTPUTCOND CONDSUCC "(RC = 0)"
RECOVERY STOP
ES2BVE1011 # CA4305CAI000
SCRIPTNAME "/s2ipgm/scripts/current/ea4305com000.sh -scai -ecpt"
STREAMLOGON us2icai
DESCRIPTION "list op. Fid."
UNIX TASKTYPE
SUCCOUTPUTCOND CONDSUCC "(RC = 0)"
RECOVERY STOP
ES2BVE1011 # ED5237CAI001
SCRIPTNAME "/usr/bin/true"
STREAMLOGON us2ipgm
DESCRIPTION "bb / ir account list"
UNIX TASKTYPE
SUCCOUTPUTCOND CONDSUCC "(RC = 0)"
RECOVERY STOP
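$ cat tst.awk
# Append every input line to the block currently being collected.
{ block = block $0 ORS }

# A RECOVERY line closes the block (leading whitespace is optional here).
/^[[:space:]]*RECOVERY/ {
    # Print the block only the first time this exact text is seen.
    if ( !seen[block]++ ) {
        printf "%s", block
    }
    block = ""
}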
.
$ awk -f tst.awk file
ES2BVE1011 # EM5341CAI000
SCRIPTNAME "/s2ipgm/scripts/current/em5341cai000.sh -scai -eexp"
STREAMLOGON us2icai
DESCRIPTION "balance sheet errors"
UNIX TASKTYPE
SUCCOUTPUTCOND CONDSUCC "(RC = 0)"
RECOVERY STOP
ES2BVE1011 # ED5237CAI001
SCRIPTNAME "/s2ipgm/scripts/current/ed5237com001.sh -scai -eexp"
STREAMLOGON us2icai
DESCRIPTION "bb / ir account list"
UNIX TASKTYPE
SUCCOUTPUTCOND CONDSUCC "(RC = 0)"
RECOVERY STOP
ES2BVE1011 # CA4305CAI000
SCRIPTNAME "/s2ipgm/scripts/current/ea4305com000.sh -scai -ecpt"
STREAMLOGON us2icai
DESCRIPTION "list op. Fid."
UNIX TASKTYPE
SUCCOUTPUTCOND CONDSUCC "(RC = 0)"
RECOVERY STOP
ES2BVE1011 # CM4622CAI000
SCRIPTNAME "/s2ipgm/scripts/current/em4622com000.sh -scai -ecpt"
STREAMLOGON us2icai
DESCRIPTION "list of debits covered / not c"
UNIX TASKTYPE
SUCCOUTPUTCOND CONDSUCC "(RC = 0)"
RECOVERY STOP
ES2BVE1011 # CJ5326CAI000
SCRIPTNAME "/s2ipgm/scripts/current/ej5326cai000.sh -scai -ecpt"
STREAMLOGON us2icai
DESCRIPTION "daily report"
UNIX TASKTYPE
SUCCOUTPUTCOND CONDSUCC "(RC = 0)"
RECOVERY STOP
ES2BVE1011 # ED5237CAI001
SCRIPTNAME "/usr/bin/true"
STREAMLOGON us2ipgm
DESCRIPTION "bb / ir account list"
UNIX TASKTYPE
SUCCOUTPUTCOND CONDSUCC "(RC = 0)"
RECOVERY STOP
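To check the "same job name, different lines" case in isolation, here is a minimal self-contained sketch. The job name HOST1#DEMO1, the script paths, and the wrapper function dedup_blocks are hypothetical and exist only for this illustration. It assumes every block ends with a RECOVERY line, with or without leading whitespace.

```shell
#!/bin/sh
# dedup_blocks is a hypothetical wrapper name; it reads the named files, or stdin if none.
dedup_blocks() {
    awk '
        { block = block $0 ORS }          # collect the lines of the current block
        /^[[:space:]]*RECOVERY/ {         # a RECOVERY line closes the block
            if ( !seen[block]++ )         # first time this exact block text appears
                printf "%s", block
            block = ""
        }
    ' "$@"
}

# Made-up sample: blocks 1 and 2 are identical, block 3 shares the
# job name but differs in its SCRIPTNAME and RECOVERY lines.
result=$(dedup_blocks <<'EOF'
HOST1#DEMO1
 SCRIPTNAME "/bin/a.sh"
 RECOVERY STOP
HOST1#DEMO1
 SCRIPTNAME "/bin/a.sh"
 RECOVERY STOP
HOST1#DEMO1
 SCRIPTNAME "/bin/b.sh"
 RECOVERY CONTINUE
EOF
)
printf '%s\n' "$result"
```

Because the whole block text is the array key, the exact repeat is dropped while the third block survives even though its job name matches. Any difference counts, including trailing spaces, so blocks that differ only in whitespace are treated as distinct.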