Lan*_*nes 33 awk text-processing
echo -e 'one two three\nfour five six\nseven eight nine'
one two three
four five six
seven eight nine
How can I do some "magic" to get this output?
three
six
nine
Update: I don't need this done in one specific way. I need a general solution that works no matter how many columns a row has, e.g. awk always printing the last column.
Sea*_* C. 66
Try:
echo -e 'one two three\nfour five six\nseven eight nine' | awk '{print $NF}'
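A short illustration of why this works, offered as a sketch rather than as part of the original answer: awk sets NF to the number of fields on the current line, so $NF always refers to the last field, however many columns there are. The function name last_field and the NF guard that skips blank lines are additions for this example.

```shell
# last_field is a made-up helper name for this example.
# NF holds the field count of the current line, so $NF is the last field.
# The bare NF pattern skips blank lines (NF is 0 there).
last_field() {
    awk 'NF { print $NF }'
}

# Rows of different widths, plus one blank line.
printf 'one two three\nfour five\n\nsix\n' | last_field
```

Without the NF guard, a blank input line produces a blank output line instead of being skipped.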
bah*_*mat 19
It's easier than you think.
$ echo one two three | awk '{print $NF}'
three
ken*_*orb 15
Try grep (shorter/simpler, but about 3x slower than awk because it uses a regular expression):
grep -o '\S\+$' <(echo -e '... seven eight nine')
Or ex (even slower, but it prints the whole buffer when it finishes, which is more useful when the result needs sorting or in-place editing):
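Note that \S and \+ are not part of POSIX regular expressions, so the pattern above depends on the grep implementation. A sketch of an equivalent using an extended regex with a bracket expression follows. The function name last_word_grep is made up for this example, and -o itself is not in POSIX, although both GNU and BSD grep support it.

```shell
# last_word_grep is a made-up helper name for this example.
# -o prints only the matched part; -E enables extended regex syntax.
# [^[:space:]]+$ matches the run of non-whitespace characters at end of line.
last_word_grep() {
    grep -oE '[^[:space:]]+$'
}

printf 'one two three\nfour five six\n' | last_word_grep
```

Unlike awk '{print $NF}', this prints nothing for a line that ends in whitespace, because the pattern has to match right at the end of the line.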
ex -s +'%s/^.*\s//g' -c'%p|q!' <(echo -e '... seven eight nine')
ex +'%norm $Bd0' -sc'%p|q!' infile
To make the change in place, replace -sc'%p|q!' with -scwq.
Or bash:
while read -r -a arr; do echo "${arr[-1]}"; done < someinput
Given a 1GB file generated via:
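Two things to watch with a loop like this: negative array subscripts need a reasonably recent bash (4.2 or later, as far as I know), and a blank line leaves the array empty. A more defensive sketch of the same idea follows; the function name last_word_bash is made up for this example.

```shell
# last_word_bash is a made-up name for this example (bash only: uses read -a).
last_word_bash() {
    local words n
    # -r keeps backslashes literal; -a splits the line straight into the
    # array "words", so no unquoted expansion (and no globbing) is involved.
    while read -r -a words; do
        n=${#words[@]}
        # Skip blank lines, where the array is empty.
        if [ "$n" -gt 0 ]; then
            # Index from the length instead of using a negative subscript.
            printf '%s\n' "${words[n-1]}"
        fi
    done
}

printf 'one two three\nfour *\n\nseven eight nine\n' | last_word_bash
```

The literal * in the second line is printed as-is rather than being expanded to file names.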
$ hexdump -C /dev/urandom | rev | head -c1G | pv > datafile
I measured parsing times (ran each command about 3 times and took the lowest; tested on a MacBook Pro running OS X):
Using awk:
$ time awk '{print $NF}' datafile > /dev/null
real 0m12.124s
user 0m10.704s
sys 0m0.709s
Using grep:
$ time grep -o '\S\+$' datafile > /dev/null
real 0m36.731s
user 0m36.244s
sys 0m0.401s
$ time grep -o '\S*$' datafile > /dev/null
real 0m40.865s
user 0m39.756s
sys 0m0.415s
Using perl:
$ time perl -lane 'print $F[-1]' datafile > /dev/null
real 0m48.292s
user 0m47.601s
sys 0m0.396s
Using rev + cut:
$ time (rev|cut -d' ' -f1|rev) < datafile > /dev/null
$ time rev datafile | cut -d' ' -f1 | rev > /dev/null
real 1m10.342s
user 1m19.940s
sys 0m1.263s
Using ex:
$ time ex +'%norm $Bd0_' -sc'%p|q!' datafile > /dev/null
real 3m47.332s
user 3m42.037s
sys 0m2.617s
$ time ex +'%norm $Bd0' -sc'%p|q!' datafile > /dev/null
real 4m1.527s
user 3m44.219s
sys 0m6.164s
$ time ex +'%s/^.*\s//g' -sc'%p|q!' datafile > /dev/null
real 4m16.717s
user 4m5.334s
sys 0m5.076s
Using bash:
$ time while read line; do arr=($line); echo ${arr[-1]}; done < datafile > /dev/null
real 9m42.807s
user 8m12.553s
sys 1m1.955s
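Before reading much into timings like these, it may be worth confirming on a small sample that the commands being compared really produce the same output. A minimal sketch follows, checking only the awk and grep variants. The helper name outputs_agree and the sample data are made up for this example, and the \S pattern assumes a grep that supports it, as GNU grep does.

```shell
# outputs_agree is a hypothetical helper: it returns 0 when awk and grep
# extract the same last column from the file given as $1.
outputs_agree() {
    awk_out=$(awk '{print $NF}' "$1")
    grep_out=$(grep -o '\S\+$' "$1")
    [ "$awk_out" = "$grep_out" ]
}

sample=$(mktemp)   # small throwaway sample, not the 1GB datafile
printf 'one two three\nfour five six\nseven eight nine\n' > "$sample"
if outputs_agree "$sample"; then
    echo "same output"
else
    echo "outputs differ"
fi
rm -f "$sample"
```

The same comparison can be extended to the other commands by capturing their output in the same way.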
It can even be done with just 'bash', without 'sed', 'awk' or 'perl':
echo -e 'one two three\nfour five six\nseven eight nine' |
while IFS=" " read -r -a line; do
    nb=${#line[@]}
    echo "${line[$((nb - 1))]}"
done
It can also be done with 'sed':
echo -e 'one two three\nfour five six\nseven eight nine' | sed -e 's/^.* \([^ ]*\)$/\1/'
Update:
Or more simply:
echo -e 'one two three\nfour five six\nseven eight nine' | sed -e 's/^.* //'
Or using cut:
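Both sed commands assume that columns are separated by single spaces and that lines have no trailing whitespace; with a trailing space they print an empty line. A sketch that also copes with tabs and trailing blanks, using POSIX character classes, follows. The function name last_word_sed is made up for this example.

```shell
# last_word_sed is a made-up helper name for this example.
last_word_sed() {
    # 1st expression: drop any trailing whitespace.
    # 2nd expression: the greedy ^.* removes everything up to and including
    # the last remaining whitespace character, leaving only the last word.
    sed -e 's/[[:space:]]*$//' -e 's/^.*[[:space:]]//'
}

printf 'one two three  \nfour\tfive\nsix\n' | last_word_sed
```

A line with a single word, such as six, passes through unchanged because the second expression finds no whitespace to match.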
echo -e 'one two three\nfour five six\nseven eight nine' | cut -f 3 -d' '
This doesn't meet the "general solution" requirement, though. Running rev twice gets around that:
echo -e 'one two three\nfour five six\nseven eight nine' | rev | cut -f 1 -d' ' | rev