I have a log file. Each record has the following format:
<client host> - - [<timestamp with timezone>] <HTTP-request line (type, URL, version)> <Code of HTTP-response> <Number of sent bytes or '-', if the response is empty> <Referer string ('-' means direct request without referer)> <Client info (browser, application)>
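For reference, here is a minimal sketch of how awk's default whitespace splitting maps onto this format, assuming each record sits on a single line and the client host contains no spaces. Under that assumption `$4` and `$5` hold the timestamp and timezone, `$6` the request method (with its leading quote), `$7` the URL, `$9` the response code, and `$10` the byte count. The function name `show_fields` is hypothetical and only used for illustration.

```shell
# Hypothetical helper: print the fields of interest for each log record.
# $1 = path to the access log. Relies on awk's default whitespace splitting.
show_fields() {
    awk '{ printf "host=%s url=%s status=%s bytes=%s\n", $1, $7, $9, $10 }' "$1"
}
```

Calling `show_fields log.txt` prints one summary line per record, which is a quick way to check that `$7` and `$9` really are the URL and the status code before filtering on them.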
For example, these 5 lines:
20158147070.user.veloxzone.com.br - - [29/Oct/2006:06:59:18 -0700] "GET /example/.comments HTTP/1.1" 404 293 "http://www.example.org/example/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)"
20158147070.user.veloxzone.com.br - - [29/Oct/2006:06:59:18 -0700] "GET /example/.comments HTTP/1.1" 404 293 "http://www.example.org/example/" "Mozilla/4.0 (compatible; MSIE 6.0; Windows NT 5.1; SV1)" …
I run this command:
gawk '$9=="404" || $9=="403"' log.txt | gawk '{print $7}' | sort -k7 | uniq -c | sort -nr
Output:
28 /example/.comments
9 /example/example.atom.xml
8 /example/When/200x/2003/04/10/-big/Concorde.jpg
7 /example/When/200x/2006/03/30/-big/IMG_4613.jpg
6 /example/When/200x/2003/07/25/-big/guild-2.jpg
5 /example/Patti-Smith.png
How can I print the uniq count after the URL instead of before it, and number each line? Like this:
1. /example/.comments - 28
2. /example/example.atom.xml - 9
3. /example/When/200x/2003/04/10/-big/Concorde.jpg - 8
4. /example/When/200x/2006/03/30/-big/IMG_4613.jpg - 7
5. /example/When/200x/2003/07/25/-big/guild-2.jpg - 6
6. /example/Patti-Smith.png - 5
7. /example/IMGP4289-2.png - 5
8. /example/IMGP4287.png - 5
9. /example/Image-Search-Mystery.png - 5
10. /example/Horses.png - 5
11. /example/When/200x/2004/02/27/-big/Unreal.png - 4
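One possible approach, as a sketch rather than a definitive answer: keep the counting pipeline and append a final awk stage that swaps the two columns and uses `NR` as the line number. The function name `rank_error_urls` is hypothetical. Plain `awk` is used here; `gawk` accepts the same programs. The `-k7` key is dropped from the first `sort` because only one field remains after `print $7`, so the sort already compares whole lines.

```shell
# Hypothetical helper: rank URLs that returned 403 or 404.
# $1 = path to the access log (log.txt in the question).
rank_error_urls() {
    # Keep only the URL ($7) of records whose status ($9) is 404 or 403.
    awk '$9 == "404" || $9 == "403" { print $7 }' "$1" |
        sort |        # group identical URLs together for uniq
        uniq -c |     # prefix each URL with its count
        sort -nr |    # highest count first
        # uniq -c prints "<count> <url>"; reorder and number the lines.
        awk '{ printf "%d. %s - %d\n", NR, $2, $1 }'
}
```

Calling `rank_error_urls log.txt` should print lines in the `N. URL - count` shape shown above. The order of URLs that share the same count depends on `sort` and the current locale, so it may differ from the listing in the question.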