I'm trying to separate bot access logs from human access logs, so I'm using the following configuration:
http {
    ....
    map $http_user_agent $ifbot {
        default 0;
        "~*rogerbot" 3;
        "~*ChinasoSpider" 3;
        "~*Yahoo" 1;
        "~*Bot" 1;
        "~*Spider" 1;
        "~*archive" 1;
        "~*search" 1;
"~*Yahoo" 1;
"~Mediapartners-Google" 1;
"~*bingbot" 1;
"~*YandexBot" 1;
"~*Feedly" 2;
"~*Superfeedr" 2;
"~*QuiteRSS" 2;
"~*g2reader" 2;
"~*Digg" 2;
"~*trendiction" 3;
"~*AhrefsBot" 3;
"~*curl" 3;
"~*Ruby" 3;
"~*Player" 3;
"~*Go\ http\ package" 3;
"~*Lynx" 3;
"~*Sleuth" 3;
"~*Python" 3;
"~*Wget" 3;
"~*perl" 3;
"~*httrack" 3;
"~*JikeSpider" 3;
"~*PHP" 3;
"~*WebIndex" 3;
"~*magpie-crawler" 3;
"~*JUC" 3;
"~*Scrapy" 3; …Run Code Online (Sandbox Code Playgroud)