How to cut a range of lines defined by variables

che*_*som 3 shell awk cut range

I have this Python crawler output:

[+] Site to crawl: http://www.example.com
[+] Start time: 2020-05-24 07:21:27.169033
[+] Output file: www.example.com.crawler

[+] Crawling
   [-] http://www.example.com
   [-] http://www.example.com/
   [-] http://www.example.com/icons/ubuntu-logo.png
   [-] http://www.example.com/manual
    [i] 404 Not Found
[+] Total urls crawled: 4

[+] Directories found:
   [-] http://www.example.com/icons/
[+] Total directories: 1

[+] Directory with indexing

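I want to use awk or any other tool to extract the lines between "Crawling" and "Total urls crawled". Basically, I want to assign the line number (NR) of the first keyword, "Crawling", to one variable and the NR of the second delimiter, "Total urls crawled", to another, and then cut the range between the two. I tried something like this: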

awk 'NR>$(Crawling) && NR<$(urls)' file.txt

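But nothing really worked. The best I got was a cut from line Crawling+1 to the end of the file, which doesn't help. So how can I do this, and how do I cut a range of lines with awk using variables?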


Rav*_*h13 5

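If I understand your requirement correctly, you want to pass shell variables into the awk code and use them as search strings. In that case, try the following.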

awk -v crawl="Crawling" -v url="Total urls crawled" '
$0 ~ url{
  found=""
  next
}
$0 ~ crawl{
  found=1
  next
}
found
'  Input_file

Explanation: a detailed, line-by-line explanation of the above.

awk -v crawl="Crawling" -v url="Total urls crawled" '   ##Starting awk program and setting crawl and url values of variables here.
$0 ~ url{                      ##Checking if line is matched to url variable then do following.
  found=""                     ##Nullify the variable found here.
  next                         ##next will skip further statements from here.
}
$0 ~ crawl{                    ##Checking if line is matched to crawl variable then do following.
  found=1                      ##Setting found value to 1 here.
  next                         ##next will skip further statements from here.
}
found                          ##Checking condition if found is SET(NOT NULL) then print current line.
'  Input_file                  ##Mentioning Input_file name here.
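
The question also asks about doing this by line number. As a minimal sketch of that variant (not part of the answer above): the attempt `NR>$(Crawling)` cannot work because the single quotes stop the shell from expanding anything, and inside awk `$(...)` is a field reference, not a variable. One option is to look up the two line numbers first and pass them in with `-v`. The function name `between_markers` and the temporary demo file are hypothetical, and the sketch assumes both markers occur in the file, with the start marker first.

```shell
#!/bin/sh
# Hypothetical helper: print the lines strictly between the first line
# matching $1 and the first line matching $2 in file $3.
# Assumes both patterns occur in the file, start before end.
between_markers() {
    # Line number (NR) of the first match for each marker.
    start=$(awk -v p="$1" '$0 ~ p {print NR; exit}' "$3")
    end=$(awk -v p="$2" '$0 ~ p {print NR; exit}' "$3")
    # Inside awk, s and e are ordinary awk variables (no $ prefix).
    awk -v s="$start" -v e="$end" 'NR > s && NR < e' "$3"
}

# Demo input: the relevant part of the crawler output from the question.
file=$(mktemp)
cat > "$file" <<'EOF'
[+] Crawling
   [-] http://www.example.com
   [-] http://www.example.com/
   [-] http://www.example.com/icons/ubuntu-logo.png
   [-] http://www.example.com/manual
    [i] 404 Not Found
[+] Total urls crawled: 4
EOF

between_markers "Crawling" "Total urls crawled" "$file"
rm -f "$file"
```

On the demo input this prints the five lines between the two markers. It reads the file three times, so the single-pass flag approach above is likely the better choice for large files; the line-number version is mainly useful when the numbers are needed for something else as well.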