How to cut a range of lines defined by variables

Question


I have this output from a Python crawler:

[+] Site to crawl: http://www.example.com
[+] Start time: 2020-05-24 07:21:27.169033
[+] Output file: www.example.com.crawler

[+] Crawling
   [-] http://www.example.com
   [-] http://www.example.com/
   [-] http://www.example.com/icons/ubuntu-logo.png
   [-] http://www.example.com/manual
    [i] 404 Not Found
[+] Total urls crawled: 4

[+] Directories found:
   [-] http://www.example.com/icons/
[+] Total directories: 1

[+] Directory with indexing


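I want to use awk or any other tool to cut out the lines between "Crawling" and "Total urls crawled". Basically, I want to assign the line number (NR) of the first keyword, "Crawling", to one variable and the NR of the second delimiter, "Total urls crawled", to a second variable, and then cut the range between the two. I tried something like this: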

awk 'NR>$(Crawling) && NR<$(urls)' file.txt


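But nothing really worked. The best I got was a cut from line Crawling+1 to the end of the file, which doesn't help. So how can I do this, and how do I cut a range of lines using awk with variables?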

awk

Answer 1

Rav*_*h13 5

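If I understood your requirement correctly, you want to pass shell variables into the awk code and use them as search strings. In that case, try the following.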

awk -v crawl="Crawling" -v url="Total urls crawled" '
$0 ~ url{
  found=""
  next
}
$0 ~ crawl{
  found=1
  next
}
found
'  Input_file


Explanation: a detailed, line-by-line explanation of the code above.

awk -v crawl="Crawling" -v url="Total urls crawled" '   ##Starting awk program and setting crawl and url values of variables here.
$0 ~ url{                      ##Checking if line is matched to url variable then do following.
  found=""                     ##Nullify the variable found here.
  next                         ##next will skip further statements from here.
}
$0 ~ crawl{                    ##Checking if line is matched to crawl variable then do following.
  found=1                      ##Setting found value to 1 here.
  next                         ##next will skip further statements from here.
}
found                          ##Checking condition if found is SET(NOT NULL) then print current line.
'  Input_file                  ##Mentioning Input_file name here.
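
For completeness, the line-number approach described in the question can also be made to work. In the original attempt, the shell never expands `$(Crawling)` because it sits inside single quotes, and awk itself reads it as a field reference on an unset awk variable (that is, `$0`), not as a line number. Below is a minimal sketch, assuming a hypothetical sample file `crawl_sample.txt` holding a shortened copy of the crawler output; the names `start`, `end`, `s` and `e` are illustrative only and not part of the original answer.

```shell
#!/bin/sh
# Hypothetical sample file: a shortened copy of the crawler output from the question.
cat > crawl_sample.txt <<'EOF'
[+] Site to crawl: http://www.example.com

[+] Crawling
   [-] http://www.example.com
   [-] http://www.example.com/manual
    [i] 404 Not Found
[+] Total urls crawled: 2

[+] Directories found:
EOF

# Record the line number (NR) of the first line matching each marker, then stop reading.
start=$(awk '/Crawling/ { print NR; exit }' crawl_sample.txt)
end=$(awk '/Total urls crawled/ { print NR; exit }' crawl_sample.txt)

# Shell variables are not expanded inside single quotes, so pass them in with -v.
# Print only the lines strictly between the two markers.
awk -v s="$start" -v e="$end" 'NR > s && NR < e' crawl_sample.txt
```

This reads the file three times, whereas the flag-based answer above does the same job in a single pass, so the sketch is mainly useful when the line numbers themselves are needed elsewhere in the script.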


Archived:	5 years, 5 months ago
Views:	74
Last activity:	5 years, 5 months ago