Remove everything except URLs in Notepad++

Question

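After manually scraping Google search results with a legitimate Chrome extension, I have the following information (shown here for just two search results):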

The History Teacher (@THTjournal) | Twitter
https://twitter.com/thtjournal  https://twitter.com/thtjournal
Vertaal deze pagina https://translate.google.nl/translate?hl=nl&sl=en&u=https://twitter.com/thtjournal&prev=search
Jim Carroll (@jcarrollhistory) | Twitter
https://twitter.com/jcarrollhistory https://twitter.com/jcarrollhistory
Vertaal deze pagina https://translate.google.nl/translate?hl=nl&sl=en&u=https://twitter.com/jcarrollhistory&prev=search

My goal is to create a list containing only the Twitter URLs, like this:

https://twitter.com/thtjournal

https://twitter.com/jcarrollhistory

I have Notepad++, so how can I use it to get a list containing only the URLs? Everything else should be removed.

Answer 1

Tot*_*oto 3

Ctrl+H
Find what: ^.*?(\bhttps://twitter\.com/\w+)?.*$
Replace with: (?1$1:)
Check Wrap around
Check Regular expression
Do NOT check . matches newline
Replace all

Explanation:

^                           # beginning of line
  .*?                       # 0 or more of any character except newline, not greedy
  (                         # start group 1
    \b                      # word boundary
    https://twitter\.com/   # literally
    \w+                     # 1 or more word characters
  )?                        # end group 1, optional
  .*                        # 0 or more of any character except newline
$                           # end of line

Replacement:

(?1$1:)         # if group 1 exists, then use it as replacement, else replace with nothing
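
For anyone who wants to try the same find-and-replace outside Notepad++, here is a minimal Python sketch. Python's `re` module has no `(?1$1:)` conditional replacement, so a callback stands in for it. The names `keep_group_or_empty` and `replace_all` are made up for this illustration and are not part of the answer above.

```python
import re

# Same pattern as the "Find what" field. re.MULTILINE makes ^ and $ work per line.
# Note: because .*? is lazy and the group is optional, group 1 is only captured
# when the URL sits at the very start of the line.
PATTERN = re.compile(r"^.*?(\bhttps://twitter\.com/\w+)?.*$", re.MULTILINE)


def keep_group_or_empty(match):
    """Emulate (?1$1:): emit group 1 if it took part in the match, else nothing."""
    return match.group(1) or ""


def replace_all(text):
    """Apply the replacement to every line of text."""
    return PATTERN.sub(keep_group_or_empty, text)


sample = "\n".join([
    "Jim Carroll (@jcarrollhistory) | Twitter",
    "https://twitter.com/jcarrollhistory https://twitter.com/jcarrollhistory",
    "Vertaal deze pagina https://translate.google.nl/translate?hl=nl&sl=en"
    "&u=https://twitter.com/jcarrollhistory&prev=search",
])

# Lines without a leading Twitter URL become empty; the URL line keeps one URL.
print(replace_all(sample))
```

The "Vertaal deze pagina" line also contains a Twitter URL, but it is not at the start of the line, so that line is emptied instead of producing a duplicate.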

Result for the given example:

https://twitter.com/thtjournal


https://twitter.com/jcarrollhistory
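
The result above still has blank lines where the non-matching lines used to be. In Notepad++, Edit > Line Operations > Remove Empty Lines should clean those up. As a rough cross-check, the Python sketch below (an illustration only; `extract_urls` is a hypothetical helper name) applies the same pattern line by line and keeps just the captured URLs.

```python
import re

# Same pattern as the "Find what" field, applied to one line at a time.
PATTERN = re.compile(r"^.*?(\bhttps://twitter\.com/\w+)?.*$")


def extract_urls(text):
    """Return the URL captured by group 1 for each line where it matched."""
    urls = []
    for line in text.splitlines():
        match = PATTERN.match(line)
        if match and match.group(1):
            urls.append(match.group(1))
    return urls


sample = "\n".join([
    "The History Teacher (@THTjournal) | Twitter",
    "https://twitter.com/thtjournal  https://twitter.com/thtjournal",
    "Vertaal deze pagina https://translate.google.nl/translate?hl=nl&sl=en"
    "&u=https://twitter.com/thtjournal&prev=search",
    "Jim Carroll (@jcarrollhistory) | Twitter",
    "https://twitter.com/jcarrollhistory https://twitter.com/jcarrollhistory",
    "Vertaal deze pagina https://translate.google.nl/translate?hl=nl&sl=en"
    "&u=https://twitter.com/jcarrollhistory&prev=search",
])

print("\n".join(extract_urls(sample)))
# https://twitter.com/thtjournal
# https://twitter.com/jcarrollhistory
```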
