How do I retrieve a URL from a string in the shell?

Question


use*_*149 3 regex string bash shell grep

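I want to extract a URL from a string using a shell/bash script. If the string contains more than one URL, only the first one should be returned.

Below are some examples of input and output strings. I assume I need some kind of regular expression, but I'm not very familiar with how to do that in bash/shell.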

Input: Take a look at this site: http://www.google.com/ and you'll find your answer
Output: http://www.google.com/


Input: http://www.google.com
Output: http://www.google.com


Input: Check out http://www.bing.com and http://www.google.com
Output: http://www.bing.com


Input: Grettings, visit <http://www.mywebsite.com> today!
Output: http://www.mywebsite.com


Answer 1

Ken*_*ent 5

Try this:

grep -Eo 'http://[^ >]+' yourFile|head -1


For example:

kent$  echo "Check out http://www.bing.com and http://www.google.com"|grep -Eo 'http://[^ >]+'|head -1 
http://www.bing.com
kent$  echo "Grettings, visit <http://www.mywebsite.com> today"|grep -Eo 'http://[^ >]+'|head -1 
http://www.mywebsite.com


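If you replace `+` with `\+`, you don't need `-E`. `[^ >]+` matches one or more characters that are neither a space nor `>`. If the URL may be immediately followed by a `<tab>`, you may want to add a `\t`, or, if your grep supports `-P`, use `-P 'http://[^\s>]+'`. You could also change the pattern to `https?://...`, since there may be `https://` URLs. (2 upvotes)
Thanks! That means I understood that part correctly (though I had a hard time explaining it to myself, haha; your explanation is very good). I probably won't run into tabs, since the text is printed from irssi, but thanks for pointing that out. Thanks for the https tip, too. I was about to do something like http|https, but of course yours is much simpler. Thank you very much for the solution and the explanation; I'll make sure I understand whatever I use before using it! (2 upvotes)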
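
The suggestions from the comments above (accepting `https://` and stopping at tabs as well as spaces) could be combined roughly as in the sketch below. The function name `first_url` is hypothetical and not part of the original answer, and the POSIX class `[[:space:]]` is swapped in here as one way to cover tabs without `-P`. It assumes a grep that supports `-o`, as GNU and BSD grep do.

```shell
#!/usr/bin/env bash
# first_url is a hypothetical helper name, not from the original answer.
# It prints the first http:// or https:// URL found in its first argument.
first_url() {
  # https? matches "http" optionally followed by "s".
  # [^[:space:]>]+ consumes characters up to the next whitespace
  # (space, tab, newline) or '>'.
  # -o prints only the matched text; head -n 1 keeps the first match.
  printf '%s\n' "$1" | grep -Eo 'https?://[^[:space:]>]+' | head -n 1
}

first_url "Check out http://www.bing.com and https://www.google.com"
first_url "Greetings, visit <http://www.mywebsite.com> today!"
```

When no URL is present the function prints nothing, so the caller can test for an empty result.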
