How to get a URL from a file using a shell script

Question


I have a file that contains a URL. I am trying to get the URL out of that file using a shell script.

In the file, the URL looks like this:

('URL', 'http://url.com');


I tried the following:

cat file.php | grep 'URL' | awk '{ print $2 }'


It gives this output:

'http://url.com');


But I only need url.com, stored in a variable in my shell script. How can I do that?

Answer 1

ter*_*don 14

You can do it all with a simple grep:

grep -oP "http://\K[^']+" file.php


From man grep:

   -P, --perl-regexp
          Interpret  PATTERN  as  a  Perl  regular  expression  (PCRE, see
          below).  This is highly experimental and grep  -P  may  warn  of
          unimplemented features.
   -o, --only-matching
          Print  only  the  matched  (non-empty) parts of a matching line,
          with each such part on a separate output line.


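The trick is to use \K, which in Perl regular expressions means "discard everything matched to the left of the \K". So the regular expression looks for strings starting with http:// (which is then discarded because of the \K), followed by as many non-' characters as possible. Combined with -o, this means that only the part of the URL after http:// is printed.

You could also do it directly in Perl: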
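Since the question asks for the value in a variable, the grep call can be wrapped in a command substitution. This is only a minimal sketch: the temporary file is a hypothetical stand-in for file.php, and -P is assumed to be available (GNU grep built with PCRE support).

```shell
# Hypothetical sample file mirroring the line shown in the question.
tmpfile=$(mktemp)
printf '%s\n' "('URL', 'http://url.com');" > "$tmpfile"

# Command substitution stores grep's output in a variable.
# -o prints only the match; \K drops the "http://" prefix from it.
url=$(grep -oP "http://\K[^']+" "$tmpfile")

printf '%s\n' "$url"
rm -f "$tmpfile"
```

If the file contains several matching lines, the variable will hold all of them, one per line; GNU grep's -m 1 stops after the first matching line.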

perl -ne "print if s/.*http:\/\/(.+)\'.*/\$1/" file.php


Answer 2

Fra*_*que 11

Something like this?

grep 'URL' file.php | rev | cut -d "'" -f 2 | rev


Or

grep 'URL' file.php | cut -d "'" -f 4 | sed s/'http:\/\/'/''/g


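to strip the http://.

Or: `cat file.php | grep 'URL' | cut -d "'" -f 4`. (3 upvotes)
However, this is very inefficient: solution 1 invokes 5 processes through 4 pipes, and solution 2 invokes 3 processes through 2 pipes, including 2 regular expressions. All of this can be done in the Bash shell without any pipes, processes, or dependencies. (2 upvotes)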
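The commenter does not show the pipe-free version, so the following is only one possible sketch of it, using shell parameter expansion. It assumes the line contains 'URL' followed by a quoted http:// URL, as in the question's sample; the temporary file is a hypothetical stand-in for file.php. Nothing here is Bash-specific, so it should also run under a POSIX sh.

```shell
# Hypothetical sample file mirroring the line shown in the question.
tmpfile=$(mktemp)
printf '%s\n' "<?php" "('URL', 'http://url.com');" > "$tmpfile"

url=""
while IFS= read -r line; do
    case $line in
        # Only consider lines that contain 'URL' followed by a scheme separator.
        *"'URL'"*://*)
            url=${line#*://}   # drop everything up to and including the first ://
            url=${url%%\'*}    # drop everything from the first remaining single quote
            break
            ;;
    esac
done < "$tmpfile"

printf '%s\n' "$url"
rm -f "$tmpfile"
```

The extraction itself uses only shell builtins; mktemp and rm are external commands needed only to set up and clean up the sample file.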

Answer 3

sou*_* c. 5

Try this:

awk -F// '{print $2}' file.php | cut -d "'" -f 1
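Note that the command above prints a line (possibly empty) for every line of the file, not just the one containing the URL. As a hedged variation, not part of the original answer, a single awk call can restrict itself to lines containing URL and store the result in a variable. It assumes the URL is the fourth single-quote-delimited field, as in the question's sample; the temporary file is a hypothetical stand-in for file.php.

```shell
# Hypothetical sample file mirroring the line shown in the question.
tmpfile=$(mktemp)
printf '%s\n' "<?php" "('URL', 'http://url.com');" > "$tmpfile"

# Split fields on single quotes; on the first line containing URL,
# strip a leading http:// or https:// from field 4, print it, and stop.
url=$(awk -F"'" '/URL/ { sub(/^https?:\/\//, "", $4); print $4; exit }' "$tmpfile")

printf '%s\n' "$url"
rm -f "$tmpfile"
```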


Archived:	12 years, 4 months ago
Views:	19305
Last activity:	10 years, 4 months ago