PHP regex: getting src values


How can I retrieve all src values using a regular expression in PHP?

<script type="text/javascript" src="http://localhost/assets/javascript/system.js" charset="UTF-8"></script>
<script type='text/javascript' src='http://localhost/index.php?uid=93db46d877df1af2a360fa2b04aabb3c' charset='UTF-8'></script>

The retrieved values should contain only:

Thanks.

Sco*_*den 7

/src=(["'])(.*?)\1/

Example:

<?php

// preg_match() returns 1 on a match, 0 when nothing matches, and false only on error,
// so check for 1 instead of comparing against false.
$input_string = '<script type="text/javascript" src="http://localhost/assets/javascript/system.js" charset="UTF-8"></script>';
$count = preg_match('/src=(["\'])(.*?)\1/', $input_string, $match);
if ($count !== 1) {
    echo "not found\n";  // double quotes so that \n is a real newline
} else {
    echo $match[2] . "\n";  // group 2 is the text between the quotes
}

$input_string = "<script type='text/javascript' src='http://localhost/index.php?uid=93db46d877df1af2a360fa2b04aabb3c' charset='UTF-8'></script>";
$count = preg_match('/src=(["\'])(.*?)\1/', $input_string, $match);
if ($count !== 1) {
    echo "not found\n";
} else {
    echo $match[2] . "\n";
}

This gives:

http://localhost/assets/javascript/system.js
http://localhost/index.php?uid=93db46d877df1af2a360fa2b04aabb3c
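
Since the question asks for all src values, the same pattern can also be handed to preg_match_all() rather than calling preg_match() once per tag. A minimal sketch, assuming both tags sit in one string; the names $html and $srcs are illustrative and not part of the answer above:

```php
<?php

// Both script tags from the question, held in a single string.
$html = <<<'HTML'
<script type="text/javascript" src="http://localhost/assets/javascript/system.js" charset="UTF-8"></script>
<script type='text/javascript' src='http://localhost/index.php?uid=93db46d877df1af2a360fa2b04aabb3c' charset='UTF-8'></script>
HTML;

// preg_match_all() returns the number of full matches (0 if none) or false on error.
$count = preg_match_all('/src=(["\'])(.*?)\1/', $html, $matches);

// With the default flags, $matches[2] lists capture group 2 (the quoted value) for every match.
$srcs = $count ? $matches[2] : [];

print_r($srcs);
```

Here $srcs ends up holding the two URLs in the order they appear in the input.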


Nic*_*sta 7

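Maybe it's just me, but I don't like using regular expressions to find things in HTML, especially when the HTML is unpredictable (it may come from users or from other web pages).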
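
To make that concern concrete, here is a small sketch of how a pattern such as /src=(["'])(.*?)\1/ can go wrong. The markup is hypothetical and chosen only to show two common cases: an attribute whose name merely ends in "src", and whitespace around the equals sign.

```php
<?php

// Hypothetical markup: a data-src attribute, plus spaces around "=" on the real src.
$html = '<img data-src="lazy.png" src = "real.png">';

preg_match_all('/src=(["\'])(.*?)\1/', $html, $matches);

// The pattern matches the tail of data-src and skips the spaced-out src attribute.
print_r($matches[2]);
```

The only value captured is lazy.png, which is not the src attribute at all. The pattern could be tightened, but each such fix adds another case to maintain.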

How about something like this:

$doc =
<<<DOC
    <script type="text/javascript" src="http://localhost/assets/javascript/system.js" charset="UTF-8"></script>
    <script type='text/javascript' src='http://localhost/index.php?uid=93db46d877df1af2a360fa2b04aabb3c' charset='UTF-8'></script>
DOC;

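$dom = new DOMDocument();
$dom->loadHTML( $doc );

// Walk every element and collect its src attribute wherever one is present.
$elems = $dom->getElementsByTagName('*');

$srcs = [];  // initialise so print_r() still works when nothing is found
foreach ( $elems as $elm ) {
    if ( $elm->hasAttribute('src') ) {
        $srcs[] = $elm->getAttribute('src');
    }
}

print_r( $srcs );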

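I don't know how this compares with a regular expression in terms of speed, but it takes me far less time to read it and understand what I'm trying to do.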
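
A variation on the same idea, offered as a sketch rather than as part of the answer above: DOMXPath can select the src attributes directly, and libxml_use_internal_errors() keeps parser warnings quiet when the markup is imperfect. The variable names are illustrative.

```php
<?php

// The two script tags from the question.
$html = <<<'HTML'
<script type="text/javascript" src="http://localhost/assets/javascript/system.js" charset="UTF-8"></script>
<script type='text/javascript' src='http://localhost/index.php?uid=93db46d877df1af2a360fa2b04aabb3c' charset='UTF-8'></script>
HTML;

// Collect parser warnings internally instead of emitting them for messy HTML.
libxml_use_internal_errors(true);

$dom = new DOMDocument();
$dom->loadHTML($html);
libxml_clear_errors();

// "//@src" selects every src attribute node, whichever element carries it.
$xpath = new DOMXPath($dom);
$srcs = [];
foreach ($xpath->query('//@src') as $attr) {
    $srcs[] = $attr->nodeValue;
}

print_r($srcs);
```

Narrowing the query to '//script/@src' would restrict the result to script tags only.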