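I need to check whether a given URL is valid. A URL should be allowed if it has either of the following:
1. a generic top-level domain
2. a country-code top-level domain
See http://en.wikipedia.org/wiki/List_of_Internet_top-level_domains for the lists.
I need to do this in PHP. This is what I have so far: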
$regexUrl = "((https?|ftp)\:\/\/)?"; // SCHEME
$regexUrl .= "([a-zA-Z0-9+!*(),;?&=\$_.-]+(\:[a-zA-Z0-9+!*(),;?&=\$_.-]+)?@)?"; // User and Pass
$regexUrl .= "([a-zA-Z0-9-]+)\.([a-zA-Z]{2,3})"; // Host or IP
$regexUrl .= "(\:[0-9]{2,5})?"; // Port
$regexUrl .= "(\/([a-zA-Z0-9+\$_-]\.?)+)*\/?"; // Path
$regexUrl .= "(\?[a-zA-Z+&\$_.-][a-zA-Z0-9;:@&%=+\/\$_.-]*)?"; // GET Query
$regexUrl .= "(#[a-zA-Z_.-][a-zA-Z0-9+\$_.-]*)?"; // Anchor
//if(preg_match_all("#\bhttps?://[^\s()]+(?:\([\w\d]+\)|([^[:punct:]\s]|/))#", $message, $matches1, PREG_PATTERN_ORDER))
//$pattern = '/((https?|ftp)\:(\/\/)|(file\:\/{2,3}))?(((25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?)\.){3}(25[0-5]|2[0-4][0-9]|[01]?[0-9][0-9]?))|(((([a-zA-Z0-9]+)(\.)?)+)(\.)(com|org|net|gov|mil|biz|info|mobi|name|aero|jobs|museum|[a-z]{2}))([\/][\/a-zA-Z0-9\.]*)*([\/]?(([\?][a-zA-Z0-9]+[\=][a-zA-Z0-9\%\(\)]*)([\&][a-zA-Z0-9]+[\=][a-zA-Z0-9\%\(\)]*)*))?/';
if (preg_match_all("/$regexUrl/", $urlMessage, $matches1, PREG_PATTERN_ORDER))
{
    try
    {
        foreach ($matches1[0] as $urlToTrim1)
        {
            $url = $urlToTrim1;
            echo $url;
        }
    }
    catch (Exception $e)
    {
        $url = "-1";
    }
}
To find out whether it is a valid URL in general:
filter_var($url, FILTER_VALIDATE_URL)
http://www.php.net/manual/en/function.filter-var.php
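As a rough sketch of how that call might be wrapped, the snippet below turns the result into a strict boolean. The function name isValidUrl is hypothetical, chosen only for illustration. Note that, unlike the regex in the question, FILTER_VALIDATE_URL expects a scheme, so a bare host such as www.example.com is rejected.

```php
<?php
// Hypothetical helper (name is an assumption, not part of PHP).
function isValidUrl(string $url): bool
{
    // filter_var() returns the filtered URL on success and false on failure,
    // so compare strictly against false.
    return filter_var($url, FILTER_VALIDATE_URL) !== false;
}

var_dump(isValidUrl('http://www.example.com/path?x=1')); // bool(true)
var_dump(isValidUrl('www.example.com'));                 // bool(false): no scheme
```

If scheme-less input has to be accepted, one option is to prepend "http://" before validating, but that is a design choice rather than something filter_var does for you.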
If you want to confirm that the TLD is on your approved list (I don't know whether filter_var goes as far as checking that the TLD actually exists):
$host = parse_url($url, PHP_URL_HOST);
$tld = substr($host, strrpos($host, '.') + 1);
// check if $tld is in a list of allowed TLDs
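The allow-list check hinted at in the comment could look something like the following. This is only a sketch: hasAllowedTld and the short $allowed array are made up for illustration, and a real list would need to come from the IANA registry or the Wikipedia page linked in the question.

```php
<?php
// Hypothetical helper: true when the URL's host ends in one of $allowedTlds.
function hasAllowedTld(string $url, array $allowedTlds): bool
{
    $host = parse_url($url, PHP_URL_HOST);
    // parse_url() yields null when there is no host and false on malformed input.
    if (!is_string($host) || strpos($host, '.') === false) {
        return false;
    }
    // Take everything after the last dot and normalise the case.
    $tld = strtolower(substr($host, strrpos($host, '.') + 1));
    return in_array($tld, $allowedTlds, true);
}

// Illustrative list only, not the complete set of gTLDs and ccTLDs.
$allowed = ['com', 'org', 'net', 'uk', 'de'];
var_dump(hasAllowedTld('http://www.example.co.uk/', $allowed)); // bool(true)
var_dump(hasAllowedTld('http://www.example.xyz/', $allowed));   // bool(false)
```

The list entries are assumed to be lowercase, since the extracted TLD is lowercased before the comparison.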
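Or simply try to look up the domain's DNS record with gethostbyname. If one exists, it is a valid domain.*
*Unless you are being DNS-spoofed, if that scenario matters to you...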
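A minimal sketch of the DNS approach follows. The helper name hostResolves and its optional $resolver parameter are assumptions added here so the logic can be exercised without network access; by default it calls gethostbyname, which returns the unmodified host name when the lookup fails.

```php
<?php
// Hypothetical helper. $resolver defaults to gethostbyname() and exists only
// so the check can be tried offline with a stub.
function hostResolves(string $host, ?callable $resolver = null): bool
{
    $resolver = $resolver ?? 'gethostbyname';
    $ip = $resolver($host);
    // On failure gethostbyname() hands back the input unchanged, so a result
    // that differs from $host and is a valid IP counts as a successful lookup.
    // Caveat: a host that is already an IP literal comes back unchanged too,
    // so this sketch reports false for it.
    return $ip !== $host && filter_var($ip, FILTER_VALIDATE_IP) !== false;
}

// Typical use (performs a real DNS query, so it is left commented out here):
// $ok = hostResolves(parse_url($url, PHP_URL_HOST));
```

Keep in mind that a lookup only shows the name resolves right now; it can be slow, and it says nothing about whether the rest of the URL is well formed.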