I'm new to regex and I'm studying it on regularexperssion.com. What I need to know is what the colon (:) is used for in a regular expression.
For example:
$pattern = '/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-f\d]{2,2})+(:([\d\w]|%[a-fA-f\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-f\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)?$/';
which matches:
$url1 = "http://www.somewebsite.com";
$url2 = "https://www.somewebsite.com";
$url3 = "https://somewebsite.com";
$url4 = "www.somewebsite.com";
$url5 = "somewebsite.com";
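A quick way to check the claim above is to run the pattern against the five sample URLs. This is only a minimal sketch; the `$urls` and `$results` names are illustrative and not part of the original code.

```php
<?php
// The pattern from the question, copied verbatim.
$pattern = '/^(([\w]+:)?\/\/)?(([\d\w]|%[a-fA-f\d]{2,2})+(:([\d\w]|%[a-fA-f\d]{2,2})+)?@)?([\d\w][-\d\w]{0,253}[\d\w]\.)+[\w]{2,4}(:[\d]+)?(\/([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)*(\?(&?([-+_~.\d\w]|%[a-fA-f\d]{2,2})=?)*)?(#([-+_~.\d\w]|%[a-fA-f\d]{2,2})*)?$/';

// The sample URLs from the question.
$urls = [
    "http://www.somewebsite.com",
    "https://www.somewebsite.com",
    "https://somewebsite.com",
    "www.somewebsite.com",
    "somewebsite.com",
];

$results = [];
foreach ($urls as $url) {
    // preg_match returns 1 on a match, 0 on no match, false on error.
    $results[$url] = preg_match($pattern, $url);
    echo $url, ' => ', $results[$url], PHP_EOL;
}
```

Each of the five URLs should report 1.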
Any help would be greatly appreciated. :)
Igo*_*bin 41
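A colon : is just a colon. It has no special meaning, except in a few special constructs, for example clustering without capturing (also known as a non-capturing group):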
(?:pattern)
It can also appear in POSIX character classes, for example:
[[:upper:]]
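As a rough illustration of the POSIX class syntax, here is a small PHP sketch. The sample strings and variable names are made up for the example.

```php
<?php
// [[:upper:]] is a POSIX class inside a bracket expression;
// the colons are part of the class name, not separate tokens.
$hasUpper = preg_match('/[[:upper:]]/', 'regex Rocks');
$noUpper  = preg_match('/[[:upper:]]/', 'regex rocks');

// Outside that syntax, a colon is matched literally.
$literalColon = preg_match('/^\d+:\d+$/', '12:30');

var_dump($hasUpper, $noUpper, $literalColon); // int(1), int(0), int(1)
```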
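In your case, however, the colon is just a colon.
Special characters used in your regular expression:
In the character class [-+_~.\d\w]:
- means a literal - (it is literal here because it is the first character in the class)
+ means a literal +
_ means a literal _
~ means a literal ~
. means a literal .
\d means any digit
\w means any word character
These symbols have these meanings because they appear inside a character class []. Outside a character class, + and . have special meanings.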
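To see the difference between inside and outside a character class, here is a small sketch with made-up test strings. It assumes nothing beyond PHP's standard `preg_match`.

```php
<?php
// Outside a class, . matches any character and + is a quantifier.
$dotOutside  = preg_match('/^a.c$/', 'abc');  // . matches "b"
$plusOutside = preg_match('/^a+$/', 'aaa');   // + repeats "a"

// Inside a class, both are plain literal characters.
$dotInsideAny  = preg_match('/^a[.]c$/', 'abc'); // "b" is not a dot
$dotInsideDot  = preg_match('/^a[.]c$/', 'a.c');
$plusInsideLit = preg_match('/^a[+]$/', 'a+');
$plusInsideRep = preg_match('/^a[+]$/', 'aa');   // no repetition here

var_dump($dotOutside, $plusOutside);            // int(1), int(1)
var_dump($dotInsideAny, $dotInsideDot);         // int(0), int(1)
var_dump($plusInsideLit, $plusInsideRep);       // int(1), int(0)
```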
Other elements:
=? means that = may occur 0 or 1 times; in other words, the = is optional.
Exp*_*lls 24
I decided to go one step further and explain the entire regular expression:
^ # anchor to start of line
( # start grouping
( # start grouping
[\w]+ # at least one of 0-9a-zA-Z_
: # a literal colon
) # end grouping
? # this grouping is optional
\/\/ # two literal slashes
) # end grouping
? # this grouping is optional
(
(
[\d\w] # exactly one of 0-9a-zA-Z_
# having \d is redundant
| # alternation
% # literal % sign
[a-fA-f\d]{2,2} # exactly 2 hexadecimal digits
# should probably be A-F
# using {2} would have sufficed
)+ # at least one of these groups
( # start grouping
: # literal colon
(
[\d\w]
|
%
[a-fA-f\d]{2,2}
)+
)? # Same grouping, but it is optional
# and there can be only one
@ # literal @ sign
)? # this group is optional
(
[\d\w] # same as [\w], explained above
[-\d\w]{0,253} # includes a dash as a valid character
# between 0 and 253 of these characters
[\d\w] # end with \w. They want at most 255
# total and - cannot be at the start
# or end
\. # literal period
)+ # at least one of these groups
[\w]{2,4} # two to four \w characters
(
: # literal colon
[\d]+ # at least one digit
)?
(
\/ # literal slash
(
[-+_~.\d\w] # one of these characters
| # *or*
% # % with two hex digit combo
[a-fA-f\d]{2,2}
)* # zero or more of these groups
)* # zero or more of these groups
(
\? # literal question mark
(
&? # optional literal &
(
[-+_~.\d\w]
|
%
[a-fA-f\d]{2,2}
)
=? # optional literal =
)* # zero or more of this group
)? # this group is optional
(
# # literal #
(
[-+_~.\d\w]
|
%
[a-fA-f\d]{2,2}
)*
)?
$ # anchor to end of line
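The "should probably be A-F" note in the breakdown is worth a quick demonstration. In ASCII, the range A-f runs from uppercase A through lowercase f, so it also covers G to Z and a few punctuation characters. The sketch below isolates just the percent-escape part; the `$asWritten` and `$intended` names are illustrative, and `$intended` is my assumption about what the pattern's author meant.

```php
<?php
// Percent-escape fragment exactly as written in the original pattern.
$asWritten = '/^%[a-fA-f\d]{2,2}$/';
// Assumed intent: two hexadecimal digits.
$intended  = '/^%[a-fA-F\d]{2}$/';

// "%ZZ" is not a valid percent-escape, but A-f includes "Z".
$writtenAcceptsZZ  = preg_match($asWritten, '%ZZ');
$intendedAcceptsZZ = preg_match($intended, '%ZZ');

// A genuine escape is accepted by both.
$writtenAccepts2F  = preg_match($asWritten, '%2F');
$intendedAccepts2F = preg_match($intended, '%2F');

var_dump($writtenAcceptsZZ, $intendedAcceptsZZ); // int(1), int(0)
var_dump($writtenAccepts2F, $intendedAccepts2F); // int(1), int(1)
```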
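It is important to understand what the metacharacters/sequences are. Some sequences are not meta when used in certain contexts (character classes in particular). I have catalogued them for you:
^ - zero-width start of line
() - grouping/capturing
? - zero or one of the preceding sequence
+ - one or more of the preceding sequence
* - zero or more of the preceding sequence
[] - character class
\w - alphanumeric characters and _; the opposite of \W
| - alternation
{} - repetition count for the preceding sequence
$ - zero-width end of line
This rules out :, @, and % as having any special/meta meaning in the original context.
Inside a character class, ] ends the class, and - creates a range of characters unless it is at the start or end of the class.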
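The rule about - can be checked with a few one-character tests. This is just a sketch with made-up classes:

```php
<?php
// In the middle of a class, - builds a range: a, b, or c.
$rangeMatchesB    = preg_match('/^[a-c]$/', 'b');
$rangeMatchesDash = preg_match('/^[a-c]$/', '-');

// At the start or end of a class, - is a literal dash.
$leadingMatchesDash  = preg_match('/^[-ac]$/', '-');
$leadingMatchesB     = preg_match('/^[-ac]$/', 'b');
$trailingMatchesDash = preg_match('/^[ac-]$/', '-');

var_dump($rangeMatchesB, $rangeMatchesDash);      // int(1), int(0)
var_dump($leadingMatchesDash, $leadingMatchesB);  // int(1), int(0)
var_dump($trailingMatchesDash);                   // int(1)
```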
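A (? combination starts a grouping assertion. For example, (?: means "group, but do not capture". In the regex /(?:a)/ this means the string "a" will be matched, but it will not be captured for use in a replacement or in the match groups, as it would be with /(a)/.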
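? can also be used for lookahead/lookbehind assertions with ?=, ?!, ?<=, and ?<!. In PCRE (which PHP uses), a (? that is not followed by a recognized construct is a compile error rather than a literal ?; to match a literal question mark, escape it as \?.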
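For reference, here is a small sketch of the four lookaround forms. The sample strings and variable names are invented for illustration.

```php
<?php
// (?=...) positive lookahead: digits followed by "px"; "px" is not consumed.
preg_match('/\d+(?=px)/', 'width: 100px', $ahead);
$digitsBeforePx = $ahead[0];

// (?!...) negative lookahead: "foo" not followed by "bar".
$fooThenBaz = preg_match('/foo(?!bar)/', 'foobaz');
$fooThenBar = preg_match('/foo(?!bar)/', 'foobar');

// (?<=...) positive lookbehind: digits preceded by a literal "$".
preg_match('/(?<=\$)\d+/', 'cost: $42', $behind);
$digitsAfterDollar = $behind[0];

// (?<!...) negative lookbehind: "b" not preceded by "a".
$bAfterC = preg_match('/(?<!a)b/', 'cb');
$bAfterA = preg_match('/(?<!a)b/', 'ab');

var_dump($digitsBeforePx, $digitsAfterDollar); // "100", "42"
var_dump($fooThenBaz, $fooThenBar);            // int(1), int(0)
var_dump($bAfterC, $bAfterA);                  // int(1), int(0)
```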
In your case, the colon : has no special purpose:
(([\w]+:)?\/\/)? will match http://, https://, ftp://...
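To see what that fragment consumes on its own, here is a minimal sketch. The `schemePrefix` helper and the ftp and `//` inputs are hypothetical and only used for illustration.

```php
<?php
// Scheme part of the question's pattern, anchored at the start only.
$schemePattern = '/^(([\w]+:)?\/\/)?/';

// Hypothetical helper: returns whatever the scheme part consumed.
function schemePrefix(string $pattern, string $input): string
{
    // The whole fragment is optional, so this always matches
    // (possibly the empty string) and $m[0] is always set.
    preg_match($pattern, $input, $m);
    return $m[0];
}

$http     = schemePrefix($schemePattern, 'http://www.somewebsite.com');
$https    = schemePrefix($schemePattern, 'https://somewebsite.com');
$ftp      = schemePrefix($schemePattern, 'ftp://somewebsite.com');
$relative = schemePrefix($schemePattern, '//somewebsite.com');
$none     = schemePrefix($schemePattern, 'www.somewebsite.com');

var_dump($http, $https, $ftp, $relative, $none);
// "http://", "https://", "ftp://", "//", ""
```

The colon only ever matches the literal ":" that separates the scheme name from the slashes.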
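There is one special use of the colon you may run into: any group that starts with (?: does not appear in the results.
For example, with the input "foobarbaz":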
/foo((bar)(baz))/ => { [1] => 'barbaz', [2] => 'bar', [3] => 'baz' }
/foo(?:(bar)(baz))/ => { [1] => 'bar', [2] => 'baz' }
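The same comparison can be reproduced in PHP with `preg_match`. Note that PHP also stores the full match at index 0, which the listing above leaves out. Variable names are illustrative.

```php
<?php
// Outer group captures: three numbered groups.
preg_match('/foo((bar)(baz))/', 'foobarbaz', $capturing);

// Outer group is (?:...): it still groups, but gets no number.
preg_match('/foo(?:(bar)(baz))/', 'foobarbaz', $nonCapturing);

print_r($capturing);
// [0] => foobarbaz, [1] => barbaz, [2] => bar, [3] => baz
print_r($nonCapturing);
// [0] => foobarbaz, [1] => bar, [2] => baz
```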