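I'm using Perl and need to split strings of author names that are separated by commas, plus a final "and". Each name is a first name followed by a last name, like this: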
$string1 = "Joe Smith, Jason Jones, Jane Doe and Jack Jones";
$string2 = "Joe Smith, Jason Jones, Jane Doe, and Jack Jones";
$string3 = "Jane Doe and Joe Smith";
# Next line doesn't work because there is no comma between last two names
@data = split(/,/, $string1);
I just want to split the full names into elements of an array, the way split() would, so that the @data array contains, for example:
@data[0]: "Joe Smith"
@data[1]: "Jason Jones"
@data[2]: "Jane Doe"
@data[3]: "Jack Jones"
The problem, however, is that there is no comma between the last two names in the list. Any help would be appreciated.
mu *_*ort 10
You can split on a simple alternation in the regex:
my @parts = split(/\s*,\s*|\s+and\s+/, $string1);
For example:
$ perl -we 'my $string1 = "Joe Smith, Jason Jones, Jane Doe and Jack Jones";print join("\n",split(/\s*,\s*|\s+and\s+/, $string1)),"\n"'
Joe Smith
Jason Jones
Jane Doe
Jack Jones
$ perl -we 'my $string2 = "Jane Doe and Joe Smith";print join("\n",split(/\s*,\s*|\s+and\s+/, $string2)),"\n"'
Jane Doe
Joe Smith
If you also need to handle the Oxford comma (i.e. "this, that, and the other"), you can use
my @parts = split(/\s*,\s*and\s+|\s*,\s*|\s+and\s+/, $string1);
For example:
$ perl -we 'my $s = "Joe Smith, Jason Jones, Jane Doe, and Jack Jones";print join("\n",split(/\s*,\s*and\s+|\s*,\s*|\s+and\s+/, $s)),"\n"'
Joe Smith
Jason Jones
Jane Doe
Jack Jones
$ perl -we 'my $s = "Joe Smith, Jason Jones, Jane Doe and Jack Jones";print join("\n",split(/\s*,\s*and\s+|\s*,\s*|\s+and\s+/, $s)),"\n"'
Joe Smith
Jason Jones
Jane Doe
Jack Jones
$ perl -we 'my $s = "Joe Smith and Jack Jones";print join("\n",split(/\s*,\s*and\s+|\s*,\s*|\s+and\s+/, $s)),"\n"'
Joe Smith
Jack Jones
Thanks to stackoverflowuser2010 for pointing out this case.
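The pattern can also be wrapped in a small helper and run over all three strings from the question. This is only a minimal sketch: the name split_authors is hypothetical and does not come from the answer.

```perl
use strict;
use warnings;

# Hypothetical helper (name is an assumption): split an author list on
# ", and", a plain comma, or a bare " and ", in that order of preference.
sub split_authors {
    my ($string) = @_;
    return split /\s*,\s*and\s+|\s*,\s*|\s+and\s+/, $string;
}

# The three sample strings from the question.
my @inputs = (
    "Joe Smith, Jason Jones, Jane Doe and Jack Jones",
    "Joe Smith, Jason Jones, Jane Doe, and Jack Jones",
    "Jane Doe and Joe Smith",
);

for my $input (@inputs) {
    my @names = split_authors($input);
    print join(" | ", @names), "\n";
}
```

Call the helper in list context; in scalar context split returns the number of fields rather than the names.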
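You'll want the \s*,\s*and\s+ branch at the beginning, so that the other branches of the alternation don't split on the comma or the "and" first. This ordering also appears to be guaranteed: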
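Alternatives are tried from left to right, so the first alternative found for which the entire expression matches, is the one that is chosen.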
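To see why the branch order matters, here is a small sketch comparing the answer's pattern with a reordered one. The reordered pattern and the variable names are illustrative only, not from the answer.

```perl
use strict;
use warnings;

my $s = "Joe Smith, Jason Jones, Jane Doe, and Jack Jones";

# Combined branch first, as in the answer: ", and " is consumed as one separator.
my @good = split /\s*,\s*and\s+|\s*,\s*|\s+and\s+/, $s;

# Combined branch last (illustrative reordering): the plain comma branch
# matches first at the same position and also eats the following space,
# so "and" stays attached to the final name.
my @bad = split /\s*,\s*|\s+and\s+|\s*,\s*and\s+/, $s;

print "good: ", join(" | ", @good), "\n";
print "bad:  ", join(" | ", @bad), "\n";
```

With the reordered pattern the last element comes out as "and Jack Jones", because the comma branch has already consumed the whitespace that \s+and\s+ would need in front of "and".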