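I'm looking to turn [a, b, c, "d, e, f", g, h] into an array of 6 elements: a, b, c, "d,e,f", g, h. I'm a bit of a RegEx newbie, so any help is appreciated. I'm trying to do this in JavaScript. This is what I have so far: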
str = str.split(/,+|"[^"]+"/g);
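But right now it splits out everything inside the double quotes, which is incorrect. Thanks for any help.
Edit: Okay, sorry, I worded this really poorly. I'm being given a string, not an array.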
var str = 'a, b, c, "d, e, f", g, h';
I want to turn that into an array using something like the "split" function.
inh*_*han 68
Here's what I would do.
var str = 'a, b, c, "d, e, f", g, h';
var arr = str.match(/(".*?"|[^",\s]+)(?=\s*,|\s*$)/g);
/* will match:
(
".*?" double quotes + anything but double quotes + double quotes
| OR
[^",\s]+ 1 or more characters excl. double quotes, comma or spaces of any kind
)
(?= FOLLOWED BY
\s*, 0 or more empty spaces and a comma
| OR
\s*$ 0 or more empty spaces and nothing else (end of string)
)
*/
arr = arr || [];
// this will prevent JS from throwing an error in
// the below loop when there are no matches
for (var i = 0; i < arr.length; i++) console.log('arr['+i+'] =',arr[i]);
f-s*_*ety 21
Regex: /,(?=(?:(?:[^"]*"){2})*[^"]*$)/
const input_line = '"2C95699FFC68","201 S BOULEVARDRICHMOND, VA 23220","8299600062754882","2018-09-23"'
let my_split = input_line.split(/,(?=(?:(?:[^"]*"){2})*[^"]*$)/)
Output:
my_split[0]: "2C95699FFC68",
my_split[1]: "201 S BOULEVARDRICHMOND, VA 23220",
my_split[2]: "8299600062754882",
my_split[3]: "2018-09-23"
See the following link for an explanation: regexr.com/44u6o
Here's a JavaScript function that does this:
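As a rough illustration of how the lookahead works: the pattern matches a comma only when the rest of the string contains an even number of double quotes, which means the comma sits outside any quoted field. Below is a minimal sketch applied to the string from the question, assuming the quotes are balanced. `splitOutsideQuotes` is a hypothetical helper name, and the `trim()` call is an addition that drops the space after each comma.

```javascript
// Hypothetical helper, not part of the answer above.
// Assumes every double quote in the input has a matching partner.
function splitOutsideQuotes(line) {
  // Split on a comma only if an even number of quotes follows it.
  const commaOutsideQuotes = /,(?=(?:(?:[^"]*"){2})*[^"]*$)/;
  return line.split(commaOutsideQuotes).map((field) => field.trim());
}

const fields = splitOutsideQuotes('a, b, c, "d, e, f", g, h');
console.log(fields.length); // 6
console.log(fields[3]);     // "d, e, f" (the quotes are kept)
```

If the quotes should be removed as well, that would need an extra step on each field.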
function splitCSVButIgnoreCommasInDoublequotes(str) {
//split the str on the delimiter first
//then merge the elements that lie between two double quotes
var delimiter = ',';
var quotes = '"';
var elements = str.split(delimiter);
var newElements = [];
for (var i = 0; i < elements.length; ++i) {
if (elements[i].split(quotes).length % 2 === 0) {//odd number of quotes: an unmatched left double quote is found
var indexOfRightQuotes = -1;
var tmp = elements[i];
//find the element holding the matching right double quote
for (var j = i + 1; j < elements.length; ++j) {
if (elements[j].split(quotes).length % 2 === 0) {
indexOfRightQuotes = j;
break;
}
}
//found the right double quotes
//merge all the elements between double quotes
if (-1 != indexOfRightQuotes) {
for (var j = i + 1; j <= indexOfRightQuotes; ++j) {
tmp = tmp + delimiter + elements[j];
}
newElements.push(tmp);
i = indexOfRightQuotes;
}
else { //right double quotes is not found
newElements.push(elements[i]);
}
}
else {//no left double quotes is found
newElements.push(elements[i]);
}
}
return newElements;
}
Here's the regex we use to extract valid arguments from a comma-separated argument list, with support for double-quoted arguments. It works for the outlined edge cases.
(?<=")[^"]+?(?="(?:\s*?,|\s*?$))|(?<=(?:^|,)\s*?)(?:[^,"\s][^,"]*[^,"\s])|(?:[^,"\s])(?![^"]*?"(?:\s*?,|\s*?$))(?=\s*?(?:,|$))
Proof: https://regex101.com/r/UL8kyy/3/tests (Note: currently this only works in Chrome, because the regex uses lookbehinds, which are only supported as of ECMAScript 2018.)
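Because lookbehind support depends on the engine, it may be worth checking for it before relying on a pattern like this one. A minimal sketch follows, assuming that an engine without lookbehind support throws when asked to compile such a pattern. `supportsLookbehind` is a hypothetical name, not something from the answer.

```javascript
// Hypothetical helper: reports whether this JavaScript engine can
// compile a regex that contains a lookbehind assertion.
function supportsLookbehind() {
  try {
    // Built from a string so that an unsupported engine fails here at
    // run time instead of failing to parse the whole script.
    new RegExp('(?<=a)b');
    return true;
  } catch (err) {
    return false;
  }
}

console.log(supportsLookbehind());
```

When the check returns false, one of the lookahead-only or non-regex approaches on this page could serve as a fallback.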
Following our guidelines, it avoids capturing groups (only non-capturing groups are used) and greedy matching.
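I'm sure it can be simplified, and I'm open to suggestions and additional test cases.
For anyone interested, the first part matches double-quoted, comma-delimited arguments: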
(?<=")[^"]+?(?="(?:\s*?,|\s*?$))
The second part matches unquoted comma-delimited arguments:
(?<=(?:^|,)\s*?)(?:[^,"\s][^,"]*[^,"\s])|(?:[^,"\s])(?![^"]*?"(?:\s*?,|\s*?$))(?=\s*?(?:,|$))
I almost liked the accepted answer, but it doesn't parse spaces correctly and it doesn't strip the double quotes, so here is my function:
/**
* Splits the given string into components, and returns the components array.
* Each component must be separated by a comma.
* If the component contains one or more comma(s), it must be wrapped with double quotes.
* The double quote must not be used inside components (replace it with a special string like __double__quotes__ for instance, then transform it again into double quotes later...).
*
* https://stackoverflow.com/questions/11456850/split-a-string-by-commas-but-ignore-commas-within-double-quotes-using-javascript
*/
function splitComponentsByComma(str){
var ret = [];
var arr = str.match(/(".*?"|[^",]+)(?=\s*,|\s*$)/g);
for (let i in arr) {
let element = arr[i];
if ('"' === element[0]) {
element = element.substr(1, element.length - 2);
} else {
element = arr[i].trim();
}
ret.push(element);
}
return ret;
}
console.log(splitComponentsByComma('Hello World, b, c, "d, e, f", c')); // [ 'Hello World', 'b', 'c', 'd, e, f', 'c' ]
This works well for me. (I used semicolons so that the alert message shows the difference between the commas added when the array is converted to a string and the values actually captured.)
var str = 'a; b; c; "d; e; f"; g; h; "i"';
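// the unquoted branch must not start with whitespace or contain a quote,
// otherwise ' "d' would be matched as a plain value
var array = str.match(/("[^"]*")|[^;"\s][^;"]*/g);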
alert(array);
小智 5
Here's a non-regex version that assumes double quotes come in pairs:
function splitCsv(str) {
return str.split(',').reduce((accum,curr)=>{
if(accum.isConcatting) {
accum.soFar[accum.soFar.length-1] += ','+curr
} else {
accum.soFar.push(curr)
}
if(curr.split('"').length % 2 == 0) {
accum.isConcatting= !accum.isConcatting
}
return accum;
},{soFar:[],isConcatting:false}).soFar
}
console.log(splitCsv('asdf,"a,d",fdsa'),' should be ',['asdf','"a,d"','fdsa'])
console.log(splitCsv(',asdf,,fds,'),' should be ',['','asdf','','fds',''])
console.log(splitCsv('asdf,"a,,,d",fdsa'),' should be ',['asdf','"a,,,d"','fdsa'])
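The same idea, tracking whether the current position is inside a quoted field, can also be written as a single pass over the characters. This is a minimal sketch under the same assumption that quotes come in pairs. `splitCsvByChar` is a hypothetical name, and like `splitCsv` it keeps the quotes and does not trim whitespace.

```javascript
// Hypothetical single-pass variant; assumes double quotes come in pairs.
function splitCsvByChar(str) {
  const fields = [];
  let current = '';
  let insideQuotes = false;
  for (const ch of str) {
    if (ch === '"') {
      insideQuotes = !insideQuotes; // every quote toggles the state
      current += ch;
    } else if (ch === ',' && !insideQuotes) {
      fields.push(current); // a comma outside quotes ends the field
      current = '';
    } else {
      current += ch;
    }
  }
  fields.push(current); // the last field has no trailing comma
  return fields;
}

console.log(splitCsvByChar('asdf,"a,d",fdsa'));
```

A flag-based loop like this avoids the split-then-rejoin step, which may make it easier to extend later, for example to handle escaped quotes.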