Regex split when reading from a file

Fah*_*jua 5 c# regex asp.net string

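I have a text file that I am reading line by line.

I want to split each line on ','.

But I want to skip commas that appear inside double quotes "".

I tried the following regular expression, but it does not work correctly.

How can I do this?

The contents of the file are: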

"Mobile","Custom1","Custom2","Custom3","First Name"
"61402818083","service","in Portsmith","is","First Name"
"61402818083","service","in Parramatta Park","is","First Name"
"61402818083","services","in postcodes 3000, 4000","are","First Name"
"61402818083","services","in postcodes 3000, 4000, 5000","are","First Name"
"61402818083","services",,"are","First Name"

The regular expression is as follows:

,(?=([^\"]*\"[^\"]*\")*[^\"]*$)

For line 5 of the file, this regular expression produces the following output:

"61402818083"
,"First Name"
"services"
,"First Name"
"in postcodes 3000, 4000, 5000"
,"First Name"
"are"
"First Name"
"First Name"

The result should be:

"61402818083"
"services"
"in postcodes 3000, 4000, 5000"
"are"
"First Name"
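
A note on the extra `,"First Name"` entries in the output above, assuming the pattern was passed to `Regex.Split`: in .NET, `Regex.Split` also returns the text captured by any capturing group in the pattern, and the lookahead here contains one. Below is a minimal sketch of the same pattern with the group made non-capturing, `(?:...)`. The class and method names are illustrative only and do not come from the original post.

```csharp
using System;
using System.Text.RegularExpressions;

public static class QuoteAwareSplit
{
    // Same lookahead as in the question, but written with (?:...) so that
    // Regex.Split does not add the captured text to the result array.
    private static readonly Regex Separator =
        new Regex(",(?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)");

    // Splits on commas that are followed by an even number of quotes,
    // i.e. commas that are not inside a quoted field.
    public static string[] Split(string line)
    {
        return Separator.Split(line);
    }

    public static void Main()
    {
        string line = "\"61402818083\",\"services\",\"in postcodes 3000, 4000, 5000\",\"are\",\"First Name\"";
        foreach (string field in Split(line))
        {
            Console.WriteLine(field);
        }
    }
}
```

The fields still carry their surrounding quotes, as in the desired result shown in the question.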

dav*_*s86 5

Don't reinvent the wheel. It looks like you are trying to parse a comma-separated file (even if the file extension is not csv). Try this:

using Microsoft.VisualBasic.FileIO;

using (TextFieldParser reader = new TextFieldParser(@"c:\yourpath\file.csv"))
{
    reader.TextFieldType = FieldType.Delimited;
    reader.SetDelimiters(",");
    while (!reader.EndOfData)
    {
        // Read one line of the file and split it into fields
        string[] fields = reader.ReadFields();
        // now fields contains 5 elements (surrounding quotes removed), e.g.
        // fields[0] = "61402818083"
        // fields[1] = "services"
        // fields[2] = "in postcodes 3000, 4000, 5000"
        // fields[3] = "are"
        // fields[4] = "First Name"
    }
}

Note

This requires adding a reference to Microsoft.VisualBasic in your project.

  • You can try it in any case; your file format is the same as csv. (3 upvotes)
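
To try the parser without a file on disk, `TextFieldParser` can also be constructed from a `TextReader`. The sketch below feeds it the last sample line, the one with the empty field, through a `StringReader`. `ParserDemo` and `ParseLine` are hypothetical names, and `HasFieldsEnclosedInQuotes` is set explicitly only to make the intent visible.

```csharp
using System;
using System.IO;
using Microsoft.VisualBasic.FileIO;

public static class ParserDemo
{
    // Parses a single delimited line held in memory and returns its fields.
    public static string[] ParseLine(string line)
    {
        using (var reader = new TextFieldParser(new StringReader(line)))
        {
            reader.TextFieldType = FieldType.Delimited;
            reader.SetDelimiters(",");
            // Treat "..." as one field, so commas inside quotes are kept.
            reader.HasFieldsEnclosedInQuotes = true;
            return reader.ReadFields();
        }
    }

    public static void Main()
    {
        string line = "\"61402818083\",\"services\",,\"are\",\"First Name\"";
        string[] fields = ParseLine(line);
        for (int i = 0; i < fields.Length; i++)
        {
            Console.WriteLine("fields[" + i + "] = '" + fields[i] + "'");
        }
    }
}
```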

Jen*_*iya 4

using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        string line = "\"61402818083\",\"services\",\"in postcodes 3000, 4000\",\"are\",\"First Name\"";
        // Lazily match each run of text between a pair of double quotes
        var reg = new Regex("\".*?\"");
        var matches = reg.Matches(line);
        foreach (Match item in matches)
        {
            // Each match is one quoted field, quotes included
            Console.WriteLine(item.Value);
        }
    }
}

Output:

"61402818083"
"services"
"in postcodes 3000, 4000"
"are"
"First Name"

https://dotnetfiddle.net/5GxxIo
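
One thing to be aware of with this pattern: it only finds quoted fields. An empty unquoted field, as in the last line of the sample file (`"services",,"are"`), is skipped and the remaining fields shift left. A small sketch that makes this visible follows; the helper name `QuotedFields` is made up for illustration.

```csharp
using System;
using System.Linq;
using System.Text.RegularExpressions;

public static class QuotedOnlyDemo
{
    // Returns every "..." run in the line, as the pattern above does.
    public static string[] QuotedFields(string line)
    {
        return Regex.Matches(line, "\".*?\"")
                    .Cast<Match>()
                    .Select(m => m.Value)
                    .ToArray();
    }

    public static void Main()
    {
        // Last line of the sample file: the third field is empty and unquoted.
        string line = "\"61402818083\",\"services\",,\"are\",\"First Name\"";
        string[] fields = QuotedFields(line);
        // Prints 4, although the line has 5 comma-separated fields.
        Console.WriteLine(fields.Length);
    }
}
```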

Another possible solution:

using System;
using System.Text.RegularExpressions;

public class Program
{
    public static void Main()
    {
        string line = "\"61402818083\",\"services\",\"in postcodes 3000, 4000\",\"are\",\"First Name\"";
        Console.WriteLine(line);
        // A field is either a quoted string (with "" allowed inside) or a run of non-comma characters
        var reg = new Regex("(?:^|,)(\"(?:[^\"]+|\"\")*\"|[^,]*)", RegexOptions.Compiled);
        var matches = reg.Matches(line);
        foreach (Match match in matches)
        {
            // Drop the separator comma that precedes every field but the first
            Console.WriteLine(match.Value.TrimStart(','));
        }
    }
}

https://dotnetfiddle.net/rRml2D
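
The capture group in this pattern already isolates the field without the leading comma, so `match.Groups[1].Value` can be used instead of `TrimStart(',')`. The sketch below also strips the surrounding quotes and turns a doubled `""` back into a single quote, on the assumption that this is the escaping convention the `\"\"` branch of the pattern is meant for. `CsvLine` and `ParseLine` are hypothetical names, not part of the answer.

```csharp
using System;
using System.Collections.Generic;
using System.Text.RegularExpressions;

public static class CsvLine
{
    // Same pattern as in the answer above.
    private static readonly Regex Field =
        new Regex("(?:^|,)(\"(?:[^\"]+|\"\")*\"|[^,]*)");

    public static List<string> ParseLine(string line)
    {
        var fields = new List<string>();
        foreach (Match match in Field.Matches(line))
        {
            // Group 1 holds the field itself, without the separator comma.
            string value = match.Groups[1].Value;
            bool quoted = value.Length >= 2
                          && value[0] == '"'
                          && value[value.Length - 1] == '"';
            if (quoted)
            {
                // Remove the outer quotes and unescape "" to ".
                value = value.Substring(1, value.Length - 2).Replace("\"\"", "\"");
            }
            fields.Add(value);
        }
        return fields;
    }

    public static void Main()
    {
        string line = "\"61402818083\",\"services\",,\"are\",\"First Name\"";
        foreach (string field in ParseLine(line))
        {
            Console.WriteLine("[" + field + "]");
        }
    }
}
```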