Efficiently add items to a List<class> in parallel in C#

ccs*_*csv 4 c# linq parallel-processing list

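I have working code that splits the string in one property of each item in a list of a class: Dataframe consists of string, string, string.

Now I declare an empty collection of Dataframe2 (string, string[], string) and add items to it using Add: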

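using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading.Tasks;

class Program
{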
    public static string[] SPString(string text)
    {
        string[] elements;
        elements = text.Split(' ');
        return elements;
    }

    //Structures
    public class Dataframe
    {

        public string Name { get; set; }
        public string Text { get; set; }
        public string Cat { get; set; }
    }

    public class Dataframe2
    {

        public string Name { get; set; }
        public string[] Text { get; set; }
        public string Cat { get; set; }
    }



    static void Main(string[] args)
    {

        List<Dataframe> doc = new List<Dataframe>{new Dataframe { Name = "Doc1", Text = "The quick brown cat", Cat = ""},
            new Dataframe { Name = "Doc2", Text = "The big fat cat", Cat = "Two"},
            new Dataframe { Name = "Doc4", Text = "The quick brown rat", Cat = "One"},
            new Dataframe { Name = "Doc3", Text = "Its the cat in the hat", Cat = "Two"},
            new Dataframe { Name = "Doc5", Text = "Mice and rats eat seeds", Cat = "One"},
        };

        // Can this be made more efficient?
        ConcurrentBag<Dataframe2> doc2 = new ConcurrentBag<Dataframe2>();
        Parallel.ForEach(doc, entry =>
        {
            string s = entry.Text;
            string[] splitter = SPString(s);
            doc2.Add(new Dataframe2 { Name = entry.Name, Text = splitter, Cat = entry.Cat });
        } );

    }
}

Is there a more efficient way to add the items to the list using Parallel LINQ, where Dataframe2 inherits the properties that I do not modify?

Dmi*_*nko 5

You can try PLinq to add parallelism while keeping List<T>:

// Do NOT create and then fill the List<T> (which is not thread-safe) in parallel manually,
// Let PLinq do it for you
List<Dataframe2> doc2 = doc
  .AsParallel()
  .Select(entry => {
     //TODO: make Dataframe2 from given Dataframe (entry)
     ...
     return new Dataframe2 {Name = entry.Name, Text = splitter, Cat = entry.Cat};
  }) 
  .ToList();
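For reference, here is one way the TODO in the query above might be filled in, as a minimal sketch rather than the answerer's exact code. It assumes the split is a plain `Split(' ')` as in the question's `SPString`. The `SplitAll` helper name and the `AsOrdered()` call are additions for illustration: `AsOrdered()` asks PLINQ to keep the results in source order, which the original `ConcurrentBag` version does not guarantee.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public class Dataframe
{
    public string Name { get; set; }
    public string Text { get; set; }
    public string Cat { get; set; }
}

public class Dataframe2
{
    public string Name { get; set; }
    public string[] Text { get; set; }
    public string Cat { get; set; }
}

public static class PlinqSketch
{
    // Hypothetical helper (not in the original post): projects each Dataframe
    // into a Dataframe2 in parallel and lets PLINQ build the List<T>.
    public static List<Dataframe2> SplitAll(IEnumerable<Dataframe> doc)
    {
        return doc
            .AsParallel()
            .AsOrdered() // assumption: the caller wants results in source order
            .Select(entry => new Dataframe2
            {
                Name = entry.Name,
                Text = entry.Text.Split(' '), // same split as SPString in the question
                Cat = entry.Cat
            })
            .ToList();
    }

    public static void Main()
    {
        var doc = new List<Dataframe>
        {
            new Dataframe { Name = "Doc1", Text = "The quick brown cat", Cat = "" },
            new Dataframe { Name = "Doc2", Text = "The big fat cat", Cat = "Two" },
            new Dataframe { Name = "Doc4", Text = "The quick brown rat", Cat = "One" },
        };

        List<Dataframe2> doc2 = SplitAll(doc);

        foreach (Dataframe2 d in doc2)
        {
            Console.WriteLine($"{d.Name}: {d.Text.Length} words");
        }
    }
}
```

If the output order does not matter, dropping `AsOrdered()` lets PLINQ skip the reordering work.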

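  • @Jodrell: It depends on the size of the `doc` list, the cost of evaluating `Split` (i.e. the length of the strings), etc. The best option, IMHO, is to *comment out* `AsParallel()` and see. (2 upvotes)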
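
The "comment out `AsParallel()` and see" advice can be turned into a rough timing comparison. The sketch below is only an illustration under stated assumptions: the synthetic input (200,000 short strings), the `Time` helper and the `SplitTiming` class are all made up for this example, and a single `Stopwatch` run is a crude measurement, not a proper benchmark. Results will vary with machine, input size and string length.

```csharp
using System;
using System.Collections.Generic;
using System.Diagnostics;
using System.Linq;

public static class SplitTiming
{
    // Hypothetical helper: runs the work once, returns the elapsed time
    // and hands the result back through the out parameter.
    public static TimeSpan Time(Func<List<string[]>> work, out List<string[]> result)
    {
        Stopwatch sw = Stopwatch.StartNew();
        result = work();
        sw.Stop();
        return sw.Elapsed;
    }

    public static void Main()
    {
        // Synthetic data (an assumption); substitute the real `doc` texts here.
        List<string> texts = Enumerable.Range(0, 200_000)
            .Select(i => $"The quick brown cat number {i}")
            .ToList();

        // Warm-up so JIT compilation and thread-pool start-up are not
        // attributed to whichever variant happens to run first.
        texts.Take(1_000).Select(t => t.Split(' ')).ToList();
        texts.Take(1_000).AsParallel().Select(t => t.Split(' ')).ToList();

        TimeSpan sequential = Time(
            () => texts.Select(t => t.Split(' ')).ToList(),
            out List<string[]> seqResult);

        TimeSpan parallel = Time(
            () => texts.AsParallel().AsOrdered().Select(t => t.Split(' ')).ToList(),
            out List<string[]> parResult);

        Console.WriteLine($"Sequential: {sequential.TotalMilliseconds:F1} ms ({seqResult.Count} items)");
        Console.WriteLine($"PLINQ:      {parallel.TotalMilliseconds:F1} ms ({parResult.Count} items)");
    }
}
```

For a five-item list like the one in the question, the overhead of partitioning and merging is likely to outweigh any gain, so the sequential version may well come out ahead.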