My code works, but I'm wondering which style is better. As you'll see, I'm experimenting with async methods.
Let me establish the context:
    Parallel.ForEach(xmlAnimalList, async xml =>
    {
        taskList.Add(await Task.FromResult(ReadAnimalXML(xml, token)));
    });
This piece of code works fine with this method:
public async Task<Animal> ReadAnimalXML(string filename, CancellationToken token)
In the previous example, Task.FromResult() comes right after the await keyword. The ReadAnimalXML method simply returns:
return new Animal();
The second example is this:
    Parallel.ForEach(xmlAnimalList, async xml =>
    {
        taskList.Add(await ReadAnimalXML2(xml, token));
    });
This time, the ReadAnimalXML2 method looks like this:
    public async Task<Task<Animal>> ReadAnimalXML2(string filename, CancellationToken token)
    {
        return Task.FromResult(new Animal());
    }
BUT!
The second method, ReadAnimalXML2 (which seems really weird to me), returns a
Task<Task<Animal>>
A task inside of a task.
That's why I return Task.FromResult(new Animal()); otherwise it won't compile. Both ways work, but one of them must be better. Could you share your answer and explain why?
Thank you for taking the time to read the question. Coding is FUN!
    public async Task<IEnumerable<Animal>> ReadXMLFromFolderAsync(string folderPath, CancellationToken token)
    {
        if (!Directory.Exists(folderPath))
        {
            return new List<Animal>();
        }
        List<Task<Animal>> taskList = new List<Task<Animal>>();
        List<string> xmlAnimalList = Directory.GetFiles(folderPath, "*.xml").ToList();
        Parallel.ForEach(xmlAnimalList, async xml =>
        {
            taskList.Add(await Task.FromResult(ReadAnimalXML(xml, token)));
        });
        return await Task.WhenAll(taskList);
    }

    public async Task<Animal> ReadAnimalXML(string filename, CancellationToken token)
    {
        XDocument document = XDocument.Load(filename);
        IEnumerable<XElement> ADN = await Task.Run(() =>
            document.Descendants("ADN").Where(adn => adn.Name.LocalName == "Dinosaur"), token);
        //populate the animal object
        return new Animal();
    }

    public async Task<IEnumerable<Animal>> ReadXMLFromFolderAsync2(string folderPath, CancellationToken token)
    {
        if (!Directory.Exists(folderPath))
        {
            return new List<Animal>();
        }
        List<Task<Animal>> taskList = new List<Task<Animal>>();
        List<string> xmlAnimalList = Directory.GetFiles(folderPath, "*.xml").ToList();
        Parallel.ForEach(xmlAnimalList, async xml =>
        {
            taskList.Add(await ReadAnimalXML2(xml, token));
        });
        return await Task.WhenAll(taskList);
    }

    public async Task<Task<Animal>> ReadAnimalXML2(string filename, CancellationToken token)
    {
        XDocument document = XDocument.Load(filename);
        IEnumerable<XElement> ADN = await Task.Run(() => document
            .Descendants("ADN")
            .Where(adn => adn.Name.LocalName == "Dinosaur"), token);
        //populate the animal object
        return Task.FromResult(new Animal());
    }
I think you're confusing parallelism with asynchrony, and getting both of them wrong.
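If your method returns Task.FromResult, it is not asynchronous. If you want to use asynchronous code, focus on the I/O: for example, do the I/O asynchronously to load the file data from disk, and then parse it as XML (synchronously).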
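As a rough illustration of "asynchronous I/O, synchronous parsing", here is a minimal sketch. It assumes .NET Core 2.0 or later (for File.ReadAllTextAsync). The Animal class, its Name property, and the "name" attribute on a Dinosaur element are hypothetical stand-ins, since the question does not show how the animal object is populated.

```csharp
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;
using System.Xml.Linq;

// Hypothetical stand-in for the question's Animal type.
public class Animal
{
    public string Name { get; set; } = "";
}

public static class AnimalReader
{
    public static async Task<Animal> ReadAnimalXmlAsync(string filename, CancellationToken token)
    {
        // The only truly asynchronous step: read the file without blocking a thread.
        string xml = await File.ReadAllTextAsync(filename, token);

        // Parsing is in-memory work, so it stays synchronous (no Task.Run, no Task.FromResult).
        XDocument document = XDocument.Parse(xml);
        var dinosaur = document.Descendants("Dinosaur").FirstOrDefault();

        // Assumed mapping: take the name from a "name" attribute, if there is one.
        return new Animal { Name = (string)dinosaur?.Attribute("name") ?? "" };
    }
}
```

Note that the method returns Task<Animal>, not Task<Task<Animal>>: an async method wraps its return value in a task for you, so a plain `return new Animal...` is all that is needed.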
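The Parallel.ForEach problems are more dangerous. First, you can't use async methods with Parallel.ForEach; your code only happens to work because your async methods aren't actually asynchronous. Also, you can't call non-thread-safe methods such as List<T>.Add from parallel code. So pretty much all of the code that uses Parallel.ForEach is wrong. But you probably don't need Parallel.ForEach anyway.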
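If you do end up needing Parallel.ForEach, a sketch of a safe shape might look like the following: the loop body is purely synchronous, and results go into a thread-safe collection instead of a List<T>. The ParallelParsing class and CountDinosaurs method are made up for illustration, and the input is assumed to be XML text that has already been loaded.

```csharp
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Linq;
using System.Threading.Tasks;
using System.Xml.Linq;

public static class ParallelParsing
{
    // Counts <Dinosaur> elements in each already-loaded XML string, in parallel.
    public static List<int> CountDinosaurs(IEnumerable<string> xmlTexts)
    {
        // ConcurrentBag<T>.Add is safe to call from several threads at once.
        var counts = new ConcurrentBag<int>();

        // Synchronous lambda: no async/await inside the loop body.
        Parallel.ForEach(xmlTexts, xml =>
        {
            int count = XDocument.Parse(xml).Descendants("Dinosaur").Count();
            counts.Add(count);
        });

        // The order of the results is not guaranteed to match the input order.
        return counts.ToList();
    }
}
```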
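If you want asynchronous concurrency, all you need is LINQ's Select and await Task.WhenAll. If you want to build a kind of pipeline that can do parallel processing, you can use TPL Dataflow, or you can use Parallel.ForEach with synchronous code only, after the asynchronous part is done.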
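The Select plus Task.WhenAll approach could be sketched roughly like this. The FolderReader class and ReadFolderAsync method are illustrative names, and the per-file reader is passed in as a delegate so that the sketch does not have to guess how an Animal is built.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

public static class FolderReader
{
    // Starts one asynchronous read per *.xml file and awaits them all together.
    public static async Task<T[]> ReadFolderAsync<T>(
        string folderPath,
        Func<string, CancellationToken, Task<T>> readOne,
        CancellationToken token)
    {
        if (!Directory.Exists(folderPath))
        {
            return Array.Empty<T>();
        }

        // Select only builds the sequence of tasks; no Parallel.ForEach and no shared list.
        IEnumerable<Task<T>> tasks = Directory
            .GetFiles(folderPath, "*.xml")
            .Select(path => readOne(path, token));

        // WhenAll enumerates the sequence once, which starts every read, then waits for all.
        return await Task.WhenAll(tasks);
    }
}
```

With the question's signature, a call would look something like `await FolderReader.ReadFolderAsync(folderPath, ReadAnimalXML, token)`, which yields an Animal[] with one entry per file, in the same order as the file list.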