sho*_*der 5 c# io multithreading file parallel.for
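I'm trying to download a large file (> 1 GB) from a single server over HTTP. To do this, I issue HTTP range requests in parallel, which lets me download the file in parallel.
When saving to disk, I take each response stream, open the same file as a file stream, seek to the range I want, and then write.
However, I find that all but one of my response streams time out. It looks like disk I/O cannot keep up with network I/O. Yet if I do the same thing but have each thread write to a separate file, it works fine.
For reference, here is my code that writes to the same file: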
    int numberOfStreams = 4;
    List<Tuple<int, int>> ranges = new List<Tuple<int, int>>();
    string fileName = @"C:\MyCoolFile.txt";
    Exception exception = null; // set by whichever worker fails
    //List populated here
    Parallel.For(0, numberOfStreams, (index, state) =>
    {
        try
        {
            HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("Some URL");
            // Request only this worker's byte range (bounds are inclusive).
            webRequest.AddRange(ranges[index].Item1, ranges[index].Item2);
            using (Stream responseStream = webRequest.GetResponse().GetResponseStream())
            {
                // Every worker opens the same file and shares write access.
                using (FileStream fileStream = File.Open(fileName, FileMode.OpenOrCreate, FileAccess.Write, FileShare.Write))
                {
                    fileStream.Seek(ranges[index].Item1, SeekOrigin.Begin);
                    byte[] buffer = new byte[64 * 1024];
                    int bytesRead;
                    while ((bytesRead = responseStream.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        if (state.IsStopped)
                        {
                            return;
                        }
                        fileStream.Write(buffer, 0, bytesRead);
                    }
                }
            }
        }
        catch (Exception e)
        {
            exception = e;
            state.Stop();
        }
    });
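The `//List populated here` step is not shown in the question. As a minimal sketch only, assuming the total file length is already known, one way to split it into inclusive byte ranges might look like the following. `RangeSplitter` and `Split` are hypothetical names, and `long` is used instead of the question's `int` so that offsets beyond 2 GB also fit.

```csharp
using System;
using System.Collections.Generic;

public static class RangeSplitter
{
    // Splits the bytes [0, totalLength) into up to `parts` contiguous ranges.
    // Both bounds of each range are inclusive, matching the HTTP Range header.
    public static List<Tuple<long, long>> Split(long totalLength, int parts)
    {
        if (totalLength <= 0) throw new ArgumentOutOfRangeException(nameof(totalLength));
        if (parts <= 0) throw new ArgumentOutOfRangeException(nameof(parts));

        var ranges = new List<Tuple<long, long>>(parts);
        long chunk = totalLength / parts;
        long remainder = totalLength % parts;
        long start = 0;
        for (int i = 0; i < parts; i++)
        {
            // The first `remainder` ranges each take one extra byte.
            long length = chunk + (i < remainder ? 1 : 0);
            if (length == 0) break; // more parts requested than bytes available
            ranges.Add(Tuple.Create(start, start + length - 1));
            start += length;
        }
        return ranges;
    }
}
```

For a 10-byte file and 4 streams, this yields the ranges 0-2, 3-5, 6-7 and 8-9.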
And here is the code that writes to multiple files:
    int numberOfStreams = 4;
    List<Tuple<int, int>> ranges = new List<Tuple<int, int>>();
    string fileName = @"C:\MyCoolFile.txt";
    Exception exception = null; // set by whichever worker fails
    //List populated here
    Parallel.For(0, numberOfStreams, (index, state) =>
    {
        try
        {
            HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("Some URL");
            // Request only this worker's byte range (bounds are inclusive).
            webRequest.AddRange(ranges[index].Item1, ranges[index].Item2);
            using (Stream responseStream = webRequest.GetResponse().GetResponseStream())
            {
                // Each worker writes to its own temporary file.
                using (FileStream fileStream = File.Open(fileName + "." + index + ".tmp", FileMode.OpenOrCreate, FileAccess.Write, FileShare.Write))
                {
                    fileStream.Seek(ranges[index].Item1, SeekOrigin.Begin);
                    byte[] buffer = new byte[64 * 1024];
                    int bytesRead;
                    while ((bytesRead = responseStream.Read(buffer, 0, buffer.Length)) > 0)
                    {
                        if (state.IsStopped)
                        {
                            return;
                        }
                        fileStream.Write(buffer, 0, bytesRead);
                    }
                }
            }
        }
        catch (Exception e)
        {
            exception = e;
            state.Stop();
        }
    });
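My question is: when writing to a single file from multiple threads, does C#/Windows perform some extra checks or operations that make file I/O slower than when writing to multiple files? Shouldn't all disk operations be bound by the disk's speed either way? Can anyone explain this behavior?
Thanks in advance!
Update: this is the error thrown by the source server:
"Unable to write data to the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond."
[System.IO.IOException]: "Unable to write data to the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond."
InnerException: "A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond"
Message: "Unable to write data to the transport connection: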
So, after trying all of the suggestions, I ended up using a MemoryMappedFile and opening a stream on each thread to write to the MemoryMappedFile:
    int numberOfStreams = 4;
    List<Tuple<int, int>> ranges = new List<Tuple<int, int>>();
    string fileName = @"C:\MyCoolFile.txt";
    Exception exception = null; // set by whichever worker fails
    //Ranges list populated here
    using (MemoryMappedFile mmf = MemoryMappedFile.CreateFromFile(fileName, FileMode.OpenOrCreate, null, fileSize.Value, MemoryMappedFileAccess.ReadWrite))
    {
        Parallel.For(0, numberOfStreams, index =>
        {
            try
            {
                HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("Some URL");
                // Request only this worker's byte range (bounds are inclusive).
                webRequest.AddRange(ranges[index].Item1, ranges[index].Item2);
                using (Stream responseStream = webRequest.GetResponse().GetResponseStream())
                {
                    // Map a view covering exactly this worker's range; + 1 because the end is inclusive.
                    using (MemoryMappedViewStream fileStream = mmf.CreateViewStream(ranges[index].Item1, ranges[index].Item2 - ranges[index].Item1 + 1, MemoryMappedFileAccess.Write))
                    {
                        responseStream.CopyTo(fileStream);
                    }
                }
            }
            catch (Exception e)
            {
                exception = e;
            }
        });
    }
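To try the memory-mapped approach without a server, the same pattern can be exercised against local streams. The following is a rough sketch under assumptions, not the exact code used above: `MappedParallelWriter`, `Write`, and the `openSource` callback are hypothetical names, the callback stands in for the HTTP response stream, and ranges are `long` with inclusive bounds.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Threading.Tasks;

public static class MappedParallelWriter
{
    // Copies each source stream into its own inclusive byte range of the file at `path`.
    // `openSource(index)` is a stand-in for the HTTP response stream of range `index`.
    public static void Write(string path, long fileSize, Tuple<long, long>[] ranges, Func<int, Stream> openSource)
    {
        // Creating the map with a capacity sizes the file up front.
        using (MemoryMappedFile mmf = MemoryMappedFile.CreateFromFile(
            path, FileMode.Create, null, fileSize, MemoryMappedFileAccess.ReadWrite))
        {
            Parallel.For(0, ranges.Length, index =>
            {
                long offset = ranges[index].Item1;
                long length = ranges[index].Item2 - ranges[index].Item1 + 1; // inclusive end

                using (Stream source = openSource(index))
                using (MemoryMappedViewStream view = mmf.CreateViewStream(
                    offset, length, MemoryMappedFileAccess.ReadWrite))
                {
                    // Each worker writes only inside its own view of the file.
                    source.CopyTo(view);
                }
            });
        }
    }
}
```

Unlike the snippet above, this sketch does not catch exceptions inside the loop, so a failure in any worker surfaces from `Parallel.For` as an `AggregateException` instead of being stored in a shared variable.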