Writing to a single file from multiple streams in C#

sho*_*der 5 c# io multithreading file parallel.for

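I am trying to download a large file (> 1 GB) from one server over HTTP. To do this, I issue HTTP range requests in parallel, which lets me download the file in parallel.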
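For illustration, the ranges for such parallel requests could be computed along these lines. This is only a sketch: `RangeSplitter` and `Split` are hypothetical names that do not appear in the question, and it assumes the total file size is already known (for example from a `Content-Length` header). It uses `long` offsets rather than the `int` tuples in the question's code so that files above 2 GB do not overflow.

```csharp
using System;
using System.Collections.Generic;

public static class RangeSplitter
{
    // Hypothetical helper: splits [0, fileSize - 1] into `parts` contiguous,
    // inclusive (start, end) byte ranges. The last range absorbs the remainder.
    public static List<Tuple<long, long>> Split(long fileSize, int parts)
    {
        if (parts <= 0) throw new ArgumentOutOfRangeException(nameof(parts));
        if (fileSize < parts) throw new ArgumentOutOfRangeException(nameof(fileSize));

        var ranges = new List<Tuple<long, long>>();
        long chunk = fileSize / parts;
        long start = 0;
        for (int i = 0; i < parts; i++)
        {
            // HTTP byte ranges are inclusive on both ends.
            long end = (i == parts - 1) ? fileSize - 1 : start + chunk - 1;
            ranges.Add(Tuple.Create(start, end));
            start = end + 1;
        }
        return ranges;
    }
}
```

For example, `RangeSplitter.Split(10, 4)` yields the ranges (0, 1), (2, 3), (4, 5) and (6, 9). Each pair can then be passed to `HttpWebRequest.AddRange(long, long)` before the request is sent.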

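To save to disk, I take each response stream, open a file stream on the same file, seek to the range I want, and then write.

However, I find that all but one of my response streams time out. It looks as if disk I/O cannot keep up with network I/O. Yet if I do the same thing but have each thread write to a separate file, it works fine.

For reference, here is my code that writes to the same file: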

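int numberOfStreams = 4;
// Inclusive (start, end) byte offsets, one entry per stream.
List<Tuple<int, int>> ranges = new List<Tuple<int, int>>();
string fileName = @"C:\MyCoolFile.txt";
Exception exception = null; // last failure seen by any worker
//List populated here
Parallel.For(0, numberOfStreams, (index, state) =>
{
    try
    {
        HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("Some URL");
        // Request only this worker's byte range (not shown in the original snippet).
        webRequest.AddRange(ranges[index].Item1, ranges[index].Item2);
        using (Stream responseStream = webRequest.GetResponse().GetResponseStream())
        {
            // Every worker opens its own handle on the same file; FileShare.Write allows that.
            using (FileStream fileStream = File.Open(fileName, FileMode.OpenOrCreate, FileAccess.Write, FileShare.Write))
            {
                // Position this handle at the start of the worker's range.
                fileStream.Seek(ranges[index].Item1, SeekOrigin.Begin);
                byte[] buffer = new byte[64 * 1024];
                int bytesRead;
                while ((bytesRead = responseStream.Read(buffer, 0, buffer.Length)) > 0)
                {
                    // Another worker failed and stopped the loop; give up early.
                    if (state.IsStopped)
                    {
                        return;
                    }
                    fileStream.Write(buffer, 0, bytesRead);
                }
            }
        }
    }
    catch (Exception e)
    {
        exception = e;
        state.Stop();
    }
});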

And here is the code that writes to multiple files:

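int numberOfStreams = 4;
// Inclusive (start, end) byte offsets, one entry per stream.
List<Tuple<int, int>> ranges = new List<Tuple<int, int>>();
string fileName = @"C:\MyCoolFile.txt";
Exception exception = null; // last failure seen by any worker
//List populated here
Parallel.For(0, numberOfStreams, (index, state) =>
{
    try
    {
        HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("Some URL");
        // Request only this worker's byte range (not shown in the original snippet).
        webRequest.AddRange(ranges[index].Item1, ranges[index].Item2);
        using (Stream responseStream = webRequest.GetResponse().GetResponseStream())
        {
            // Each worker writes to its own temporary file: MyCoolFile.txt.<index>.tmp
            using (FileStream fileStream = File.Open(fileName + "." + index + ".tmp", FileMode.OpenOrCreate, FileAccess.Write, FileShare.Write))
            {
                // Same seek as in the single-file version, so the data sits at its final offset.
                fileStream.Seek(ranges[index].Item1, SeekOrigin.Begin);
                byte[] buffer = new byte[64 * 1024];
                int bytesRead;
                while ((bytesRead = responseStream.Read(buffer, 0, buffer.Length)) > 0)
                {
                    // Another worker failed and stopped the loop; give up early.
                    if (state.IsStopped)
                    {
                        return;
                    }
                    fileStream.Write(buffer, 0, bytesRead);
                }
            }
        }
    }
    catch (Exception e)
    {
        exception = e;
        state.Stop();
    }
});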
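With the per-file variant, the parts still have to be combined into one file afterwards. The question does not show that step, so the following is only a sketch under stated assumptions: `PartMerger` and `Merge` are hypothetical names, part `i` is assumed to be named `fileName + "." + i + ".tmp"`, and each part is assumed to hold its data at offset `ranges[i].Item1` (because the code above seeks before writing). Ranges are taken as inclusive `long` pairs.

```csharp
using System;
using System.Collections.Generic;
using System.IO;

public static class PartMerger
{
    // Hypothetical helper: copies each part's byte range into the final file.
    // Assumes part i stores its bytes at offset ranges[i].Item1 inside
    // "<fileName>.<i>.tmp", mirroring the Seek in the download code.
    public static void Merge(string fileName, IList<Tuple<long, long>> ranges)
    {
        using (FileStream output = File.Open(fileName, FileMode.Create, FileAccess.Write, FileShare.None))
        {
            byte[] buffer = new byte[64 * 1024];
            for (int i = 0; i < ranges.Count; i++)
            {
                long start = ranges[i].Item1;
                long remaining = ranges[i].Item2 - start + 1; // inclusive range

                using (FileStream part = File.OpenRead(fileName + "." + i + ".tmp"))
                {
                    part.Seek(start, SeekOrigin.Begin);
                    output.Seek(start, SeekOrigin.Begin);
                    while (remaining > 0)
                    {
                        int toRead = (int)Math.Min(buffer.Length, remaining);
                        int read = part.Read(buffer, 0, toRead);
                        if (read == 0)
                        {
                            throw new EndOfStreamException("Part " + i + " is shorter than its range.");
                        }
                        output.Write(buffer, 0, read);
                        remaining -= read;
                    }
                }
            }
        }
    }
}
```

The merge runs on a single thread and copies the parts in order, so it costs one extra pass over the data after the download completes.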

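My question is: when writing to a single file from multiple threads, does C#/Windows perform some additional checks or operations that make file I/O slower than when writing to multiple files? Shouldn't all disk operations be bound by disk speed either way? Can anyone explain this behavior?

Thanks in advance!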

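Update: this is the error thrown by the source server:

"Unable to write data to the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond." [System.IO.IOException]: "Unable to write data to the transport connection: A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond." InnerException: "A connection attempt failed because the connected party did not properly respond after a period of time, or established connection failed because connected host has failed to respond" Message: "Unable to write data to the transport connection: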

sho*_*der 1

So, after trying all of the suggestions, I ended up using a MemoryMappedFile and opening a view stream on it from each thread to write that thread's range:

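int numberOfStreams = 4;
// Inclusive (start, end) byte offsets, one entry per stream.
List<Tuple<int, int>> ranges = new List<Tuple<int, int>>();
string fileName = @"C:\MyCoolFile.txt";
Exception exception = null; // last failure seen by any worker
//Ranges list populated here
// fileSize is not declared in this snippet; it holds the total length of the file and is obtained elsewhere.
using (MemoryMappedFile mmf = MemoryMappedFile.CreateFromFile(fileName, FileMode.OpenOrCreate, null, fileSize.Value, MemoryMappedFileAccess.ReadWrite))
{
    Parallel.For(0, numberOfStreams, index =>
    {
        try
        {
            HttpWebRequest webRequest = (HttpWebRequest)WebRequest.Create("Some URL");
            // Request only this worker's byte range (not shown in the original snippet).
            webRequest.AddRange(ranges[index].Item1, ranges[index].Item2);
            using (Stream responseStream = webRequest.GetResponse().GetResponseStream())
            {
                // A view stream covering exactly this worker's range: offset = start, size = end - start + 1.
                using (MemoryMappedViewStream fileStream = mmf.CreateViewStream(ranges[index].Item1, ranges[index].Item2 - ranges[index].Item1 + 1, MemoryMappedFileAccess.Write))
                {
                    responseStream.CopyTo(fileStream);
                }
            }
        }
        catch (Exception e)
        {
            exception = e;
        }
    });
}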
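To try the memory-mapped approach without a server, the same pattern can be exercised with in-memory data standing in for the HTTP responses. This is a minimal sketch, not the code used above: `MmfDemo` and `WriteParts` are hypothetical names, each `MemoryStream` plays the role of one response stream, and the parts are assumed to be non-empty.

```csharp
using System;
using System.IO;
using System.IO.MemoryMappedFiles;
using System.Threading.Tasks;

public static class MmfDemo
{
    // Hypothetical helper: writes parts[i] into its own region of one file,
    // each region through a separate view stream, in parallel.
    public static void WriteParts(string path, byte[][] parts)
    {
        // Compute where each part starts and the total file size.
        long[] offsets = new long[parts.Length];
        long total = 0;
        for (int i = 0; i < parts.Length; i++)
        {
            offsets[i] = total;
            total += parts[i].Length;
        }

        // The capacity argument sizes the backing file up front.
        using (MemoryMappedFile mmf = MemoryMappedFile.CreateFromFile(
            path, FileMode.Create, null, total, MemoryMappedFileAccess.ReadWrite))
        {
            Parallel.For(0, parts.Length, index =>
            {
                // MemoryStream stands in for the HTTP response stream.
                using (var source = new MemoryStream(parts[index]))
                using (MemoryMappedViewStream view = mmf.CreateViewStream(
                    offsets[index], parts[index].Length, MemoryMappedFileAccess.Write))
                {
                    source.CopyTo(view);
                }
            });
        }
    }
}
```

Because each view stream is bounded to its own region, a worker that receives more bytes than expected fails with an exception instead of overwriting a neighbouring range.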