How to download a file in parallel using HttpWebRequest

Tar*_*rek 10 c# task-parallel-library

I'm trying to build a program like IDM that downloads parts of a file simultaneously.
The tool I'm using for this is the TPL in C# .NET 4.5.
But I'm running into a problem when using Tasks for the parallel operation.
The sequential version works fine and downloads the file correctly.
The parallel version using Tasks also works, until something strange happens:
I create 4 tasks with Factory.StartNew(), giving each one a start and an end position; each task downloads its part of the file and returns it as a byte[]. Everything goes well and the tasks work correctly, but at some point execution freezes: the program simply stops and nothing further happens.
Implementation of the parallel function:

static void DownloadPartsParallel()
    {

        string uriPath = "http://mschnlnine.vo.llnwd.net/d1/pdc08/PPTX/BB01.pptx";
        Uri uri = new Uri(uriPath);
        long l = GetFileSize(uri);
        Console.WriteLine("Size={0}", l);
        int granularity = 4;
        byte[][] arr = new byte[granularity][];
        Task<byte[]>[] tasks = new Task<byte[]>[granularity];
        long part = l / granularity;
        //byte indices run 0..l-1; make the ranges an exact, non-overlapping
        //partition of the file, with the last task taking the remainder
        tasks[0] = Task<byte[]>.Factory.StartNew(() => DownloadPartOfFile(uri, 0, part - 1));
        tasks[1] = Task<byte[]>.Factory.StartNew(() => DownloadPartOfFile(uri, part, 2 * part - 1));
        tasks[2] = Task<byte[]>.Factory.StartNew(() => DownloadPartOfFile(uri, 2 * part, 3 * part - 1));
        tasks[3] = Task<byte[]>.Factory.StartNew(() => DownloadPartOfFile(uri, 3 * part, l - 1));

        arr[0] = tasks[0].Result;
        arr[1] = tasks[1].Result;
        arr[2] = tasks[2].Result;
        arr[3] = tasks[3].Result;
        //the using block guarantees the file is flushed and closed, and
        //Stream.Write is far faster than writing one byte at a time
        using (Stream localStream = File.Create("E:\\a\\" + Path.GetFileName(uri.LocalPath)))
        {
            for (int i = 0; i < granularity; i++)
            {
                localStream.Write(arr[i], 0, arr[i].Length);
            }
        }
    }
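Getting these range boundaries right is fiddly: HTTP ranges are inclusive, so each part must end one byte before the next begins, and the last part must stop at size - 1. The splitting logic is language-independent; here is a small Python sketch of it (the `split_ranges` helper is hypothetical, just for illustration):

```python
def split_ranges(size, parts):
    """Split `size` bytes into `parts` inclusive (start, end) ranges.

    The ranges are non-overlapping and together cover exactly 0..size-1;
    the last part absorbs any remainder. Each pair maps directly to an
    HTTP Range header, i.e. HttpWebRequest.AddRange(start, end).
    """
    piece = size // parts
    ranges = []
    for i in range(parts):
        start = i * piece
        end = size - 1 if i == parts - 1 else start + piece - 1
        ranges.append((start, end))
    return ranges

print(split_ranges(1003, 4))
# [(0, 249), (250, 499), (500, 749), (750, 1002)]
```

Note that the spans chain exactly: each range starts one byte after the previous one ends, and the lengths sum to the file size.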

Implementation of the DownloadPartOfFile function:

public static byte[] DownloadPartOfFile(Uri fileUrl, long from, long to)
    {
        int bytesProcessed = 0;
        BinaryReader reader = null;
        WebResponse response = null;
        byte[] bytes = new byte[(to - from) + 1];

        try
        {
            HttpWebRequest request = (HttpWebRequest)WebRequest.Create(fileUrl);
            request.AddRange(from, to);
            request.ReadWriteTimeout = int.MaxValue;
            request.Timeout = int.MaxValue;
            if (request != null)
            {
                response = request.GetResponse();
                if (response != null)
                {
                    reader = new BinaryReader(response.GetResponseStream());
                    int bytesRead;
                    do
                    {
                        byte[] buffer = new byte[1024];
                        bytesRead = reader.Read(buffer, 0, buffer.Length);
                        if (bytesRead == 0)
                        {
                            break;
                        }
                        //copy only the bytes actually read; no need to resize the buffer
                        Array.Copy(buffer, 0, bytes, bytesProcessed, bytesRead);
                        bytesProcessed += bytesRead;
                        Console.WriteLine(Thread.CurrentThread.ManagedThreadId + ",Downloading" + bytesProcessed);
                    } while (bytesRead > 0);
                }
            }
        }
        catch (Exception e)
        {
            Console.WriteLine(e.Message);
        }
        finally
        {
            if (response != null) response.Close();
            if (reader != null) reader.Close();
        }

        return bytes;
    }
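The chunked read loop above is a common pattern: read a fixed-size chunk, stop at end of stream, copy each chunk at the running offset. As a language-independent illustration, a minimal Python sketch of the same loop (the `read_exact` helper is hypothetical, not part of the program above):

```python
import io

def read_exact(stream, length, chunk_size=1024):
    """Read up to `length` bytes from a file-like object in fixed-size
    chunks, copying each chunk at the running offset instead of resizing
    buffers. Stops early if the stream ends first."""
    out = bytearray(length)
    done = 0
    while done < length:
        chunk = stream.read(min(chunk_size, length - done))
        if not chunk:
            break  # end of stream
        out[done:done + len(chunk)] = chunk
        done += len(chunk)
    return bytes(out[:done])

data = bytes(range(256)) * 10  # 2560 bytes of test data
print(len(read_exact(io.BytesIO(data), 1003)))
# 1003
```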

I tried to work around it by setting the read/write timeout and the request timeout to int.MaxValue, which is why the program freezes instead of failing; if I don't do that, a timeout exception is thrown in DownloadPartsParallel.
So is there a solution, or any other suggestion that might help? Thanks.

Bri*_*chl 2

OK, here's how I would do what you're attempting. It's basically the same idea, just implemented differently.

public static void DownloadFileInPiecesAndSave()
{
    //test
    var uri = new Uri("http://www.w3.org/");

    var bytes = DownloadInPieces(uri, 4);
    File.WriteAllBytes(@"c:\temp\RangeDownloadSample.html", bytes);
}

/// <summary>
/// Download a file via HTTP in multiple pieces using a Range request.
/// </summary>
public static byte[] DownloadInPieces(Uri uri, uint numberOfPieces)
{
    //I'm just fudging this for expository purposes. In reality you would probably want to do a HEAD request to get total file size.
    ulong totalFileSize = 1003; 

    var pieceSize = totalFileSize / numberOfPieces;

    List<Task<byte[]>> tasks = new List<Task<byte[]>>();
    for (uint i = 0; i < numberOfPieces; i++)
    {
        var start = i * pieceSize;
        //AddRange is inclusive, so end each piece one byte before the next begins;
        //the last piece runs to the final byte and absorbs the remainder
        var end = (i == numberOfPieces - 1) ? totalFileSize - 1 : start + pieceSize - 1;
        tasks.Add(DownloadFilePiece(uri, start, end));
    }

    Task.WaitAll(tasks.ToArray());

    //This is probably not the single most efficient way to combine byte arrays, but it is succinct...
    return tasks.SelectMany(t => t.Result).ToArray();
}

private static async Task<byte[]> DownloadFilePiece(Uri uri, ulong rangeStart, ulong rangeEnd)
{
    try
    {
        var request = (HttpWebRequest)WebRequest.Create(uri);
        request.AddRange((long)rangeStart, (long)rangeEnd);
        request.Proxy = WebRequest.DefaultWebProxy; //WebProxy.GetDefaultProxy() is obsolete

        using (var response = await request.GetResponseAsync())
        using (var responseStream = response.GetResponseStream())
        using (var memoryStream = new MemoryStream((int)(rangeEnd - rangeStart + 1))) //inclusive range holds end - start + 1 bytes
        {
            await responseStream.CopyToAsync(memoryStream);
            return memoryStream.ToArray();
        }
    }
    catch (WebException wex)
    {
        //Do lots of error handling here, lots of things can go wrong
        //In particular watch for 416 Requested Range Not Satisfiable
        return null;
    }
    catch (Exception ex)
    {
        //handle the unexpected here...
        return null;
    }
}

Note that I glossed over a lot of things here, such as:

  • Detecting whether the server supports range requests at all. If it doesn't, the server will return the entire content for every request and we will end up with several copies of it.
  • Handling HTTP errors of any kind. What if the third request fails?
  • Retry logic
  • Timeouts
  • Working out how big the file actually is
  • Checking whether the file is big enough to warrant multiple requests, and if so, how many? It's probably not worth doing this in parallel for files under 1 or 2 MB, but you'd have to test that
  • Most likely a bunch of other things as well.
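The first and fifth points usually come down to a single HEAD request: a server that supports byte ranges advertises `Accept-Ranges: bytes`, and `Content-Length` gives the size you need to split the download. A minimal Python sketch of just the header check (`probe_range_support` is a hypothetical helper operating on an already-fetched headers dict; issuing the actual HEAD request is omitted):

```python
def probe_range_support(headers):
    """Decide from HEAD-response headers whether a parallel range download
    is possible, and how large the file is.

    Returns (supports_ranges, content_length). `headers` is a plain dict
    mapping response header names to values.
    """
    supports = headers.get("Accept-Ranges", "").lower() == "bytes"
    length = int(headers.get("Content-Length", "0"))
    return supports, length

print(probe_range_support({"Accept-Ranges": "bytes", "Content-Length": "1003"}))
# (True, 1003)
```

If the first element is False, fall back to a single plain GET rather than issuing range requests.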

So you have a long way to go before this is ready for production, but it should give you an idea of where to start.