"The specified block list is invalid" when uploading blobs in parallel

Jer*_*Gee 7 azure blobstorage

I have a (fairly large) Azure application that uploads (fairly large) files to Azure blob storage in parallel.

In a few percent of the uploads I get an exception:

The specified block list is invalid.

System.Net.WebException: The remote server returned an error: (400) Bad Request.

This happens when we run a fairly innocuous-looking piece of code that uploads a blob to Azure storage in parallel:

    public static void UploadBlobBlocksInParallel(this CloudBlockBlob blob, FileInfo file) 
    {
        blob.DeleteIfExists();
        blob.Properties.ContentType = file.GetContentType();
        blob.Metadata["Extension"] = file.Extension;

        byte[] data = File.ReadAllBytes(file.FullName);

        int numberOfBlocks = (data.Length / BlockLength) + 1;
        string[] blockIds = new string[numberOfBlocks];

        Parallel.For(
            0, 
            numberOfBlocks, 
            x =>
        {
            string blockId = Convert.ToBase64String(Guid.NewGuid().ToByteArray());
            int currentLength = Math.Min(BlockLength, data.Length - (x * BlockLength));

            using (var memStream = new MemoryStream(data, x * BlockLength, currentLength))
            {
                var blockData = memStream.ToArray();
                var md5Check = System.Security.Cryptography.MD5.Create();
                var md5Hash = md5Check.ComputeHash(blockData, 0, blockData.Length);

                blob.PutBlock(blockId, memStream, Convert.ToBase64String(md5Hash));
            }

            blockIds[x] = blockId;
        });

        byte[] fileHash  = _md5Check.ComputeHash(data, 0, data.Length);
        blob.Metadata["Checksum"] = BitConverter.ToString(fileHash).Replace("-", string.Empty);
        blob.Properties.ContentMD5 = Convert.ToBase64String(fileHash);

        data = null;
        blob.PutBlockList(blockIds);
        blob.SetMetadata();
        blob.SetProperties();
    }

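It is all rather mysterious; I would expect the algorithm we use to compute the block list to produce strings that are all the same length...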
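As a side check on the block bookkeeping in the code above, here is a minimal sketch that touches no storage API. The names BlockPlan, CountBlocks, GuidBlockId and IndexBlockId are hypothetical, made up for illustration. It shows two things: a Base64-encoded GUID is always 24 characters, so the IDs are indeed equal in length, and (data.Length / BlockLength) + 1 yields a trailing zero-length block whenever the file size is an exact multiple of the block size. Whether that empty block is what triggers the 400 here is an assumption I have not verified.

```csharp
using System;
using System.Text;

// Hypothetical helper for illustration only; not part of the Azure SDK.
public static class BlockPlan
{
    // Ceiling division: the number of non-empty blocks needed for totalLength bytes.
    // (totalLength / blockLength) + 1 would add an empty block when the length
    // is an exact multiple of blockLength.
    public static int CountBlocks(long totalLength, int blockLength)
    {
        if (totalLength < 0) throw new ArgumentOutOfRangeException(nameof(totalLength));
        if (blockLength <= 0) throw new ArgumentOutOfRangeException(nameof(blockLength));
        return (int)((totalLength + blockLength - 1) / blockLength);
    }

    // The scheme used in the question: 16 GUID bytes always encode to 24 Base64 characters.
    public static string GuidBlockId()
    {
        return Convert.ToBase64String(Guid.NewGuid().ToByteArray());
    }

    // An alternative scheme: a zero-padded index, Base64-encoded.
    // IDs are equal in length and map back to the block's position.
    public static string IndexBlockId(int index)
    {
        return Convert.ToBase64String(Encoding.UTF8.GetBytes(index.ToString("D6")));
    }
}
```

With a block length of 4, CountBlocks(8, 4) returns 2, where the formula in the question would give 3 with a final block of length 0. Index-based IDs are a design choice rather than a requirement; their main benefit is that a failed block can be retried under the same ID.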

And*_*ous 6

We ran into a similar problem, but we were not specifying any block IDs, or even using block IDs anywhere. In our case, we were using:

    using (CloudBlobStream stream = blob.OpenWrite(condition))
    {
        //// [write data to stream]

        stream.Flush();
        stream.Commit();
    }

This caused The specified block list is invalid. errors under parallel load. Switching this code to buffer the data in memory and then call the UploadFromStream(…) method fixed the problem:

    using (MemoryStream stream = new MemoryStream())
    {
        //// [write data to stream]

        stream.Seek(0, SeekOrigin.Begin);
        blob.UploadFromStream(stream, condition);
    }

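Obviously this can hurt memory usage if too much data is buffered, but this is a simplification. One thing to note is that UploadFromStream(...) uses Commit() in some cases, but checks additional conditions to determine the best method to use.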
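One detail in the buffered version is easy to miss: the Seek back to the start. The sketch below uses a stand-in consumer instead of the real upload call and shows what happens without it. BufferedUploadSketch, ConsumeFromCurrentPosition and BufferThenConsume are hypothetical names, not Azure APIs, and the sketch assumes the upload reads from the stream's current position, as stream-consuming methods generally do.

```csharp
using System;
using System.IO;

// Hypothetical illustration; nothing here talks to Azure storage.
public static class BufferedUploadSketch
{
    // Stand-in for an upload: reads from the stream's current position
    // to the end and returns the number of bytes consumed.
    public static long ConsumeFromCurrentPosition(Stream source)
    {
        long total = 0;
        byte[] buffer = new byte[4096];
        int read;
        while ((read = source.Read(buffer, 0, buffer.Length)) > 0)
        {
            total += read;
        }
        return total;
    }

    // Buffers the payload in memory, optionally rewinds, then "uploads" it.
    public static long BufferThenConsume(byte[] payload, bool rewind)
    {
        using (var stream = new MemoryStream())
        {
            stream.Write(payload, 0, payload.Length);

            // After writing, the position sits at the end of the buffer.
            if (rewind)
            {
                stream.Seek(0, SeekOrigin.Begin);
            }

            return ConsumeFromCurrentPosition(stream);
        }
    }
}
```

With a 10-byte payload, BufferThenConsume consumes 10 bytes when rewind is true and 0 bytes when it is false, so under the stated assumption a forgotten Seek would send an empty body.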