How to get more than 10 inserts per second with Azure Table Storage

gab*_*bba 11 c# azure azure-table-storage

I wrote a simple WorkerRole that adds test data to a table. The insert code looks like this.

var TableClient = this.StorageAccount.CreateCloudTableClient();
TableClient.CreateTableIfNotExist(TableName);
var Context = TableClient.GetDataServiceContext();

this.Context.AddObject(TableName, obj);
this.Context.SaveChanges();

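This code runs for every client request. I tested with 1-30 client threads and with many instances of different sizes. I don't know what I'm doing wrong, but I can't get past 10 inserts per second. If anyone knows how to increase the speed, please let me know. Thanks.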

UPDATE

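  • Removing CreateTableIfNotExist made no difference in my insert test.
  • Switching to expect100Continue="false" useNagleAlgorithm="false" helped only briefly: the insert rate jumped to 30-40 inserts per second (ips), but after 30 seconds it dropped to 6 ips with a 50% timeout rate.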

San*_*tia 30

To speed things up, you should use batch transactions (Entity Group Transactions), which let you commit up to 100 items in a single request:

foreach (var item in myItemsToAdd)
{
    this.Context.AddObject(TableName, item);
}
this.Context.SaveChanges(SaveChangesOptions.Batch);

You can combine this with Partitioner.Create (+ AsParallel) to send multiple requests on different threads/cores, each carrying a batch of 100 items, to make things really fast.
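As a rough illustration of that combination, here is a minimal sketch that cuts a list into index ranges of at most 100 items with Partitioner.Create and hands each range to a worker thread. The names ParallelBatchSketch, SendInParallel and the sendBatch callback are hypothetical and not part of the storage client. In real code the callback would create its own context (one per thread, as a context should not be shared across threads), call AddObject for each item, and then call SaveChanges(SaveChangesOptions.Batch).

```csharp
using System;
using System.Collections.Concurrent;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ParallelBatchSketch
{
    // Splits `items` into index ranges of at most `batchSize` entries and
    // passes each range to `sendBatch` in parallel. Returns the number of
    // batches that were handed to the callback.
    // `sendBatch` is a hypothetical placeholder for the real batch insert.
    public static int SendInParallel<T>(
        IList<T> items,
        Action<IReadOnlyList<T>> sendBatch,
        int batchSize = 100,
        int maxParallelism = 8)
    {
        // Partitioner.Create(from, to, size) requires from < to.
        if (items.Count == 0) return 0;

        int batches = 0;
        var ranges = Partitioner.Create(0, items.Count, batchSize);
        var options = new ParallelOptions { MaxDegreeOfParallelism = maxParallelism };

        Parallel.ForEach(ranges, options, range =>
        {
            // range.Item1 is inclusive, range.Item2 is exclusive.
            var batch = new List<T>(range.Item2 - range.Item1);
            for (int i = range.Item1; i < range.Item2; i++)
            {
                batch.Add(items[i]);
            }

            sendBatch(batch);
            Interlocked.Increment(ref batches);
        });

        return batches;
    }
}
```

This sketch only splits by count. It assumes every item in a range belongs to the same partition, which a batch transaction requires.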

But before doing any of this, read up on the limitations of batch transactions (100 items, 1 partition per transaction, ...).
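To respect the two limits named above, entities can be grouped by partition key first and then cut into chunks of at most 100. The following is only a sketch: BatchPlanner, PlanBatches and the partitionKeyOf selector are hypothetical names, and it checks nothing beyond those two limits.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class BatchPlanner
{
    // Limit stated in the answer: at most 100 items per batch transaction.
    public const int MaxEntitiesPerBatch = 100;

    // Returns batches in which every entity shares one partition key and
    // no batch holds more than MaxEntitiesPerBatch entities.
    // `partitionKeyOf` is a hypothetical selector, e.g. e => e.PartitionKey.
    public static List<List<T>> PlanBatches<T>(
        IEnumerable<T> entities,
        Func<T, string> partitionKeyOf)
    {
        var batches = new List<List<T>>();

        foreach (var partition in entities.GroupBy(partitionKeyOf))
        {
            List<T> current = null;
            foreach (var entity in partition)
            {
                // Start a new batch for each partition and whenever the
                // current batch is full.
                if (current == null || current.Count == MaxEntitiesPerBatch)
                {
                    current = new List<T>(MaxEntitiesPerBatch);
                    batches.Add(current);
                }
                current.Add(entity);
            }
        }

        return batches;
    }
}
```

Each returned batch could then be added to a fresh context and saved with SaveChangesOptions.Batch. Note that if all data lives in a single partition, as with the "Test" partition key used in the examples here, this yields one sequence of 100-item batches for that partition.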

Update:

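Since you can't use transactions here, here are some other tips. Take a look at this MSDN thread about improving performance when using table storage. I wrote some code to show you the difference: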

    private static void SequentialInserts(CloudTableClient client)
    {
        var context = client.GetDataServiceContext();
        Trace.WriteLine("Starting sequential inserts.");

        var stopwatch = new Stopwatch();
        stopwatch.Start();

        for (int i = 0; i < 1000; i++)
        {
            Trace.WriteLine(String.Format("Adding item {0}. Thread ID: {1}", i, Thread.CurrentThread.ManagedThreadId));
            context.AddObject(TABLENAME, new MyEntity()
            {
                Date = DateTime.UtcNow,
                PartitionKey = "Test",
                RowKey = Guid.NewGuid().ToString(),
                Text = String.Format("Item {0} - {1}", i, Guid.NewGuid().ToString())
            });
            context.SaveChanges();
        }

        stopwatch.Stop();
        Trace.WriteLine("Done in: " + stopwatch.Elapsed.ToString());
    }

So, the first time I ran this I got the following output:

Starting sequential inserts.
Adding item 0. Thread ID: 10
Adding item 1. Thread ID: 10
..
Adding item 999. Thread ID: 10
Done in: 00:03:39.9675521

Adding 1000 items takes more than 3 minutes. Next, I changed my app.config based on the tips from the MSDN forum (maxconnection should be 12 * the number of CPU cores):

  <system.net>
    <settings>
      <servicePointManager expect100Continue="false" useNagleAlgorithm="false"/>
    </settings>
    <connectionManagement>
      <add address = "*" maxconnection = "48" />
    </connectionManagement>
  </system.net>
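
For reference, roughly the same settings can also be applied in code through System.Net.ServicePointManager. This is not what was used for the measurements in this answer, only a sketch of an alternative; the class name ConnectionTuning is hypothetical. It targets the .NET Framework, and newer .NET versions flag ServicePointManager as obsolete.

```csharp
using System;
using System.Net;

public static class ConnectionTuning
{
    // Call once at startup, before the first request is sent, because the
    // values are picked up when a connection group is first created.
    public static void Apply()
    {
        // Skip the "Expect: 100-continue" handshake before each request body.
        ServicePointManager.Expect100Continue = false;

        // Disable Nagle's algorithm so small requests are not delayed.
        ServicePointManager.UseNagleAlgorithm = false;

        // Rule of thumb from the answer: 12 connections per CPU core.
        ServicePointManager.DefaultConnectionLimit = 12 * Environment.ProcessorCount;
    }
}
```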

After running the application again I got this output:

Starting sequential inserts.
Adding item 0. Thread ID: 10
Adding item 1. Thread ID: 10
..
Adding item 999. Thread ID: 10
Done in: 00:00:18.9342480

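From 3 minutes down to 18 seconds. What a difference! But we can do even better. Here is some code that inserts all items using a Partitioner (the inserts happen in parallel):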

    private static void ParallelInserts(CloudTableClient client)
    {            
        Trace.WriteLine("Starting parallel inserts.");

        var stopwatch = new Stopwatch();
        stopwatch.Start();

        var partitioner = Partitioner.Create(0, 1000, 10);
        var options = new ParallelOptions { MaxDegreeOfParallelism = 8 };

        Parallel.ForEach(partitioner, options, range =>
        {
            var context = client.GetDataServiceContext();
            for (int i = range.Item1; i < range.Item2; i++)
            {
                Trace.WriteLine(String.Format("Adding item {0}. Thread ID: {1}", i, Thread.CurrentThread.ManagedThreadId));
                context.AddObject(TABLENAME, new MyEntity()
                {
                    Date = DateTime.UtcNow,
                    PartitionKey = "Test",
                    RowKey = Guid.NewGuid().ToString(),
                    Text = String.Format("Item {0} - {1}", i, Guid.NewGuid().ToString())
                });
                context.SaveChanges();
            }
        });

        stopwatch.Stop();
        Trace.WriteLine("Done in: " + stopwatch.Elapsed.ToString());
    }

The result:

Starting parallel inserts.
Adding item 0. Thread ID: 10
Adding item 10. Thread ID: 18
Adding item 999. Thread ID: 16
..
Done in: 00:00:04.6041978

Voila, we went from 3m39s down to 18s, and now we're even down to 4s.
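
To relate these timings to the 10 inserts per second mentioned in the question, the elapsed times can be converted into a rate. The helper below is a small sketch with hypothetical names (Throughput, InsertsPerSecond); it divides the number of inserts by the elapsed time taken from the "Done in" lines.

```csharp
using System;
using System.Globalization;

public static class Throughput
{
    // Converts a "Done in" value such as "00:00:18.9342480" into an
    // inserts-per-second rate for the given number of inserts.
    public static double InsertsPerSecond(int inserts, string elapsed)
    {
        TimeSpan duration = TimeSpan.Parse(elapsed, CultureInfo.InvariantCulture);
        return inserts / duration.TotalSeconds;
    }
}
```

For the three runs of 1000 inserts above (00:03:39.9675521, 00:00:18.9342480 and 00:00:04.6041978) this works out to roughly 4.5, 53 and 217 inserts per second.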

  • I will update my answer with some tips. (2 upvotes)