获取对数组内部结构的引用

Question

获取对数组内部结构的引用

我想修改一个结构的字段,该结构在一个数组内,而不必设置整个结构.在下面的示例中,我想在数组中设置元素543的一个字段.我不想复制整个元素(因为复制MassiveStruct会损害性能).

class P
{
    struct S
    {
      public int a;
      public MassiveStruct b;
    }

    void f(ref S s)
    {
      s.a = 3;
    }

    public static void Main()
    {
      S[] s = new S[1000];
      f(ref s[543]);  // Error: An object reference is required for the non-static field, method, or property
    }
}

Run Code Online (Sandbox Code Playgroud)

有没有办法在C#中做到这一点？或者我总是要从数组中复制整个结构,修改副本,然后将修改后的副本放回到数组中.

Answer 1

Gle*_*den 55

[ 编辑2017: 在本帖末尾看到关于C#7的重要评论 ]

经过多年与这个确切问题的斗争,我将总结我发现的一些技术和解决方案.除了文体品味之外,结构阵列实际上是C#中可用的大容量存储方法.如果您的应用程序在高吞吐量条件下真正处理数百万个中型对象,则没有其他可管理的替代方案.

我同意@kaalus对象标题和GC压力可以快速安装; 在解析和/或生成冗长的自然语言句子时,我的NLP语法处理系统可以在不到一分钟的时间内操纵8-10千兆字节(或更多)的结构分析.提示合唱:"C#不适用于此类问题...","切换到汇编语言......","绕线FPGA ......"等.

好吧,让我们进行一些测试.首先,要全面了解价值型(struct)管理问题以及class与struct权衡利弊的关系,这一点至关重要.当然还有装箱,钉扎/不安全代码,固定缓冲区GCHandle, IntPtr,等等,但最重要的是在我看来,明智地使用托管指针(又名 "内部指针").

Your mastery of these topics will also include knowledge of the fact that, should you happen to include in your struct one or more references to managed types (as opposed to just blittable primitives), then your options for accessing the struct with unsafe pointers are greatly reduced. This is not a problem for the managed pointer method I'll mention below. So generally, including object references is fine and doesn't change much regarding this discussion.

哦,如果确实需要保留unsafe访问权限,可以使用GCHandle"正常"模式无限期地在结构中存储对象引用.幸运的是,将GCHandle结构放入您的结构不会触发不安全访问禁止.(注意,GCHandle它本身就是一个值类型,你甚至可以定义并去城里

var gch = GCHandle.Alloc("spookee",GCHandleType.Normal);
GCHandle* p = &gch;
String s = (String)p->Target;

Run Code Online (Sandbox Code Playgroud)

......等等.作为一种值类型,GCHandle直接映射到您的结构中,但显然它存储的任何引用类型都不是.它们在堆中,不包含在数组的物理布局中.最后在GCHandle上,请注意它的复制语义,因为如果你最终没有Free分配每个GCHandle ,你将会有内存泄漏.

@Ani提醒我们,有些人认为可变struct实例是"邪恶的",但事实上他们很容易发生意外事故.确实,OP的例子......

s[543].a = 3;

Run Code Online (Sandbox Code Playgroud)

...说明了我们正在努力实现的目标:现场访问我们的数据记录.(注意:参考类型' class'实例数组的语法具有相同的外观,但在本文中我们仅在此讨论用户定义的值类型的非锯齿状数组.)对于我自己的程序,我通常如果我遇到一个超大的blittable结构(意外地)从其数组存储行中完全成像,则认为这是一个严重的错误:

rec no_no = s[543];   // don't do
no_no.a = 3           // it like this

As far as how big (wide) your struct can or should be, it won't matter, because you are going to be careful never to let the struct do what was just shown in the previous example, that is, migrate in-toto out of its embedding array. In fact, this points to a fundamental premise of this entire article:

rule:
For arrays of struct, always access individual fields in-situ; never "mention" the struct instance itself in-toto.

Unfortunately, the C# language offers no way to systematically flag or forbid code that violates this rule, so success here generally depends on careful programming discipline.

由于我们的"巨型结构"从未从阵列中成像,因此它们实际上只是内存模板.换句话说,右边的想法是设想的struct作为覆盖数组元素.我们总是认为每个都是一个空的"内存模板",而不是可传输或便携式封装器或数据容器.对于数组绑定的"巨型"值类型,我们永远不想调用" struct"的最大存在特征,即传值.

例:

public struct rec
{
    public int a, b, c, d, e, f;
}

Run Code Online (Sandbox Code Playgroud)

在这里,我们覆盖6 int秒,每个"记录"总共24个字节.您需要考虑并注意包装选项以获得对齐友好的尺寸.但过多的填充可能会削减内存预算:因为更重要的考虑因素是非LOH对象的85,000字节限制.确保您的记录大小乘以预期的行数不超过此限制.

So for the example given here, you would be best advised to keep your array of recs to no more 3,000 rows each. Hopefully your application can be designed around this sweet-spot. This is not so limiting when you remember that--alternatively--each row would be a separate garbage-collected object, instead of just the one array. You've cut your object proliferation by a three orders of magnitude, which is good for a day's work. Thus the .NET environment here is strongly steering us with a pretty specific constraint: it seems that if you target your app's memory design towards monolithic allocations in the 30-70 KB range, then you really can get away with lots and lots of them, and in fact you'll instead become limited by a thornier set of performance bottlenecks (namely, bandwidth on the hardware bus).

所以现在你有一个.NET引用类型(数组),在物理上连续的表格存储中有3,000个6元组.首先,我们必须非常小心,不要 "捡起"其中一个结构.正如Jon Skeet在上面指出的那样,"大规模的结构通常会比类更糟糕",这绝对是正确的.没有更好的方法来瘫痪你的内存总线,而不是开始在威利不知道的地方投掷丰富的价值类型.

因此,让我们利用结构数组中不常提到的方面:整个数组的所有行的所有对象(以及那些对象或结构的字段)始终初始化为其默认值.您可以开始在数组中的任何位置或列(字段)中一次一个地插入值.您可以将某些字段保留为默认值,或者替换相邻字段而不会打扰中间的字段.在使用之前,堆栈驻留(局部变量)结构需要烦人的手动初始化.

有时很难保持逐场的方法,因为.NET总是试图让我们在整个new'd-up结构中爆炸- 但对我来说,这种所谓的"初始化"只是违反我们的禁忌(反对从阵列中取出整个结构),以不同的形式.

现在我们谈到问题的症结所在.显然,原地访问表格数据可以最大限度地减少数据重组繁忙工作.但通常这是一个不方便的麻烦.由于边界检查,数组访问在.NET中可能很慢.那么如何做你保持一个"工作"指针到一个数组的内部,从而避免使系统不断重新计算索引偏移.

评估

Let's evaluate the performance of five different methods for the manipulation of individual fields within value-type array storage rows. The test below is designed to measure the efficiency of intensively accessing the data fields of a struct positioned at some array index, in situ--that is, "where they lie," without extracting or rewriting the entire struct (array element). Five different access methods are compared, with all other factors held the same.

The five methods are as follows:

Normal, direct array access via square-brackets and the field specifier dot. Note that, in .NET, arrays are a special and unique primitive of the Common Type System. As @Ani mentions above, this syntax cannot be used to change an individual field of a reference instance, such as a list, even when it is parameterized with a value-type.
Using the undocumented __makeref C# language keyword.
Managed pointer via a delegate which uses the ref keyword
"Unsafe" pointers
Same as #3, but using a C# function instead of a delegate.

Before I give the C# test results, here's the test harness implementation. These tests were run on .NET 4.5, an AnyCPU release build running on x64, Workstation gc. (Note that, because the test isn't interested the efficiency of allocating and de-allocating the array itself, the LOH consideration mentioned above does not apply.)

const int num_test = 100000;
static rec[] s1, s2, s3, s4, s5;
static long t_n, t_r, t_m, t_u, t_f;
static Stopwatch sw = Stopwatch.StartNew();
static Random rnd = new Random();

static void test2()
{
    s1 = new rec[num_test];
    s2 = new rec[num_test];
    s3 = new rec[num_test];
    s4 = new rec[num_test];
    s5 = new rec[num_test];

    for (int x, i = 0; i < 5000000; i++)
    {
        x = rnd.Next(num_test);
        test_m(x); test_n(x); test_r(x); test_u(x); test_f(x);
        x = rnd.Next(num_test);
        test_n(x); test_r(x); test_u(x); test_f(x); test_m(x);
        x = rnd.Next(num_test);
        test_r(x); test_u(x); test_f(x); test_m(x); test_n(x);
        x = rnd.Next(num_test);
        test_u(x); test_f(x); test_m(x); test_n(x); test_r(x);
        x = rnd.Next(num_test);
        test_f(x); test_m(x); test_n(x); test_r(x); test_u(x);
        x = rnd.Next(num_test);
    }
    Debug.Print("Normal (subscript+field):          {0,18}", t_n);
    Debug.Print("Typed-reference:                   {0,18}", t_r);
    Debug.Print("C# Managed pointer: (ref delegate) {0,18}", t_m);
    Debug.Print("C# Unsafe pointer:                 {0,18}", t_u);
    Debug.Print("C# Managed pointer: (ref func):    {0,18}", t_f);
}

Run Code Online (Sandbox Code Playgroud)

Because the code fragments which implement the test for each specific method are long-ish, I'll give the results first. Time is 'ticks;' lower means better.

Normal (subscript+field):             20,804,691
Typed-reference:                      30,920,655
Managed pointer: (ref delegate)       18,777,666   // <- a close 2nd
Unsafe pointer:                       22,395,806
Managed pointer: (ref func):          18,767,179   // <- winner

Run Code Online (Sandbox Code Playgroud)

I was surprised that these results were so unequivocal. TypedReferences are slowest, presumably because they lug around type information along with the pointer. Considering the heft of the IL-code for the belabored "Normal" version, it performed surprisingly well. Mode transitions seem to hurt unsafe code to the point where you really have to justify, plan, and measure each place you're going to deploy it.

But the hands down fastest times are achieved by leveraging the ref keyword in functions' parameter passing for the purpose of pointing to an interior part of the array, thus eliminating the "per-field-access" array indexing computation.

Perhaps the design of my test favors this one, but the test scenarios are representative of empirical use patterns in my app. What surprised my about those numbers is that the advantage of staying in managed mode--while having your pointers, too--was not cancelled by having to call a function or invoke through a delegate.

The Winner

Fastest one: (And perhaps simplest too?)

static void f(ref rec e)
{
    e.a = 4;
    e.e = e.a;
    e.b = e.d;
    e.f = e.d;
    e.b = e.e;
    e.a = e.c;
    e.b = 5;
    e.d = e.f;
    e.c = e.b;
    e.e = e.a;
    e.b = e.d;
    e.f = e.d;
    e.c = 6;
    e.b = e.e;
    e.a = e.c;
    e.d = e.f;
    e.c = e.b;
    e.e = e.a;
    e.d = 7;
    e.b = e.d;
    e.f = e.d;
    e.b = e.e;
    e.a = e.c;
    e.d = e.f;
    e.e = 8;
    e.c = e.b;
    e.e = e.a;
    e.b = e.d;
    e.f = e.d;
    e.b = e.e;
    e.f = 9;
    e.a = e.c;
    e.d = e.f;
    e.c = e.b;
    e.e = e.a;
    e.b = e.d;
    e.a = 10;
    e.f = e.d;
    e.b = e.e;
    e.a = e.c;
    e.d = e.f;
    e.c = e.b;
}
static void test_f(int ix)
{
    long q = sw.ElapsedTicks;
    f(ref s5[ix]);
    t_f += sw.ElapsedTicks - q;
}

Run Code Online (Sandbox Code Playgroud)

But it has the disadvantage that you can't keep related logic together in your program: the implementation of the function is divided across two C# functions, f and test_f.

We can address this particular problem with only a tiny sacrifice in performance. The next one is basically identical to the foregoing, but embeds one of the functions within the other as a lambda function...

A Close Second

Replacing the static function in the preceding example with an inline delegate requires the use of ref arguments, which in turn precludes the use of the Func<T> lambda syntax; instead you must use an explicit delegate from old-style .NET.

By adding this global declaration once:

delegate void b(ref rec ee);

Run Code Online (Sandbox Code Playgroud)

...we can use it throughout the program to directly ref into elements of array rec[], accessing them inline:

static void test_m(int ix)
{
    long q = sw.ElapsedTicks;
    /// the element to manipulate "e", is selected at the bottom of this lambda block
    ((b)((ref rec e) =>
    {
        e.a = 4;
        e.e = e.a;
        e.b = e.d;
        e.f = e.d;
        e.b = e.e;
        e.a = e.c;
        e.b = 5;
        e.d = e.f;
        e.c = e.b;
        e.e = e.a;
        e.b = e.d;
        e.f = e.d;
        e.c = 6;
        e.b = e.e;
        e.a = e.c;
        e.d = e.f;
        e.c = e.b;
        e.e = e.a;
        e.d = 7;
        e.b = e.d;
        e.f = e.d;
        e.b = e.e;
        e.a = e.c;
        e.d = e.f;
        e.e = 8;
        e.c = e.b;
        e.e = e.a;
        e.b = e.d;
        e.f = e.d;
        e.b = e.e;
        e.f = 9;
        e.a = e.c;
        e.d = e.f;
        e.c = e.b;
        e.e = e.a;
        e.b = e.d;
        e.a = 10;
        e.f = e.d;
        e.b = e.e;
        e.a = e.c;
        e.d = e.f;
        e.c = e.b;
    }))(ref s3[ix]);
    t_m += sw.ElapsedTicks - q;
}

Run Code Online (Sandbox Code Playgroud)

Also, although it may look like a new lambda function is being instantiated on each call, this won't happen if you're careful: when using this method, make sure you do not "close over" any local variables (that is, refer to variables which are outside the lambda function, from within its body), or do anything else that will bar your delegate instance from being static. If a local variable happens to fall into your lambda and the lambda thus gets promoted to an instance/class, you'll "probably" notice a difference as it tries to create five million delegates.

As long as you keep the lambda function clear of these side-effects, there won't be multiple instances; what's happening here is that, whenever C# determines that a lambda has no non-explicit dependencies, it lazily creates (and caches) a static singleton. It's a little unfortunate that a performance alternation this drastic is hidden from our view as a silent optimization. Overall, I like this method. It's fast and clutter-free--except for the bizarre parentheses, none of which can be omitted here.

And the rest

For completeness, here are the rest of the tests: normal bracketing-plus-dot; TypedReference; and unsafe pointers.

static void test_n(int ix)
{
    long q = sw.ElapsedTicks;
    s1[ix].a = 4;
    s1[ix].e = s1[ix].a;
    s1[ix].b = s1[ix].d;
    s1[ix].f = s1[ix].d;
    s1[ix].b = s1[ix].e;
    s1[ix].a = s1[ix].c;
    s1[ix].b = 5;
    s1[ix].d = s1[ix].f;
    s1[ix].c = s1[ix].b;
    s1[ix].e = s1[ix].a;
    s1[ix].b = s1[ix].d;
    s1[ix].f = s1[ix].d;
    s1[ix].c = 6;
    s1[ix].b = s1[ix].e;
    s1[ix].a = s1[ix].c;
    s1[ix].d = s1[ix].f;
    s1[ix].c = s1[ix].b;
    s1[ix].e = s1[ix].a;
    s1[ix].d = 7;
    s1[ix].b = s1[ix].d;
    s1[ix].f = s1[ix].d;
    s1[ix].b = s1[ix].e;
    s1[ix].a = s1[ix].c;
    s1[ix].d = s1[ix].f;
    s1[ix].e = 8;
    s1[ix].c = s1[ix].b;
    s1[ix].e = s1[ix].a;
    s1[ix].b = s1[ix].d;
    s1[ix].f = s1[ix].d;
    s1[ix].b = s1[ix].e;
    s1[ix].f = 9;
    s1[ix].a = s1[ix].c;
    s1[ix].d = s1[ix].f;
    s1[ix].c = s1[ix].b;
    s1[ix].e = s1[ix].a;
    s1[ix].b = s1[ix].d;
    s1[ix].a = 10;
    s1[ix].f = s1[ix].d;
    s1[ix].b = s1[ix].e;
    s1[ix].a = s1[ix].c;
    s1[ix].d = s1[ix].f;
    s1[ix].c = s1[ix].b;
    t_n += sw.ElapsedTicks - q;
}


static void test_r(int ix)
{
    long q = sw.ElapsedTicks;
    var tr = __makeref(s2[ix]);
    __refvalue(tr, rec).a = 4;
    __refvalue(tr, rec).e = __refvalue( tr, rec).a;
    __refvalue(tr, rec).b = __refvalue( tr, rec).d;
    __refvalue(tr, rec).f = __refvalue( tr, rec).d;
    __refvalue(tr, rec).b = __refvalue( tr, rec).e;
    __refvalue(tr, rec).a = __refvalue( tr, rec).c;
    __refvalue(tr, rec).b = 5;
    __refvalue(tr, rec).d = __refvalue( tr, rec).f;
    __refvalue(tr, rec).c = __refvalue( tr, rec).b;
    __refvalue(tr, rec).e = __refvalue( tr, rec).a;
    __refvalue(tr, rec).b = __refvalue( tr, rec).d;
    __refvalue(tr, rec).f = __refvalue( tr, rec).d;
    __refvalue(tr, rec).c = 6;
    __refvalue(tr, rec).b = __refvalue( tr, rec).e;
    __refvalue(tr, rec).a = __refvalue( tr, rec).c;
    __refvalue(tr, rec).d = __refvalue( tr, rec).f;
    __refvalue(tr, rec).c = __refvalue( tr, rec).b;
    __refvalue(tr, rec).e = __refvalue( tr, rec).a;
    __refvalue(tr, rec).d = 7;
    __refvalue(tr, rec).b = __refvalue( tr, rec).d;
    __refvalue(tr, rec).f = __refvalue( tr, rec).d;
    __refvalue(tr, rec).b = __refvalue( tr, rec).e;
    __refvalue(tr, rec).a = __refvalue( tr, rec).c;
    __refvalue(tr, rec).d = __refvalue( tr, rec).f;
    __refvalue(tr, rec).e = 8;
    __refvalue(tr, rec).c = __refvalue( tr, rec).b;
    __refvalue(tr, rec).e = __refvalue( tr, rec).a;
    __refvalue(tr, rec).b = __refvalue( tr, rec).d;
    __refvalue(tr, rec).f = __refvalue( tr, rec).d;
    __refvalue(tr, rec).b = __refvalue( tr, rec).e;
    __refvalue(tr, rec).f = 9;
    __refvalue(tr, rec).a = __refvalue( tr, rec).c;
    __refvalue(tr, rec).d = __refvalue( tr, rec).f;
    __refvalue(tr, rec).c = __refvalue( tr, rec).b;
    __refvalue(tr, rec).e = __refvalue( tr, rec).a;
    __refvalue(tr, rec).b = __refvalue( tr, rec).d;
    __refvalue(tr, rec).a = 10;
    __refvalue(tr, rec).f = __refvalue( tr, rec).d;
    __refvalue(tr, rec).b = __refvalue( tr, rec).e;
    __refvalue(tr, rec).a = __refvalue( tr, rec).c;
    __refvalue(tr, rec).d = __refvalue( tr, rec).f;
    __refvalue(tr, rec).c = __refvalue( tr, rec).b;
    t_r += sw.ElapsedTicks - q;
}

static void test_u(int ix)
{
    long q = sw.ElapsedTicks;

    fixed (rec* p = &s4[ix])
    {
        p->a = 4;
        p->e = p->a;
        p->b = p->d;
        p->f = p->d;
        p->b = p->e;
        p->a = p->c;
        p->b = 5;
        p->d = p->f;
        p->c = p->b;
        p->e = p->a;
        p->b = p->d;
        p->f = p->d;
        p->c = 6;
        p->b = p->e;
        p->a = p->c;
        p->d = p->f;
        p->c = p->b;
        p->e = p->a;
        p->d = 7;
        p->b = p->d;
        p->f = p->d;
        p->b = p->e;
        p->a = p->c;
        p->d = p->f;
        p->e = 8;
        p->c = p->b;
        p->e = p->a;
        p->b = p->d;
        p->f = p->d;
        p->b = p->e;
        p->f = 9;
        p->a = p->c;
        p->d = p->f;
        p->c = p->b;
        p->e = p->a;
        p->b = p->d;
        p->a = 10;
        p->f = p->d;
        p->b = p->e;
        p->a = p->c;
        p->d = p->f;
        p->c = p->b;
    }
    t_u += sw.ElapsedTicks - q;
}

Run Code Online (Sandbox Code Playgroud)

Summary

For memory-intensive work in large-scale C# apps, using managed pointers to directly access the fields of value-typed array elements in-situ is the way to go.

If you're really serious about performance, this might be enough reason to use C++/CLI (or CIL, for that matter) instead of C# for the relevant parts of your app, because those languages allow you to directly declare managed pointers within a function body.

In C#, the only way to create a managed pointer is to declare a function with a ref or out argument, and then the callee will observe the managed pointer. Thus, to get the performance benefits in C#, you have to use one of the (top two) methods shown above. [see C#7 below]

Sadly, these deploy the kludge of splitting a function into multiple parts just for the purpose of accessing an array element. Although considerably less elegant than the equivalent C++/CLI code would be, tests indicate that even in C#, for high-throughput applications we still obtain a big performance benefit versus naïve value-type array access.

[edit 2017: While perhaps conferring a small degree of prescience to this article's exhortations in general, the release of C# 7 in Visual Studio 2017 concomitantly renders the specific methods described above entirely obsolete. In short, the new ref locals feature in the language permits you to declare your own managed pointer as a local variable, and use it to consolidate the single array dereferencing operation. So given for example the test structure from above...

public struct rec { public int a, b, c, d, e, f; }
static rec[] s7 = new rec[100000];

Run Code Online (Sandbox Code Playgroud)

...here is how the same test function from above can now be written:

static void test_7(int ix)
{
    ref rec e = ref s7[ix];         // <---  C#7 ref local
    e.a = 4;  e.e = e.a; e.b = e.d; e.f = e.d; e.b = e.e; e.a = e.c;
    e.b = 5;  e.d = e.f; e.c = e.b; e.e = e.a; e.b = e.d; e.f = e.d;
    e.c = 6;  e.b = e.e; e.a = e.c; e.d = e.f; e.c = e.b; e.e = e.a;
    e.d = 7;  e.b = e.d; e.f = e.d; e.b = e.e; e.a = e.c; e.d = e.f;
    e.e = 8;  e.c = e.b; e.e = e.a; e.b = e.d; e.f = e.d; e.b = e.e;
    e.f = 9;  e.a = e.c; e.d = e.f; e.c = e.b; e.e = e.a; e.b = e.d;
    e.a = 10; e.f = e.d; e.b = e.e; e.a = e.c; e.d = e.f; e.c = e.b;
}

Run Code Online (Sandbox Code Playgroud)

Notice how this completely eliminates the need for kludges such as those I discussed above. The sleeker use of a managed pointer avoids the unnecessary function call that was used in "the winner," the best-performing methodology of those I reviewed. Therefore, the performance with the new feature can only be better than the winner of methods compared above.

Ironically enough, C# 7 also adds local functions, a feature which would directly solve the complaint about poor encapsulation I raised for two of the aforementioned hacks. Happily enough the whole enterprise of proliferating dedicated functions just for the purpose of gaining access to managed pointers is now completely moot.

Answer 2

Jon*_*eet 10

唯一的问题是你试图从静态方法调用实例方法,而没有实例P.

做f一个静态方法(或创建的实例P上调用它),它会没事的.这都是关于阅读编译器错误:)

话虽如此,我强烈建议你:

尽可能避免创造大块结构
尽可能避免创建可变结构
避免公共领域

归档时间：	14 年前
查看次数：	12301 次
最近记录：	6 年，6 月前