I have some questions about the STREAM benchmark (http://www.cs.virginia.edu/stream/ref.html#runrules).
* (a) Each array must be at least 4 times the size of the
* available cache memory. I don't worry about the difference
* between 10^6 and 2^20, so in practice the minimum array size
* is about 3.8 times the cache size.
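To make the rule concrete, here is a minimal sketch in C of the arithmetic behind it. The helper names `min_stream_elements` and `relaxed_min_stream_elements` are hypothetical and not part of stream.c, and I am assuming the "3.8 times" figure comes from sizing the array with 10^6 bytes per "MB" while the cache is counted in units of 2^20 bytes (4 * 10^6 / 2^20 is roughly 3.81).

```c
#include <stddef.h>
#include <stdint.h>

/* Strict reading of the rule: each array holds at least 4x the cache
 * size. cache_bytes and elem_bytes are supplied by the caller; this
 * helper is illustrative only and does not exist in stream.c. */
uint64_t min_stream_elements(uint64_t cache_bytes, size_t elem_bytes)
{
    uint64_t min_bytes = 4u * cache_bytes;
    /* Round up so the array never falls below the bound. */
    return (min_bytes + elem_bytes - 1) / elem_bytes;
}

/* Relaxed reading (assumption): treat one "MB" of cache as 10^6 bytes
 * when sizing the array, which yields about 3.8x a 2^20-based cache. */
uint64_t relaxed_min_stream_elements(uint64_t cache_mib, size_t elem_bytes)
{
    uint64_t min_bytes = 4u * cache_mib * 1000000u;
    return (min_bytes + elem_bytes - 1) / elem_bytes;
}
```

For a 32 MiB cache and 8-byte elements, the strict helper gives 16,777,216 elements per array and the relaxed one gives 16,000,000.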
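For example, I added two extra arrays and made sure they are accessed together with the original a/b/c arrays. I modified the byte accounting accordingly. With these two extra arrays, my bandwidth numbers increased by about 11.5%.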
> diff stream.c modified_stream.c
181c181,183
< c[STREAM_ARRAY_SIZE+OFFSET];
---
> c[STREAM_ARRAY_SIZE+OFFSET],
> e[STREAM_ARRAY_SIZE+OFFSET],
> d[STREAM_ARRAY_SIZE+OFFSET];
192,193c194,195
< 3 * sizeof(STREAM_TYPE) * STREAM_ARRAY_SIZE,
< 3 * sizeof(STREAM_TYPE) * …
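Since the diff is cut off, here is a rough sketch in C of what "byte accounting" means in this context. It is not the modified code from the diff. The function names `kernel_bytes` and `bandwidth_mb_per_s` are hypothetical; the only thing taken from stream.c is the general scheme, in which each kernel is credited with (number of arrays touched) * sizeof(STREAM_TYPE) * STREAM_ARRAY_SIZE bytes and the reported rate is 1.0E-06 * bytes / time.

```c
#include <stddef.h>

/* Bytes credited to one pass of a kernel that reads or writes
 * arrays_touched arrays of n elements each (hypothetical helper). */
double kernel_bytes(int arrays_touched, size_t elem_bytes, size_t n)
{
    return (double)arrays_touched * (double)elem_bytes * (double)n;
}

/* Reported rate in MB/s, with 1 MB = 10^6 bytes. */
double bandwidth_mb_per_s(double bytes, double seconds)
{
    return 1.0e-6 * bytes / seconds;
}
```

With this accounting, a kernel that touches five arrays instead of three is credited with 5/3 as many bytes per pass, so any change in the reported bandwidth reflects how the elapsed time changed relative to that factor.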