Java multithreading gives a very small performance gain

Som*_*ium 5 java parallel-processing multithreading

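I want to learn parallel programming to speed up algorithms, and I chose Java.
I wrote two functions that sum the values in a long array: one simply iterates over the array, and the other splits the array into parts and sums each part in a separate thread.

I expected roughly a 2x speedup with two threads. However, I only got a speedup of about 24%. Moreover, using more threads gives me no improvement over two threads (perhaps less than 1%). I know there is overhead for creating and joining threads, but I did not think it would be that large.

Can you explain what I am missing, or where the error in the code is? Here is the code: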

import java.util.concurrent.ThreadLocalRandom;


public class ParallelTest {


public static long sum1 (long[] num, int a, int b) {
    long r = 0;
    while (a < b) {
        r += num[a];
        ++a;
    }
    return r;
}

public static class SumThread extends Thread {
    private long num[];
    private long r;
    private int a, b;

    public SumThread (long[] num, int a, int b) {
        super();
        this.num = num;
        this.a = a;
        this.b = b;
    }

    @Override
    public void run () {
        r = ParallelTest.sum1(num, a, b);
    }

    public long getSum () {
        return r;
    }
}


public static long sum2 (long[] num, int a, int b, int threadCnt) throws InterruptedException {
    SumThread[] th = new SumThread[threadCnt];
    int i = 0, c = (b - a + threadCnt - 1) / threadCnt;

    for (;;) {
        int a2 = a + c;
        if (a2 > b) {
            a2 = b;
        }
        th[i] = new SumThread(num, a, a2);
        th[i].start();
        if (a2 == b) {
            break;
        }
        a = a2;
        ++i;
    }

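    // Only the first i + 1 slots of th are filled: when the range splits into
    // fewer chunks than threadCnt, the remaining slots stay null.
    int started = i + 1;
    for (i = 0; i < started; ++i) {
        th[i].join();
    }
    long r = 0;
    for (i = 0; i < started; ++i) {
        r += th[i].getSum();
    }
    return r;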
}

public static void main(String[] args) throws InterruptedException {
    final int N = 230000000;
    long[] num = new long[N];

    for (int i = 0; i < N; ++i) {
        num[i] = ThreadLocalRandom.current().nextLong(1, 9999);
    }

    // System.out.println(Runtime.getRuntime().availableProcessors());

    long timestamp = System.nanoTime();
    System.out.println(sum1(num, 0, num.length));
    System.out.println(System.nanoTime() - timestamp);

    for (int n = 2; n <= 4; ++n) {
        timestamp = System.nanoTime();
        System.out.println(sum2(num, 0, num.length, n));
        System.out.println(System.nanoTime() - timestamp);
    }


}
}

Edit: I have an i7 processor with 4 cores (8 threads). The output of the code is:

1149914787860
175689196
1149914787860
149224086
1149914787860
147709988
1149914787860
138243999

rcg*_*ldr 3

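This program is probably already limited by main memory bandwidth with just two threads, because it is a small loop that consumes data almost as fast as memory can supply it to the processor.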
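One rough way to check this against the numbers in the question is to convert each timing into an effective read bandwidth. The sketch below is only an estimate under a few assumptions: the class and method names (`BandwidthEstimate`, `gbPerSec`) are made up for illustration, every element is assumed to be read from main memory exactly once (8 bytes per `long`), and the four timings are taken to be the 1, 2, 3 and 4 thread runs in the order they were printed.

```java
public class BandwidthEstimate {
    // Effective read bandwidth in GB/s (1 GB = 1e9 bytes) for summing
    // `elements` long values, 8 bytes each, in `nanos` nanoseconds.
    public static double gbPerSec(long elements, long nanos) {
        double bytes = elements * (double) Long.BYTES;
        return bytes / nanos; // bytes per nanosecond is the same as GB per second
    }

    public static void main(String[] args) {
        final long n = 230_000_000L; // N from the question
        // Timings printed in the question, assumed to be 1, 2, 3 and 4 threads.
        long[] nanos = {175_689_196L, 149_224_086L, 147_709_988L, 138_243_999L};
        for (int i = 0; i < nanos.length; ++i) {
            System.out.printf("%d thread(s): %.1f GB/s%n", i + 1, gbPerSec(n, nanos[i]));
        }
    }
}
```

With the posted timings this works out to roughly 10.5, 12.3, 12.5 and 13.3 GB/s. A single thread is already streaming more than 10 GB/s, and extra threads add little. That is the pattern you would expect if the memory bus, not the CPU, is the bottleneck. Whether those figures are near the machine's actual limit depends on its memory configuration, which the question does not state.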