Efficiency of repeated memory allocations in glibc

qua*_*dev 1 c performance memory-management glibc

Below is a C wrapper for the Fortran ZHEEVR routine from the well-known LAPACK numerical library:

void zheevr(char jobz, char range, char uplo, int n, doublecomplex* a, int lda, double vl, double vu, int il, int iu, double abstol, double* w, doublecomplex* z, int ldz, int* info)
{
    int m;
    int lwork = -1;
    int liwork = -1;
    int lrwork = -1;
    int* isuppz = alloc_memory(sizeof(int) * 2 * n);
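    /* Workspace query: with lwork = lrwork = liwork = -1, zheevr_ only reports the
       optimal sizes in the first element of each small_work_* array (declared elsewhere).
       info is already an int*, so it is passed as-is rather than as &info. */
    zheevr_(&jobz, &range, &uplo, &n, a, &lda, &vl, &vu, &il, &iu, &abstol, &m, w, z, &ldz, isuppz, small_work_doublecomplex, &lwork, small_work_double, &lrwork, small_work_int, &liwork, info);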
    lwork = (int) small_work_doublecomplex[0].real;
    liwork = small_work_int[0];
    lrwork = (int) small_work_double[0];
    doublecomplex* work = alloc_memory(sizeof(doublecomplex) * lwork);
    double* rwork = alloc_memory(sizeof(double) * lrwork);
    int* iwork = alloc_memory(sizeof(int) * liwork);
    zheevr_(&jobz, &range, &uplo, &n, a, &lda, &vl, &vu, &il, &iu, &abstol, &m, w, z, &ldz, isuppz, work, &lwork, rwork, &lrwork, iwork, &liwork, info);
    free(iwork);
    free(rwork);
    free(work);
    free(isuppz);
}

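In my application this function is called hundreds of thousands of times to diagonalize the complex matrix "a" (the parameter names follow the Fortran conventions for this routine), always with the same matrix size. I expect the work array sizes to be the same most of the time, since the matrices being diagonalized have the same structure. My questions are: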

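  1. Can the repeated alloc/free calls ("alloc_memory" is a simple wrapper around glibc's malloc) hurt performance, and if so, how badly?
  2. Does the order of the free calls matter? Should I free the most recently allocated array first, or last?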

Eug*_*jak 5

1) Yes, they can.

2) Any sane libc should not care about the order of the free() calls. Performance-wise, it should not matter either.

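I would suggest moving the memory management out of this function, so that the caller supplies the matrix size and the pre-allocated temporary buffers. If the function is called from the same place on matrices of the same size, that will significantly reduce the number of malloc calls.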
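One way to let the caller own the buffers is a small workspace object that only grows when a larger size is requested. The following is a minimal sketch, not code from the question: the names `workspace`, `workspace_reserve` and `workspace_free` are hypothetical, and it assumes that one such object would be kept per work array (work, rwork, iwork, isuppz) and passed into the wrapper.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical reusable scratch buffer: grows on demand, never shrinks. */
typedef struct {
    void  *buf;  /* current allocation, or NULL */
    size_t cap;  /* size of the allocation in bytes */
} workspace;

/* Return a buffer of at least `bytes` bytes. realloc is called only when
   the request exceeds the current capacity, so repeated calls with the
   same size cost a single comparison. Returns NULL if growing fails
   (the old buffer is kept), or if nothing was ever allocated and
   bytes == 0. */
static void *workspace_reserve(workspace *ws, size_t bytes)
{
    assert(ws != NULL);
    if (bytes > ws->cap) {
        void *p = realloc(ws->buf, bytes);
        if (p == NULL)
            return NULL;
        ws->buf = p;
        ws->cap = bytes;
    }
    return ws->buf;
}

/* Release the buffer once, after the last call that needs it. */
static void workspace_free(workspace *ws)
{
    assert(ws != NULL);
    free(ws->buf);
    ws->buf = NULL;
    ws->cap = 0;
}
```

With same-sized matrices, the first call would allocate and every later call would reuse the same memory; the caller calls `workspace_free` once at the end of the whole loop.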


Jon*_*ler 5

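  • Can you use C99? (Answer: yes, you are already using C99 notation, declaring variables where they are needed.)
  • Are the arrays of a reasonable size (not too large)?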

If both answers are "yes", consider using VLAs (variable-length arrays):

void zheevr(char jobz, char range, char uplo, int n, doublecomplex* a, int lda, double vl, double vu, int il, int iu, double abstol, double* w, doublecomplex* z, int ldz, int* info)
{
    int m;
    int lwork = -1;
    int liwork = -1;
    int lrwork = -1;
    int isuppz[2*n];
    /* Workspace query; info is already an int*, so pass it directly, not &info. */
    zheevr_(&jobz, &range, &uplo, &n, a, &lda, &vl, &vu, &il, &iu, &abstol, &m, w, z, &ldz, isuppz, small_work_doublecomplex, &lwork, small_work_double, &lrwork, small_work_int, &liwork, info);
    lwork = (int) small_work_doublecomplex[0].real;
    liwork = small_work_int[0];
    lrwork = (int) small_work_double[0];
    doublecomplex work[lwork];
    double rwork[lrwork];
    int iwork[liwork];
    zheevr_(&jobz, &range, &uplo, &n, a, &lda, &vl, &vu, &il, &iu, &abstol, &m, w, z, &ldz, isuppz, work, &lwork, rwork, &lrwork, iwork, &liwork, info);
}

One advantage of using VLAs is that you do not have to free anything.

(Untested code!)
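The "reasonable size" condition matters because a VLA lives in automatic storage (usually the stack), and an oversized one fails without any error return. A common guard is to use the VLA only below some threshold and fall back to malloc above it. The sketch below illustrates that pattern on an unrelated toy task (a median that needs scratch space); `STACK_SCRATCH_MAX`, `cmp_double` and `median` are hypothetical names, and the threshold value is an arbitrary assumption, not a recommendation.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

enum { STACK_SCRATCH_MAX = 256 };  /* hypothetical limit, in elements */

/* Three-way comparison of doubles for qsort. */
static int cmp_double(const void *a, const void *b)
{
    double x = *(const double *)a, y = *(const double *)b;
    return (x > y) - (x < y);
}

/* Median of x[0..n-1] without modifying x.
   Returns 0 on success, -1 if n == 0 or the heap fallback fails. */
static int median(const double *x, size_t n, double *out)
{
    assert(x != NULL && out != NULL);
    if (n == 0)
        return -1;

    int on_stack = (n <= STACK_SCRATCH_MAX);
    double vla[on_stack ? n : 1];   /* never zero-length; 1 element if unused */
    double *tmp = on_stack ? vla : malloc(n * sizeof *tmp);
    if (tmp == NULL)
        return -1;

    memcpy(tmp, x, n * sizeof *tmp);
    qsort(tmp, n, sizeof *tmp, cmp_double);
    *out = (n % 2) ? tmp[n / 2] : 0.5 * (tmp[n / 2 - 1] + tmp[n / 2]);

    if (!on_stack)
        free(tmp);                  /* only the heap path needs a free */
    return 0;
}
```

Small inputs never touch the allocator, while large ones cannot overflow the stack. The same guard could be applied to the work arrays in the wrapper above, at the cost of bringing back the free calls on the large-size path.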