C: Strategies for debugging an obscure memory leak?

jas*_*cks 2 c memory memory-leaks dangling-pointer

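I'm developing a project in C and trying to figure out how to debug an obscure bug that crashes my program. The program is fairly large, and my attempts to isolate the problem by building a smaller version of the code have not worked. So I'm trying to come up with a way to debug and pinpoint the memory leak.

I came up with the following plan: I know the problem comes from running a certain function, and that function calls itself recursively. So I thought I could take snapshots of my program's memory allocations. Since I don't know jack about what goes on under the hood (I know just enough to not be useful in this situation):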

#include <stdlib.h>

typedef struct record_mem {
    int num_allocs;
    int num_frees;
    int size_space;
    int num_structure_1;
    /* ... */
    int num_structure_N;
    int num_records;
    struct record_mem *next;
} RECORD;

extern RECORD *top;

/* Push a new snapshot record whose totals accumulate those of the previous top. */
void pushmem(RECORD **top)
{
    RECORD *nnew = malloc(sizeof(RECORD));
    if (nnew == NULL)
        return;  /* out of memory: leave the stack unchanged */
    nnew->num_allocs = 1;  /* this record is itself one allocation */
    nnew->num_frees = 0;
    nnew->size_space = (int)sizeof(RECORD);
    nnew->num_structure_1 = 0;
    /* ... */
    nnew->num_structure_N = 0;
    nnew->num_records = 1;
    nnew->next = NULL;
    if (*top)
    {
        nnew->num_allocs += (*top)->num_allocs;
        nnew->num_frees = (*top)->num_frees;
        nnew->size_space += (*top)->size_space;
        /* carry each per-structure count forward from the previous record */
        nnew->num_structure_1 = (*top)->num_structure_1;
        /* ... */
        nnew->num_structure_N = (*top)->num_structure_N;
        nnew->num_records += (*top)->num_records;
        nnew->next = *top;
    }
    *top = nnew;
}

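My idea is to print the contents of my memory record just before my program crashes (thanks to GDB, I know where it crashes).

Then throughout the program (for every data structure in my program I have a push function like the one above) I can simply add a line that counts the allocations for that data structure plus the total stack (heap?) memory allocated (that I can keep track of). I just create more memory_record structures wherever I feel I need to record a snapshot of my program's run. The problem is that this memory balance sheet won't help if I can't somehow record how much memory is actually in use.

But how would I do that? Also, how would I account for dangling pointers and leaks? I'm on OS X, and I'm currently looking up how to record the stack pointer and other such things.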
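One way to record how much heap memory is actually in use is to route allocations through thin wrappers that remember each block's size. The following is only a minimal sketch under stated assumptions: `mem_stats`, `tracked_malloc`, `tracked_free`, `snapshot_mem` and `print_mem` are hypothetical names, not part of the original program, and the sketch assumes a C11 compiler (for `max_align_t`) and single-threaded code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Hypothetical counters; these names are not from the original program. */
typedef struct {
    size_t num_allocs;  /* successful tracked_malloc() calls */
    size_t num_frees;   /* tracked_free() calls on non-NULL pointers */
    size_t bytes_live;  /* bytes requested and not yet released */
} mem_stats;

static mem_stats g_stats;

/* Header placed in front of each block so tracked_free() can recover the
   requested size. The union keeps the user pointer suitably aligned. */
typedef union {
    size_t size;
    max_align_t align;
} block_header;

void *tracked_malloc(size_t size)
{
    if (size > SIZE_MAX - sizeof(block_header))
        return NULL;  /* request too large to prepend a header */
    block_header *header = malloc(sizeof *header + size);
    if (header == NULL)
        return NULL;
    header->size = size;
    g_stats.num_allocs++;
    g_stats.bytes_live += size;
    return header + 1;  /* user data starts right after the header */
}

void tracked_free(void *ptr)
{
    if (ptr == NULL)
        return;
    block_header *header = (block_header *)ptr - 1;
    assert(g_stats.bytes_live >= header->size);
    g_stats.num_frees++;
    g_stats.bytes_live -= header->size;
    free(header);
}

/* Copy of the counters at this instant, to compare against a later one. */
mem_stats snapshot_mem(void)
{
    return g_stats;
}

void print_mem(const char *label)
{
    fprintf(stderr, "[%s] allocs=%zu frees=%zu live=%zu bytes\n",
            label, g_stats.num_allocs, g_stats.num_frees, g_stats.bytes_live);
}
```

Calling `print_mem` on entry to and exit from the recursive function would show whether live bytes keep growing. Counters like these only say how much is live; they cannot tell which pointer dangles, which is the job of a tool such as memcheck.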

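EDIT: Since you asked, here is valgrind's output. (closure() is the function called from main that returns the bad pointer: it should return the head of a doubly linked list. traversehashmap() is a function called from closure() that I use to compute additional nodes and append them to the linked list; it calls itself recursively because it needs to jump between nodes.)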

jason-danckss-macbook:project Jason$ valgrind --leak-check=full --tool=memcheck ./testc
Will attempt to compute closure of AB:
Result: testcl: 0x10000d0b0
==7682== Invalid read of size 8
==7682==    at 0x100001D4E: printrelation2 (relation.h:490)
==7682==    by 0x100003CFE: main (test-computation.c:47)
==7682==  Address 0x10000cee8 is 8 bytes inside a block of size 24 free'd
==7682==    at 0xD828: free (vg_replace_malloc.c:450)
==7682==    by 0x100001232: destroyrelation2 (relation.h:161)
==7682==    by 0x100003407: destroyallhashmap (computation.h:333)
==7682==    by 0x1000039E1: closure (computation.h:539)
==7682==    by 0x100003CBE: main (test-computation.c:38)
==7682== 
==7682== 
==7682== HEAP SUMMARY:
==7682==     in use at exit: 5,360 bytes in 48 blocks
==7682==   total heap usage: 99 allocs, 51 frees, 6,640 bytes allocated
==7682== 
==7682== 48 (24 direct, 24 indirect) bytes in 1 blocks are definitely lost in loss record 33 of 37
==7682==    at 0xC283: malloc (vg_replace_malloc.c:274)
==7682==    by 0x100001104: getnewrelation (relation.h:134)
==7682==    by 0x100001848: copyrelation (relation.h:343)
==7682==    by 0x100003991: closure (computation.h:531)
==7682==    by 0x100003CBE: main (test-computation.c:38)
==7682== 
==7682== 1,128 (24 direct, 1,104 indirect) bytes in 1 blocks are definitely lost in loss record 36 of 37
==7682==    at 0xC283: malloc (vg_replace_malloc.c:274)
==7682==    by 0x100002315: getnewholder (dependency.h:129)
==7682==    by 0x100003B17: main (test-computation.c:14)
==7682== 
==7682== LEAK SUMMARY:
==7682==    definitely lost: 48 bytes in 2 blocks
==7682==    indirectly lost: 1,128 bytes in 44 blocks
==7682==      possibly lost: 0 bytes in 0 blocks
==7682==    still reachable: 4,096 bytes in 1 blocks
==7682==         suppressed: 88 bytes in 1 blocks
==7682== Reachable blocks (those to which a pointer was found) are not shown.
==7682== To see them, rerun with: --leak-check=full --show-reachable=yes
==7682== 
==7682== For counts of detected and suppressed errors, rerun with: -v
==7682== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)

7he*_*.tk 5

Have you tried valgrind (and its memcheck tool)?

$ valgrind --tool=memcheck --leak-check=full ./yourprogram

(It is best to compile your program with -g, so that Valgrind can report file names and line numbers.)

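EDIT: Sorry, I missed that you didn't want to use Valgrind, but as dureuill pointed out in the comments on your post, it is very useful and worth the time to learn.

Another piece of information: a memory leak is caused by a missing free after some malloc or realloc (you can see a simple example in C here). You can also use grep (with -n to get the line numbers and -r for a recursive search) to list every memory allocation line in your program, and try to match each of them with a call to free. However, this may be tedious, and I really believe using Valgrind will be faster.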
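As an illustration of a free that goes missing after malloc or realloc, here is a small hedged sketch. All function names are hypothetical and chosen only for this example; `count_chars_leaky` shows the leak, `count_chars_fixed` shows the matching free, and `grow_buffer` shows a realloc pattern that does not lose the old block when realloc fails.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Leaky: the only pointer to the copy goes out of scope without a free. */
size_t count_chars_leaky(const char *text)
{
    char *copy = malloc(strlen(text) + 1);
    if (copy == NULL)
        return 0;
    strcpy(copy, text);
    return strlen(copy);  /* missing free(copy): the block is lost here */
}

/* Fixed: every path that allocated the copy releases it before returning. */
size_t count_chars_fixed(const char *text)
{
    char *copy = malloc(strlen(text) + 1);
    if (copy == NULL)
        return 0;
    strcpy(copy, text);
    size_t length = strlen(copy);
    free(copy);
    return length;
}

/* Grows *buf to new_size bytes. Writing `*buf = realloc(*buf, n)` directly
   would overwrite the only pointer to the old block if realloc fails, so a
   temporary is used. Returns 0 on success, -1 on failure (old block intact). */
int grow_buffer(char **buf, size_t new_size)
{
    assert(buf != NULL && new_size > 0);
    char *tmp = realloc(*buf, new_size);
    if (tmp == NULL)
        return -1;
    *buf = tmp;
    return 0;
}
```

Running memcheck on a program that calls `count_chars_leaky` should report the lost block together with the malloc call stack, which is the kind of pairing the grep approach tries to find by hand.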


dur*_*ill 5

From your valgrind output:

This is probably what is causing your problem:

==7682== Invalid read of size 8
==7682==    at 0x100001D4E: printrelation2 (relation.h:490)
==7682==    by 0x100003CFE: main (test-computation.c:47)
==7682==  Address 0x10000cee8 is 8 bytes inside a block of size 24 free'd
==7682==    at 0xD828: free (vg_replace_malloc.c:450)
==7682==    by 0x100001232: destroyrelation2 (relation.h:161)
==7682==    by 0x100003407: destroyallhashmap (computation.h:333)
==7682==    by 0x1000039E1: closure (computation.h:539)
==7682==    by 0x100003CBE: main (test-computation.c:38)

Let's look at it in detail:

==7682== Invalid read of size 8
==7682==    at 0x100001D4E: printrelation2 (relation.h:490)
==7682==    by 0x100003CFE: main (test-computation.c:47)

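This is the summary of your error. In printrelation2, at line 490 of relation.h, you read 8 bytes from a memory location that is unallocated (or was previously allocated and then freed).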

==7682==  Address 0x10000cee8 is 8 bytes inside a block of size 24 free'd

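The accessed address lies 8 bytes past the start of a block of size 24, i.e. it is probably an 8-byte field at offset 8 in a 24-byte structure (look for such a structure), and that block had already been freed.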
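To find which field sits at a given offset, `offsetof` can be printed for each member of a candidate structure. The node below is purely hypothetical (the real structure in relation.h is not shown in the question); it is a sketch of how a 24-byte block with a field at offset 8 could arise on a typical 64-bit platform.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical doubly linked node; the real layout in relation.h is unknown. */
struct node {
    void *data;
    struct node *next;
    struct node *prev;
};

/* Name of the member that starts at `offset`, or "?" if none does. */
const char *field_at_offset(size_t offset)
{
    assert(offset < sizeof(struct node));
    if (offset == offsetof(struct node, data))
        return "data";
    if (offset == offsetof(struct node, next))
        return "next";
    if (offset == offsetof(struct node, prev))
        return "prev";
    return "?";
}

/* Prints the total size and the offset of each member. */
void print_layout(void)
{
    printf("sizeof(struct node) = %zu\n", sizeof(struct node));
    printf("data at %zu, next at %zu, prev at %zu\n",
           offsetof(struct node, data),
           offsetof(struct node, next),
           offsetof(struct node, prev));
}
```

With 8-byte pointers this structure is typically 24 bytes and `next` starts at offset 8, so an invalid 8-byte read at offset 8 would correspond to following `next` on a freed node. Whether that matches the real code is an assumption to check against relation.h.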

==7682==    at 0xD828: free (vg_replace_malloc.c:450)
==7682==    by 0x100001232: destroyrelation2 (relation.h:161)
==7682==    by 0x100003407: destroyallhashmap (computation.h:333)
==7682==    by 0x1000039E1: closure (computation.h:539)
==7682==    by 0x100003CBE: main (test-computation.c:38)

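This is the call stack that led to freeing the address you reference when your program crashes. It starts with free, which is normal, since you presumably use the free function to release memory. The file and line shown there (vg_replace_malloc.c) belong to Valgrind's replacement for the library's free, so they are not very relevant. What is relevant is that this free was called from destroyrelation2 at line 161 of relation.h; this is the offending free. destroyrelation2 was itself called by destroyallhashmap, which was called by closure, which was called by main at line 38 of test-computation.c. You need to figure out what is wrong in your logic that leads you to reuse, in printrelation2, a pointer that was freed during the call made at line 38 of main.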
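The shape of this bug, and one possible way out, can be sketched as follows. This is not the original code: `relation`, `new_relation`, `copy_list`, `destroy_list` and `closure_fixed` are hypothetical stand-ins, and the sketch assumes the result is meant to outlive the scratch data that gets destroyed.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical stand-in for the real relation structure. */
typedef struct relation {
    int value;
    struct relation *next;
} relation;

relation *new_relation(int value)
{
    relation *node = malloc(sizeof *node);
    if (node != NULL) {
        node->value = value;
        node->next = NULL;
    }
    return node;
}

void destroy_list(relation *head)
{
    while (head != NULL) {
        relation *next = head->next;  /* read next before freeing the node */
        free(head);
        head = next;
    }
}

/* Deep copy; returns NULL if an allocation fails (partial copy is freed). */
relation *copy_list(const relation *head)
{
    relation *copy_head = NULL;
    relation **tail = &copy_head;
    for (const relation *cur = head; cur != NULL; cur = cur->next) {
        relation *node = new_relation(cur->value);
        if (node == NULL) {
            destroy_list(copy_head);
            return NULL;
        }
        *tail = node;
        tail = &node->next;
    }
    return copy_head;
}

#if 0
/* Buggy shape: the returned pointer refers to nodes that were just freed. */
relation *closure_buggy(relation *scratch)
{
    relation *result = scratch;
    destroy_list(scratch);
    return result;  /* dangling: any later read is an invalid read */
}
#endif

/* Fixed shape: the caller receives its own copy before scratch is destroyed. */
relation *closure_fixed(relation *scratch)
{
    assert(scratch != NULL);  /* this sketch assumes a non-empty input */
    relation *result = copy_list(scratch);
    destroy_list(scratch);
    return result;  /* owned by the caller, who must destroy_list() it */
}
```

An alternative design is to skip the copy and simply exclude the returned nodes from the cleanup; which option fits depends on who is supposed to own the list.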

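The memory leaks reported afterwards are real, but they are unlikely to be the cause of the crash.

Is valgrind's output clearer now?

Note 1: once the segfault is fixed this memory leak report may change, but as it stands, here is my interpretation: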

==7682== 48 (24 direct, 24 indirect) bytes in 1 blocks are definitely lost in loss record 33 of 37
==7682==    at 0xC283: malloc (vg_replace_malloc.c:274)
==7682==    by 0x100001104: getnewrelation (relation.h:134)
==7682==    by 0x100001848: copyrelation (relation.h:343)
==7682==    by 0x100003991: closure (computation.h:531)
==7682==    by 0x100003CBE: main (test-computation.c:38)
==7682== 
==7682== 1,128 (24 direct, 1,104 indirect) bytes in 1 blocks are definitely lost in loss record 36 of 37
==7682==    at 0xC283: malloc (vg_replace_malloc.c:274)
==7682==    by 0x100002315: getnewholder (dependency.h:129)
==7682==    by 0x100003B17: main (test-computation.c:14)
==7682== 
==7682== LEAK SUMMARY:
==7682==    definitely lost: 48 bytes in 2 blocks
==7682==    indirectly lost: 1,128 bytes in 44 blocks
==7682==      possibly lost: 0 bytes in 0 blocks
==7682==    still reachable: 4,096 bytes in 1 blocks
==7682==         suppressed: 88 bytes in 1 blocks

Let's start with the summary:

==7682== LEAK SUMMARY:
==7682==    definitely lost: 48 bytes in 2 blocks
==7682==    indirectly lost: 1,128 bytes in 44 blocks
==7682==      possibly lost: 0 bytes in 0 blocks
==7682==    still reachable: 4,096 bytes in 1 blocks
==7682==         suppressed: 88 bytes in 1 blocks

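You have two allocated blocks that can no longer be reached through any pointer. This means that somewhere in your program you malloc'ed them and at some later point lost every reference to them. These are the serious memory leaks. You need to review your logic so that you either keep a handle on these blocks or free them earlier in the program's lifetime.

I am not sure about "indirectly lost". As I understand it, you hold no direct handle on those blocks; they are only pointed to from a structure that is itself lost. These leaks can be fixed by freeing the pointers inside a structure before the structure itself is released or forgotten.

I don't know about "possibly lost"; I have never run into it with valgrind.

"Still reachable" is the benign kind of leak: when the program stopped under valgrind, you had not freed the block, but you still held a pointer to it, so you can easily add a call to free that pointer and resolve the leak.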
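The categories in the leak summary can be reproduced with a few lines. The sketch below is hypothetical (`item`, `g_cache`, `make_chain`, `free_chain` and `leak_demo` are names invented for this example), and the categories named in the comments are what memcheck would be expected to report for a program that calls `leak_demo` once and then exits without further cleanup.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical list item used only for this illustration. */
typedef struct item {
    struct item *next;
} item;

static item *g_cache;  /* a global pointer keeps its target reachable */

void free_chain(item *head)
{
    while (head != NULL) {
        item *next = head->next;
        free(head);
        head = next;
    }
}

/* Allocates n linked items; returns NULL for n == 0 or on failure. */
item *make_chain(size_t n)
{
    item *head = NULL;
    while (n-- > 0) {
        item *node = malloc(sizeof *node);
        if (node == NULL) {
            free_chain(head);
            return NULL;
        }
        node->next = head;
        head = node;
    }
    return head;
}

void leak_demo(void)
{
    assert(g_cache == NULL);  /* the sketch expects to be called only once */

    item *chain = make_chain(3);
    chain = NULL;  /* last pointer to the head is overwritten:
                      head -> "definitely lost",
                      the two items behind it -> "indirectly lost" */
    (void)chain;

    g_cache = make_chain(1);  /* never freed, but g_cache still points to it
                                 at exit -> "still reachable" */
}
```

Calling `free_chain` on the head before overwriting the pointer would remove both the definite and the indirect loss, which matches the advice above about freeing what a structure points to before letting go of the structure.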

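These two call stacks show you the mallocs that cause the memory leaks, excluding the "still reachable" ones (to see those, you have to add the options --leak-check=full --show-reachable=yes to the valgrind invocation).

Note 2: avoid function names like destroyallhashmap (hard to read) or destroyrelation2 (numbered). Prefer destroy_all_hashmap or, less common in C, destroyAllHashmap, and avoid numbering functions. Likewise, avoid variable names like nnew; use semantically meaningful names instead.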