C strtok produces "Invalid read of size 1", and free(token) does not release the memory

DCR*_*DCR 1 c free valgrind strtok

I have been trying to debug this simple C program:

#include <stdbool.h>
#include <stdlib.h>
#include <stdio.h>
#include <string.h>

#define MAX_WORD_SIZE 60
int wordCnt = 0;

int main(void){

//open dictionary 
FILE *ptr = fopen("large", "r");
if(ptr == NULL){
  printf("unable to open %s","large");
}

//get file size 
int fileSize;
fseek(ptr, 0 , SEEK_END);
fileSize=ftell(ptr) ;

//get memory for file buffer (read in whole file at once, faster) 
char * buffer = malloc(sizeof(char)*fileSize);

//rewind and read in file
fseek(ptr, 0 , SEEK_SET);
fread(buffer, fileSize, 1, ptr);

//get memory for longest word
char * token = malloc(sizeof(char)*MAX_WORD_SIZE);

//this is the part that causes the problem

while (token != NULL)
{
    if(wordCnt == 0)token = strtok(buffer, "\r\n");
    else token = strtok(NULL, "\r\n");

    wordCnt++;
}
wordCnt--;    

fclose(ptr);
free(token);
free(buffer);
}

Here is the error output from valgrind:

valgrind ./test
==16233== Memcheck, a memory error detector
==16233== Copyright (C) 2002-2013, and GNU GPL'd, by Julian Seward et al.
==16233== Using Valgrind-3.10.1 and LibVEX; rerun with -h for copyright info
==16233== Command: ./test
==16233== 
==16233== Invalid read of size 1
==16233==    at 0x5E4496C: strtok (strtok.S:137)
==16233==    by 0x42D848: main (test.c:43)
==16233==  Address 0x62dd8bc is 0 bytes after a block of size 1,439,228 alloc'd
==16233==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16233==    by 0x42D728: main (test.c:29)
==16233== 
==16233== Invalid read of size 1
==16233==    at 0x5E4499C: strtok (strtok.S:163)
==16233==    by 0x42D848: main (test.c:43)
==16233==  Address 0x62dd8bc is 0 bytes after a block of size 1,439,228 alloc'd
==16233==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16233==    by 0x42D728: main (test.c:29)
==16233== 
==16233== 
==16233== HEAP SUMMARY:
==16233==     in use at exit: 60 bytes in 1 blocks
==16233==   total heap usage: 3 allocs, 2 frees, 1,439,856 bytes allocated
==16233== 
==16233== 60 bytes in 1 blocks are definitely lost in loss record 1 of 1
==16233==    at 0x4C2AB80: malloc (in /usr/lib/valgrind/vgpreload_memcheck-amd64-linux.so)
==16233==    by 0x42D7B1: main (test.c:36)
==16233== 
==16233== LEAK SUMMARY:
==16233==    definitely lost: 60 bytes in 1 blocks
==16233==    indirectly lost: 0 bytes in 0 blocks
==16233==      possibly lost: 0 bytes in 0 blocks
==16233==    still reachable: 0 bytes in 0 blocks
==16233==         suppressed: 0 bytes in 0 blocks
==16233== 
==16233== For counts of detected and suppressed errors, rerun with: -v
==16233== ERROR SUMMARY: 3 errors from 3 contexts (suppressed: 0 from 0)

Som*_*ude 6

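The strtok function returns a pointer into the buffer you passed in the initial call, so you must not call free on that pointer.

The reported memory leak occurs because you allocate memory and token initially points to it. Inside the tokenization loop, token is then reassigned to point into buffer, so the only pointer to the 60-byte block is lost. When the loop ends, token is NULL, so free(token) does nothing.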

A typical loop using strtok looks like this:

char *token = strtok(buffer, "\r\n");
while (token != NULL)
{
    ++wordCnt;
    token = strtok(NULL, "\r\n");
}
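As a minimal sketch, the same loop can be wrapped in a small function that works on any writable, null-terminated string. The name count_tokens is hypothetical and chosen only for illustration.

```c
#include <stddef.h>
#include <string.h>

/* Counts the tokens in `text` separated by any character in `delims`.
 * strtok modifies `text` in place, so it must be a writable,
 * null-terminated array (not a string literal). */
size_t count_tokens(char *text, const char *delims)
{
    size_t count = 0;

    /* The first call passes the buffer; later calls pass NULL to continue. */
    for (char *token = strtok(text, delims);
         token != NULL;
         token = strtok(NULL, delims))
    {
        ++count;
    }
    return count;
}
```

No memory is allocated for token here: it only ever holds pointers that strtok returns, so there is nothing to free for it.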

Let's say the buffer contains the string "Hello\nWorld".

In memory it looks like this:

+--------+     +---+---+---+---+---+----+---+---+---+---+---+----+
| buffer | --> | H | e | l | l | o | \n | W | o | r | l | d | \0 |
+--------+     +---+---+---+---+---+----+---+---+---+---+---+----+

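After the first two calls

char *token = strtok(buffer, "\r\n");  // token points at "Hello"; the '\n' is overwritten with '\0'
token = strtok(NULL, "\r\n");          // token now points at "World"

you have something like this:

+--------+     +---+---+---+---+---+----+---+---+---+---+---+----+
| buffer | --> | H | e | l | l | o | \0 | W | o | r | l | d | \0 |
+--------+     +---+---+---+---+---+----+---+---+---+---+---+----+
                                        ^
+-------+                               |
| token | ------------------------------/
+-------+

That is, token points to the position just after the former newline (the start of the word "World"), but that position is still inside the memory allocated for buffer, so it must not be passed to free.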
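To see where each token lands relative to buffer, here is a small sketch that reports a token's offset from the start of the buffer. The helper name token_offset is hypothetical. For "Hello\nWorld" the first token is at offset 0 and the second at offset 6, and the byte at index 5 has been overwritten with '\0'.

```c
#include <stddef.h>
#include <string.h>

/* Returns the offset of the n-th token (0-based) from the start of
 * `buffer`, or -1 if the buffer holds fewer tokens than that. */
ptrdiff_t token_offset(char *buffer, const char *delims, size_t n)
{
    char *token = strtok(buffer, delims);

    /* Advance n times, stopping early if the tokens run out. */
    for (size_t i = 0; token != NULL && i < n; ++i)
        token = strtok(NULL, delims);

    return token != NULL ? token - buffer : -1;
}
```

Because every token is just an address inside buffer, the only pointer that may be passed to free is the one malloc originally returned for buffer.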


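There is another problem: strtok requires the string you tokenize to be an actual null-terminated string. Yours is not, and that is what causes the "Invalid read" errors: strtok reads past the end of the memory allocated for buffer.

You need to allocate one more byte for buffer and set the last byte to '\0' to terminate the string: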

char * buffer = malloc(fileSize + 1);  // +1 for string terminator

// Read...

buffer[fileSize] = '\0';  // Terminate strings
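Putting the pieces together, one possible shape for the whole program is sketched below. The function names read_whole_file and count_words_in_file are hypothetical, and opening the file in "rb" mode is an assumption made here so that ftell reports a byte count; the sketch also assumes a regular file for which fseek and ftell can determine the size.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Reads the whole file at `path` into a freshly allocated, null-terminated
 * buffer. Returns NULL on failure; the caller frees the result. */
char *read_whole_file(const char *path)
{
    FILE *fp = fopen(path, "rb");
    if (fp == NULL)
        return NULL;

    char *buffer = NULL;
    long size = -1;
    if (fseek(fp, 0, SEEK_END) == 0 && (size = ftell(fp)) >= 0 &&
        fseek(fp, 0, SEEK_SET) == 0)
    {
        buffer = malloc((size_t)size + 1);      /* +1 for the terminator */
        if (buffer != NULL) {
            size_t got = fread(buffer, 1, (size_t)size, fp);
            buffer[got] = '\0';                 /* terminate what was read */
        }
    }
    fclose(fp);
    return buffer;
}

/* Counts newline-separated words in the file, or returns -1 on failure. */
long count_words_in_file(const char *path)
{
    char *buffer = read_whole_file(path);
    if (buffer == NULL)
        return -1;

    long count = 0;
    for (char *token = strtok(buffer, "\r\n"); token != NULL;
         token = strtok(NULL, "\r\n"))
        ++count;

    free(buffer);   /* free only the block malloc returned, never token */
    return count;
}
```

Calling count_words_in_file("large") would then give the word count for the dictionary file from the question, with a single malloc and a single matching free.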

Note that I did not multiply by sizeof(char), because the specification defines it to always be 1.
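If you want the compiler to confirm this, a compile-time check is enough. The sketch below assumes a C11 compiler for _Static_assert, and alloc_string is a hypothetical helper name.

```c
#include <stdlib.h>

/* sizeof(char) is 1 by definition in C, so this assertion cannot fail. */
_Static_assert(sizeof(char) == 1, "sizeof(char) is always 1");

/* Allocates room for `len` characters plus a terminator and returns it
 * as an empty string, or NULL if the allocation fails. */
char *alloc_string(size_t len)
{
    char *s = malloc(len + 1);   /* same as malloc((len + 1) * sizeof(char)) */
    if (s != NULL)
        s[0] = '\0';
    return s;
}
```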