C - Segmentation fault with strcmp?

rth*_*sen 3 c hashtable segmentation-fault

I seem to be getting a segmentation fault somewhere in the strcmp function. I'm still new to C and I can't see why it's giving me the error.

int linear_probe(htable h, char *item, int k){
  int p;
  int step = 1;
  do {
    p = (k + step++) % h->capacity;
  }while(h->keys[p] != NULL && strcmp(h->keys[p], item) != 0);
  return p;
}

GDB:

Program received signal SIGSEGV, Segmentation fault.
0x0000003a8e331856 in __strcmp_ssse3 () from /lib64/libc.so.6

(gdb) frame 1
#1  0x0000000000400ea6 in linear_probe (h=0x603010, item=0x7fffffffde00 "ksjojf", k=-1122175319) at htable.c:52

Edit: the insert code and the htable struct

int htable_insert(htable h, char *item){
  unsigned int k = htable_word_to_int(item);
  int p = k % h->capacity;

  if(NULL == h->keys[p]){
    h->keys[p] = (char *)malloc(strlen(item)+1);
    strcpy(h->keys[p], item);
    h->freqs[p] = 1;
    h->num_keys++;
    return 1;
  }

  if(strcmp(h->keys[p], item) == 0){
    return ++h->freqs[p];
  }

  if(h->num_keys == h->capacity){
    return 0;
  }

  if(h->method == LINEAR_P) p = linear_probe(h, item, k);
  else p = double_hash(h, item, k);

  if(NULL == h->keys[p]){
    h->keys[p] = (char *)malloc(strlen(item)+1);
    strcpy(h->keys[p], item);
    h->freqs[p] = 1;
    h->num_keys++;
    return 1;
  }else if(strcmp(h->keys[p], item) == 0){
    return ++h->freqs[p]; 
  }
  return 0;
}

struct htablerec{
  int num_keys;
  int capacity;
  int *stats;
  char **keys;
  int *freqs;
  hashing_t method;
};

Thanks

Edit: valgrind - I typed in random values to add to the table

sdkgj
fgijdfh
dfkgjgg
jdf
kdjfg
==25643== Conditional jump or move depends on uninitialised value(s)
==25643==    at 0x40107E: htable_insert (htable.c:87)
==25643==    by 0x400AB7: main (main.c:75)
==25643== 
fdkjb
kjdfg
kdfg
nfdg
lkdfg
oijfd
kjsf
vmf
kjdf
kjsfg
fjgd
fgkjfg
==25643== Invalid read of size 8
==25643==    at 0x400E0E: linear_probe (htable.c:51)
==25643==    by 0x401095: htable_insert (htable.c:87)
==25643==    by 0x400AB7: main (main.c:75)
==25643==  Address 0x4c342a0 is not stack'd, malloc'd or (recently) free'd
==25643== 
==25643== Invalid read of size 8
==25643==    at 0x400E2B: linear_probe (htable.c:51)
==25643==    by 0x401095: htable_insert (htable.c:87)
==25643==    by 0x400AB7: main (main.c:75)
==25643==  Address 0x4c342a0 is not stack'd, malloc'd or (recently) free'd
==25643== 
==25643== Invalid read of size 1
==25643==    at 0x4A06C51: strcmp (mc_replace_strmem.c:426)
==25643==    by 0x400E3C: linear_probe (htable.c:51)
==25643==    by 0x401095: htable_insert (htable.c:87)
==25643==    by 0x400AB7: main (main.c:75)
==25643==  Address 0x210 is not stack'd, malloc'd or (recently) free'd
==25643== 
==25643== 
==25643== Process terminating with default action of signal 11 (SIGSEGV)
==25643==  Access not within mapped region at address 0x210
==25643==    at 0x4A06C51: strcmp (mc_replace_strmem.c:426)
==25643==    by 0x400E3C: linear_probe (htable.c:51)
==25643==    by 0x401095: htable_insert (htable.c:87)
==25643==    by 0x400AB7: main (main.c:75)
==25643==  If you believe this happened as a result of a stack
==25643==  overflow in your program's main thread (unlikely but
==25643==  possible), you can try to increase the size of the
==25643==  main thread stack using the --main-stacksize= flag.
==25643==  The main thread stack size used in this run was 8388608.
==25643== 
==25643== HEAP SUMMARY:
==25643==     in use at exit: 1,982 bytes in 28 blocks
==25643==   total heap usage: 28 allocs, 0 frees, 1,982 bytes allocated
==25643== 
==25643== LEAK SUMMARY:
==25643==    definitely lost: 0 bytes in 0 blocks
==25643==    indirectly lost: 0 bytes in 0 blocks
==25643==      possibly lost: 0 bytes in 0 blocks
==25643==    still reachable: 1,982 bytes in 28 blocks
==25643==         suppressed: 0 bytes in 0 blocks
==25643== Rerun with --leak-check=full to see details of leaked memory
==25643== 
==25643== For counts of detected and suppressed errors, rerun with: -v
==25643== Use --track-origins=yes to see where uninitialised values come from
==25643== ERROR SUMMARY: 7 errors from 4 contexts (suppressed: 6 from 6)
Segmentation fault (core dumped)

static unsigned int htable_word_to_int(char *word){
  unsigned int result = 0;
  while(*word != '\0'){
    result = (*word++ + 31 * result);
  }
  return result;
}

pax*_*blo 5

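Aside from the possibility that values in your htable may be invalid pointers (that is, neither NULL nor a pointer to a proper C string), you have a serious problem: the loop never terminates if the table contains neither a NULL nor the string you're looking for.

For the immediate problem, try changing the code to: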

/* Needs <stdio.h>, <string.h>, and <unistd.h> (for fsync). */
#define FLUSH fflush (stdout); fsync (fileno (stdout))

int linear_probe (htable h, char *item, int k) {
    int pos = k;
    do {
        pos = (pos + 1) % h->capacity;
        printf ("========\n");                           FLUSH;
        printf ("inpk: %d\n",   k);                      FLUSH;
        printf ("posn: %d\n",   pos);                    FLUSH;
        printf ("cpct: %d\n",   h->capacity);            FLUSH;
        printf ("keyp: %p\n",   (void *) h->keys[pos]);  FLUSH;
        /* Passing NULL to %s is undefined, so print a placeholder for empty slots. */
        printf ("keys: '%s'\n", h->keys[pos] ? h->keys[pos] : "(null)"); FLUSH;
        printf ("item: '%s'\n", item);                   FLUSH;
        printf ("========\n");                           FLUSH;
    } while ((pos != k)
          && (h->keys[pos] != NULL)
          && (strcmp (h->keys[pos], item) != 0));
    return pos;
}

Those debug statements should give you an indication of what's going wrong.


Since you're getting:

inpk: -2055051140
posn: -30
cpct: 113
keyp: 0x100000001

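right before the crash, it's clear that someone is passing in a bogus value for k. In C89 the result of % with a negative operand is implementation-defined, and from C99 onward it takes the sign of the dividend, so a negative k gives you a negative pos as well. And since h->keys[-30] is undefined behaviour, all bets are off.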
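To make the negative-remainder point concrete, here is a minimal sketch of a helper that folds any int into a valid index. The name wrap_index is hypothetical and not part of the original code; it only illustrates the arithmetic and assumes the table size n is positive.

```c
#include <assert.h>

/* Hypothetical helper (not from the original code): map any int k
   into the range [0, n), even when k is negative. */
static int wrap_index(int k, int n) {
    assert(n > 0);
    int r = k % n;      /* since C99, r is zero or has the sign of k */
    if (r < 0)
        r += n;         /* shift a negative remainder back into range */
    return r;
}
```

With the values from the debug output, (-2055051140 + 1) % 113 comes out as -30 on a C99 compiler, and wrap_index would turn that into slot 83. This hides the symptom rather than fixing the cause, so it is best treated as a safety net.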

Either find and fix the code that's passing the bogus value (probably an uninitialised variable) or protect your function by changing:

int pos = k;

into:

int pos;
if ((k < 0) || (k >= h->capacity))
    k = 0;
pos = k;

at the start of your function. I'd actually do both, but then I'm pretty paranoid :-)


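And, based on yet another update (the hash key calculation): if you generate an unsigned int and then blindly use it as a signed int, you've got a good chance of getting negative values: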

#include <stdio.h>

int main (void) {
    unsigned int x = 0xffff0000U;
    int y = x;
    printf ("%u %d\n", x, y);
    return(0);
}

This outputs:

4294901760 -65536

My suggestion would be to use unsigned integers for values that are clearly meant to be unsigned.
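As a rough sketch of what that could look like, the probe below keeps the hash unsigned end to end and also bounds the loop, so a full table cannot spin forever. The struct probe_table and the function linear_probe_u are illustrative stand-ins, not the asker's actual htable type, and the sketch assumes unused slots in keys are set to NULL.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Minimal stand-in for the question's struct; only the fields used here. */
struct probe_table {
    unsigned int capacity;
    char **keys;            /* unused slots are assumed to be NULL */
};

/* Linear probing with an unsigned hash. Returns the first slot after the
   home slot that is empty or already holds item, or capacity if every
   other slot was checked without success. */
static unsigned int linear_probe_u(const struct probe_table *h,
                                   const char *item, unsigned int hash) {
    assert(h->capacity > 0);
    unsigned int start = hash % h->capacity;   /* always in [0, capacity) */
    for (unsigned int step = 1; step < h->capacity; step++) {
        unsigned int p = (start + step) % h->capacity;
        if (h->keys[p] == NULL || strcmp(h->keys[p], item) == 0)
            return p;
    }
    return h->capacity;     /* table full and item not found */
}
```

The caller would treat a return value equal to capacity as "no slot available". Like the original, this skips the home slot itself, on the assumption that the insert routine has already checked it.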