Why does the Python implementation use 9 times as much memory as the C one?

Question


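I wrote a program, in both Python and C, that builds a list of the prime numbers from 2 up to a given number. I ran both programs to find the primes up to the same number and looked at their respective processes in Activity Monitor. I found that the Python implementation used exactly 9 times as much memory as the C implementation. Why does Python need so much more memory, and why that particular multiple, to store the same array of integers? Here are both implementations of the program: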

Python version:

import math
import sys

top = int(input('Please enter the highest number you would like to have checked: '))
num = 3
prime_list = [2]
while num <= top:
    n = 0
    prime = True
    while int(prime_list[n]) <= math.sqrt(num):
        if num % prime_list[n] == 0:
            prime = False
            n = 0
            break
        n = n + 1
    if prime == True:
        prime_list.append(num)
        prime = False
    num = num + 1
print("I found ", len(prime_list), " primes")
print("The largest prime I found was ", prime_list[-1])


C version:

#include <stdio.h>
#include <stdlib.h>
#include <math.h>
#include <sys/types.h>
#include <unistd.h>

int main(){
    int N;
    int arraySize = 1;
    int *primes = malloc(100*sizeof(int));
    int isPrime = 1;
    primes[0] = 2;
    int timesRealloc = 0;
    int availableSlots = 100;

    printf("Please enter the largest number you want checked: \n");
    scanf("%d", &N);

    int j = 0;
    int i;
    for (i = 3; i <= N; i+=2){
        j = 0;
        isPrime = 1;
        while (primes[j] <= sqrt(i)) {
            if (i%primes[j] == 0) {
                isPrime = 0;
                break;
            }
            j++;
        }
        if (isPrime == 1){
            primes[arraySize] = i;
            arraySize++;
        }
        if (availableSlots == arraySize){
            timesRealloc++;
            availableSlots += 100;
            primes = realloc(primes, availableSlots*sizeof(int));
        }
    }

    printf("I found %d primes\n", arraySize);
    printf("Memory was reallocated %d times\n", timesRealloc);
    printf("The largest prime I found was %d\n", primes[(arraySize-1)]);


    return 0;
}


Answer 1

Ant*_*ala 5

>>> import sys
>>> sys.getsizeof(123456)
28


That is 7 times the size of a C int (4 bytes on typical platforms). In Python 3, integers are instances of struct _longobject, a.k.a. PyLong:

struct _longobject {
    PyVarObject ob_base;
    digit ob_digit[1];
};


Here PyVarObject is

typedef struct {
    PyObject ob_base;
    Py_ssize_t ob_size;
} PyVarObject;


and PyObject is

typedef struct _object {
    Py_ssize_t ob_refcnt;
    struct _typeobject *ob_type;
} PyObject;


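From this we get the following memory usage for the object 123456 in a 64-bit Python build:

8 bytes for the reference counter (type Py_ssize_t)
8 bytes for the pointer to the type object &PyLong_Type (type PyTypeObject *)
8 bytes for ob_size, which records how many digits the variable-length part of the object holds (type Py_ssize_t)
4 bytes for each 30-bit digit of the integer.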

Since 123456 fits in a single 30-bit digit, this adds up to 28 bytes, or 7 * sizeof(int).
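The breakdown above can be checked from inside the interpreter. The following is a minimal sketch that assumes a 64-bit CPython build with 30-bit digits; the field sizes are read via `struct.calcsize` and `sys.int_info` rather than hard-coded, and the exact header layout is an implementation detail that may differ between CPython versions.

```python
import struct
import sys

# Header fields of an int object, assuming a typical 64-bit CPython build.
refcount_bytes = struct.calcsize("n")     # Py_ssize_t ob_refcnt
type_ptr_bytes = struct.calcsize("P")     # struct _typeobject *ob_type
size_bytes = struct.calcsize("n")         # Py_ssize_t ob_size
digit_bytes = sys.int_info.sizeof_digit   # storage for one digit
bits = sys.int_info.bits_per_digit        # value bits held by one digit

header = refcount_bytes + type_ptr_bytes + size_bytes
print("header bytes:", header)
print("bytes per digit:", digit_bytes, "holding", bits, "bits")

# Each additional digit should add digit_bytes to the reported size.
for value in (123456, 2**bits, 2**(2 * bits)):
    print(value, "->", sys.getsizeof(value), "bytes")
```

Under those assumptions the first value needs one digit, the second two, and the third three, so the reported sizes should grow in steps of `digit_bytes`.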

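On top of that, each element of a Python list is a PyObject * pointing to the actual object. These pointers are 64 bits wide in a 64-bit Python build, so the reference held by each list element alone takes twice as much memory as a C int.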

Add the 7 and the 2 together and you get 9.
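The same arithmetic can be written out as a short sketch. It assumes a 64-bit CPython build and a 4-byte C int, and it deliberately ignores the list's own header and any spare capacity the list reserves, so it is an estimate of the per-element cost, not an exact measurement of the process.

```python
import struct
import sys

c_int_bytes = struct.calcsize("i")     # sizeof(int) in the C version
pointer_bytes = struct.calcsize("P")   # one PyObject * slot in the list
object_bytes = sys.getsizeof(123456)   # the int object the slot points to

per_element = object_bytes + pointer_bytes
print("bytes per list element:", per_element)
print("ratio to a C int:", per_element / c_int_bytes)
```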

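For more memory-efficient code, you can use array; with the type code 'i', memory consumption should be very close to that of the C version. An array has an append method, which makes growing it even easier than in C with realloc.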
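As an illustration of that suggestion, here is a minimal sketch of the same trial-division search storing its results in an `array` instead of a list. The function name `primes_up_to` is hypothetical, and `math.isqrt` (Python 3.8+) replaces the original `math.sqrt` comparison; this is one possible rewrite, not the asker's code.

```python
import math
from array import array

def primes_up_to(top):
    """Return the primes from 2 to top as a compact array of C ints."""
    if top < 2:
        return array('i')
    # Type code 'i' stores each value as a raw C int, not a Python object.
    # Appending a value that does not fit in a C int raises OverflowError.
    primes = array('i', [2])
    for num in range(3, top + 1, 2):   # even numbers above 2 are never prime
        limit = math.isqrt(num)
        is_prime = True
        for p in primes:
            if p > limit:              # no divisor found up to sqrt(num)
                break
            if num % p == 0:
                is_prime = False
                break
        if is_prime:
            primes.append(num)         # the array grows on demand, no realloc
    return primes

primes = primes_up_to(100)
print("I found", len(primes), "primes")
print("The largest prime I found was", primes[-1])
print("Bytes per stored element:", primes.itemsize)
```

For `top = 100` this finds 25 primes, the largest being 97; `itemsize` reports the per-element storage, which is typically 4 bytes for type code 'i'.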

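OP could use the [array module](https://docs.python.org/3.7/library/array.html) to reduce memory usage, though probably at some cost in speed, because values are converted back and forth on every access. (2 upvotes)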
