按位或“ |” 与加号“ +”在Python中为2的正幂

Sta*_*czo 7 python bitwise-operators

让我们考虑一下这种特殊情况,在这种情况下我想传递一组某些对象的状态。为了方便和灵活(或任意),我选择使用二进制状态,然后将这些状态使用按位或“ |”连接 在我通过它们之前:

status_a = 0b1
status_b = 0b10
status_c = 0b100

statuses_to_pass = status_a | status_c  # 0b101
Run Code Online (Sandbox Code Playgroud)

然后我意识到,在这种情况下,我也可以使用加法算术运算符“ +”:

status_a | status_c == status_a + status_c  
#             0b101 == 0b101  -->  True
Run Code Online (Sandbox Code Playgroud)

当然,当状态为2的正幂时,情况就是这样。还有一些其他警告,例如:

status_a | status_c | status_c == status_a + status_c + status_c  
#                        0b101 == 0b1001  -->  False
Run Code Online (Sandbox Code Playgroud)

但是,让我们假设我处于限制范围之内-是否有任何理由使按位运算符比算术运算符更好?Python背后有东西吗?哪一个更快?也许还有其他我没有想到的副作用?

L3v*_*han 5

实验timeit表明,在以下情况下添加速度更快:

import timeit
import statistics
times = {"+": [], "|": []}

for x in range(10):
    for y in range(x+1, 10):
        for op in "+|":
            t = timeit.timeit(stmt="x {} y".format(op), setup="x=2**{};y=2**{}".format(x, y))
            times[op].append(t)


statistics.mean(times["+"])  # 0.029464346377385986
statistics.mean(times["|"])  # 0.04432822428643703
Run Code Online (Sandbox Code Playgroud)

  • 随着参数的长度(以位为单位)增长并超过了ALU可以本地处理的最大长度,或运算会变得更快。 (2认同)

bat*_*mas 5

当我们进一步查看Python源代码时,我们注意到运算符调用了不同的函数。加法运算符调用,binary_op1()而或运算符调用binary_op()

Python加法运算符(第955行)

PyObject *
PyNumber_Add(PyObject *v, PyObject *w)
{
    PyObject *result = binary_op1(v, w, NB_SLOT(nb_add));
    if (result == Py_NotImplemented) {
        PySequenceMethods *m = v->ob_type->tp_as_sequence;
        Py_DECREF(result);
        if (m && m->sq_concat) {
            return (*m->sq_concat)(v, w);
        }
        result = binop_type_error(v, w, "+");
    }
    return result;
}
Run Code Online (Sandbox Code Playgroud)

Python OR运算符(第941行)

#define BINARY_FUNC(func, op, op_name) \
    PyObject * \
    func(PyObject *v, PyObject *w) { \
        return binary_op(v, w, NB_SLOT(op), op_name); \
    }

BINARY_FUNC(PyNumber_Or, nb_or, "|")
Run Code Online (Sandbox Code Playgroud)

我们可能认为OR运算符会比加法运算符快,但是OR运算符有更多的代码要执行。在Python中,OR运算符的速度较慢,因为binary_op()调用binary_op1()

binary_op(834行)

static PyObject *
binary_op(PyObject *v, PyObject *w, const int op_slot, const char *op_name)
{
    PyObject *result = binary_op1(v, w, op_slot);
    if (result == Py_NotImplemented) {
        Py_DECREF(result);

        if (op_slot == NB_SLOT(nb_rshift) &&
            PyCFunction_Check(v) &&
            strcmp(((PyCFunctionObject *)v)->m_ml->ml_name, "print") == 0)
        {
            PyErr_Format(PyExc_TypeError,
                "unsupported operand type(s) for %.100s: "
                "'%.100s' and '%.100s'. Did you mean \"print(<message>, "
                "file=<output_stream>)\"?",
                op_name,
                v->ob_type->tp_name,
                w->ob_type->tp_name);
            return NULL;
        }

        return binop_type_error(v, w, op_name);
    }
    return result;
}
Run Code Online (Sandbox Code Playgroud)

binary_op1(第785行)

static PyObject *
binary_op1(PyObject *v, PyObject *w, const int op_slot)
{
    PyObject *x;
    binaryfunc slotv = NULL;
    binaryfunc slotw = NULL;

    if (v->ob_type->tp_as_number != NULL)
        slotv = NB_BINOP(v->ob_type->tp_as_number, op_slot);
    if (w->ob_type != v->ob_type &&
        w->ob_type->tp_as_number != NULL) {
        slotw = NB_BINOP(w->ob_type->tp_as_number, op_slot);
        if (slotw == slotv)
            slotw = NULL;
    }
    if (slotv) {
        if (slotw && PyType_IsSubtype(w->ob_type, v->ob_type)) {
            x = slotw(v, w);
            if (x != Py_NotImplemented)
                return x;
            Py_DECREF(x); /* can't do it */
            slotw = NULL;
        }
        x = slotv(v, w);
        if (x != Py_NotImplemented)
            return x;
        Py_DECREF(x); /* can't do it */
    }
    if (slotw) {
        x = slotw(v, w);
        if (x != Py_NotImplemented)
            return x;
        Py_DECREF(x); /* can't do it */
    }
    Py_RETURN_NOTIMPLEMENTED;
}
Run Code Online (Sandbox Code Playgroud)

上面的代码片段abstract.c来自GitHub上CPython项目

最显着的区别是在中的实现longobject.c。使用较小的数字时,加法运算速度更快,效率更高。数字越大,则OR运算符与加法运算符相比变得越快。

x_add(第3020行)

static PyLongObject *
x_add(PyLongObject *a, PyLongObject *b)
{
    Py_ssize_t size_a = Py_ABS(Py_SIZE(a)), size_b = Py_ABS(Py_SIZE(b));
    PyLongObject *z;
    Py_ssize_t i;
    digit carry = 0;

    /* Ensure a is the larger of the two: */
    if (size_a < size_b) {
        { PyLongObject *temp = a; a = b; b = temp; }
        { Py_ssize_t size_temp = size_a;
            size_a = size_b;
            size_b = size_temp; }
    }
    z = _PyLong_New(size_a+1);
    if (z == NULL)
        return NULL;
    for (i = 0; i < size_b; ++i) {
        carry += a->ob_digit[i] + b->ob_digit[i];
        z->ob_digit[i] = carry & PyLong_MASK;
        carry >>= PyLong_SHIFT;
    }
    for (; i < size_a; ++i) {
        carry += a->ob_digit[i];
        z->ob_digit[i] = carry & PyLong_MASK;
        carry >>= PyLong_SHIFT;
    }
    z->ob_digit[i] = carry;
    return long_normalize(z);
}
Run Code Online (Sandbox Code Playgroud)

long_bitwise(第4423行)

static PyObject *
long_bitwise(PyLongObject *a,
             char op,  /* '&', '|', '^' */
             PyLongObject *b)
{
    int nega, negb, negz;
    Py_ssize_t size_a, size_b, size_z, i;
    PyLongObject *z;

    /* Bitwise operations for negative numbers operate as though
       on a two's complement representation.  So convert arguments
       from sign-magnitude to two's complement, and convert the
       result back to sign-magnitude at the end. */

    /* If a is negative, replace it by its two's complement. */
    size_a = Py_ABS(Py_SIZE(a));
    nega = Py_SIZE(a) < 0;
    if (nega) {
        z = _PyLong_New(size_a);
        if (z == NULL)
            return NULL;
        v_complement(z->ob_digit, a->ob_digit, size_a);
        a = z;
    }
    else
        /* Keep reference count consistent. */
        Py_INCREF(a);

    /* Same for b. */
    size_b = Py_ABS(Py_SIZE(b));
    negb = Py_SIZE(b) < 0;
    if (negb) {
        z = _PyLong_New(size_b);
        if (z == NULL) {
            Py_DECREF(a);
            return NULL;
        }
        v_complement(z->ob_digit, b->ob_digit, size_b);
        b = z;
    }
    else
        Py_INCREF(b);

    /* Swap a and b if necessary to ensure size_a >= size_b. */
    if (size_a < size_b) {
        z = a; a = b; b = z;
        size_z = size_a; size_a = size_b; size_b = size_z;
        negz = nega; nega = negb; negb = negz;
    }

    /* JRH: The original logic here was to allocate the result value (z)
       as the longer of the two operands.  However, there are some cases
       where the result is guaranteed to be shorter than that: AND of two
       positives, OR of two negatives: use the shorter number.  AND with
       mixed signs: use the positive number.  OR with mixed signs: use the
       negative number.
    */
    switch (op) {
    case '^':
        negz = nega ^ negb;
        size_z = size_a;
        break;
    case '&':
        negz = nega & negb;
        size_z = negb ? size_a : size_b;
        break;
    case '|':
        negz = nega | negb;
        size_z = negb ? size_b : size_a;
        break;
    default:
        PyErr_BadArgument();
        return NULL;
    }

    /* We allow an extra digit if z is negative, to make sure that
       the final two's complement of z doesn't overflow. */
    z = _PyLong_New(size_z + negz);
    if (z == NULL) {
        Py_DECREF(a);
        Py_DECREF(b);
        return NULL;
    }

    /* Compute digits for overlap of a and b. */
    switch(op) {
    case '&':
        for (i = 0; i < size_b; ++i)
            z->ob_digit[i] = a->ob_digit[i] & b->ob_digit[i];
        break;
    case '|':
        for (i = 0; i < size_b; ++i)
            z->ob_digit[i] = a->ob_digit[i] | b->ob_digit[i];
        break;
    case '^':
        for (i = 0; i < size_b; ++i)
            z->ob_digit[i] = a->ob_digit[i] ^ b->ob_digit[i];
        break;
    default:
        PyErr_BadArgument();
        return NULL;
    }

    /* Copy any remaining digits of a, inverting if necessary. */
    if (op == '^' && negb)
        for (; i < size_z; ++i)
            z->ob_digit[i] = a->ob_digit[i] ^ PyLong_MASK;
    else if (i < size_z)
        memcpy(&z->ob_digit[i], &a->ob_digit[i],
               (size_z-i)*sizeof(digit));

    /* Complement result if negative. */
    if (negz) {
        Py_SIZE(z) = -(Py_SIZE(z));
        z->ob_digit[size_z] = PyLong_MASK;
        v_complement(z->ob_digit, z->ob_digit, size_z+1);
    }

    Py_DECREF(a);
    Py_DECREF(b);
    return (PyObject *)maybe_small_long(long_normalize(z));
}
Run Code Online (Sandbox Code Playgroud)

  • 我不认为`binary_op`和`binary_op1`之间的差异是造成性能差异的唯一原因。我尝试了其他调用`binary_op`的操作(例如`-`和`&`),其中大多数操作比`|`运行得更快。尽管`+`和`-`的运行时间相似,尽管后者称为`binary_op`。 (2认同)
  • 这是用于处理抽象数字,而不仅仅是`int`s。对于`int`,可以在[here](https://github.com/python/cpython/blob/762f93ff2efd6b7ef0177cad57939c0ab2002eac/Objects/longobject.c#L5688-L5723)中找到。最终,`|运行[this](https://github.com/python/cpython/blob/762f93ff2efd6b7ef0177cad57939c0ab2002eac/Objects/longobject.c#L4716-L4835),而`+`将会运行[this](https:// github.com/python/cpython/blob/762f93ff2efd6b7ef0177cad57939c0ab2002eac/Objects/longobject.c#L3151-L3181)。 (2认同)