Why is there a large insert performance difference between the SQLAlchemy Boolean and Integer types in Python?

Jam*_*ngo 7 python sqlite sqlalchemy

Storing the same value in a SQLite database as either a Boolean or an Integer, using Python and SQLAlchemy, produces the following results.

Value stored as Boolean:
SqlAlchemy ORM: Total time for 40000 records 62.5009999275 secs
SqlAlchemy Core: Total time for 40000 records 56.0600001812 secs
Value stored as Integer:
SqlAlchemy ORM: Total time for 40000 records 5.72099995613 secs
SqlAlchemy Core: Total time for 40000 records 0.770999908447 secs

Why does using the Boolean type cause such a performance problem?

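I know that SQLite has no concept of a Boolean type and instead stores such values as the integers 1 (True) or 0 (False). I would have expected SQLAlchemy to map a Python bool to a SQLite integer.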
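The storage behaviour described above can be checked without SQLAlchemy. The following is a minimal sketch using only the standard-library sqlite3 module; the table name `t` is hypothetical and the in-memory database is an assumption made to keep the example self-contained.

```python
import sqlite3

# In-memory database so the sketch leaves no file behind (an assumption;
# the question's script writes to sqlalchemy.db instead).
conn = sqlite3.connect(":memory:")

# "t" is a hypothetical table; BOOLEAN is accepted as a declared type name,
# but SQLite has no separate Boolean storage class.
conn.execute("CREATE TABLE t (value BOOLEAN)")
conn.execute("INSERT INTO t (value) VALUES (?)", (True,))
conn.execute("INSERT INTO t (value) VALUES (?)", (False,))

# typeof() reports the storage class SQLite actually used for each value.
rows = conn.execute(
    "SELECT value, typeof(value) FROM t ORDER BY rowid"
).fetchall()
print(rows)
conn.close()
```

Both rows come back with the storage class `integer`, so at the driver level a bool and an int take the same path into the database.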

The script used to produce the output above (modified from this question):

import time
import sqlite3

from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy import Column, Integer, String,  create_engine, Boolean
from sqlalchemy.orm import scoped_session, sessionmaker

Base = declarative_base()
DBSession = scoped_session(sessionmaker())

class CustomerInteger(Base):
    __tablename__ = "customerInteger"
    id = Column(Integer, primary_key=True)
    name = Column(String(255))
    value = Column(Integer)

class CustomerBoolean(Base):
    __tablename__ = "customerBoolean"
    id = Column(Integer, primary_key=True)
    name = Column(String(255))
    value = Column(Boolean)

def init_sqlalchemy(dbname = 'sqlite:///sqlalchemy.db'):
    global engine
    engine = create_engine(dbname, echo=False)
    DBSession.remove()
    DBSession.configure(bind=engine, autoflush=False, expire_on_commit=False)
    Base.metadata.drop_all(engine)
    Base.metadata.create_all(engine)

def test_sqlalchemy_orm(n, table):
    init_sqlalchemy()
    t0 = time.time()
    for i in range(n):
        customer = table()
        customer.name = 'NAME ' + str(i)
        customer.value = True
        DBSession.add(customer)
        if i % 1000 == 0:
            DBSession.flush()
    DBSession.commit()
    print("SqlAlchemy ORM: Total time for " + str(n) + " records " + str(time.time() - t0) + " secs")


def test_sqlalchemy_core(n, table):
    init_sqlalchemy()
    t0 = time.time()
    # Passing a list of dicts performs one bulk insert instead of n separate ones.
    engine.execute(
        table.__table__.insert(),
        [{"name":'NAME ' + str(i), "value":True } for i in range(n)]
    )
    print("SqlAlchemy Core: Total time for " + str(n) + " records " + str(time.time() - t0) + " secs")


if __name__ == '__main__':

    # Run both tests against the Boolean column first, then the Integer column.
    print("Value stored as Boolean:")
    test_sqlalchemy_orm(40000, CustomerBoolean)
    test_sqlalchemy_core(40000, CustomerBoolean)

    print("Value stored as Integer:")
    test_sqlalchemy_orm(40000, CustomerInteger)
    test_sqlalchemy_core(40000, CustomerInteger)

vvl*_*rov 3

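I ran the test on three configurations. There is a difference between the Boolean and Integer run times, but nowhere near 10x. You may want to try switching to a different Python version.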

P.S. I ran the tests on a machine with a Core i5 M430 CPU running Windows 8.
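To compare your setup with the configurations listed in this answer, it helps to print the same version information. This is a small sketch; the helper name `environment_report` is made up for illustration, and the SQLAlchemy import is guarded in case the library is not installed.

```python
import platform
import sys


def environment_report():
    """Collect the interpreter, SQLAlchemy, and OS versions (hypothetical helper)."""
    try:
        import sqlalchemy
        sa_version = sqlalchemy.__version__
    except ImportError:
        # Assumption: report a placeholder rather than fail without SQLAlchemy.
        sa_version = "not installed"
    return {
        "python": sys.version,
        "sqlalchemy": sa_version,
        "platform": platform.platform(),
    }


report = environment_report()
for key in ("python", "sqlalchemy", "platform"):
    print(key + ": " + report[key])
```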

Also, I suggest running a profiler to see where SQLAlchemy spends so much time on your system.
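One way to do that is with the standard-library cProfile module. The sketch below is only an illustration: `profile_call` and `insert_rows` are hypothetical names, and `insert_rows` is a plain sqlite3 stand-in workload so the example runs without SQLAlchemy.

```python
import cProfile
import io
import pstats
import sqlite3


def profile_call(func, *args, **kwargs):
    """Run func under cProfile; return (result, text report)."""
    profiler = cProfile.Profile()
    result = profiler.runcall(func, *args, **kwargs)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(15)
    return result, stream.getvalue()


def insert_rows(n):
    """Stand-in workload (an assumption): bulk insert n rows with sqlite3."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, value BOOLEAN)"
    )
    conn.executemany(
        "INSERT INTO customer (name, value) VALUES (?, ?)",
        (("NAME " + str(i), True) for i in range(n)),
    )
    conn.commit()
    count = conn.execute("SELECT COUNT(*) FROM customer").fetchone()[0]
    conn.close()
    return count


count, report = profile_call(insert_rows, 1000)
print(report)
```

To profile the script from the question, the same helper could be called as `profile_call(test_sqlalchemy_core, 40000, CustomerBoolean)` and again with `CustomerInteger`; comparing the two reports should show which functions account for the extra time.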

1)

python: 2.6.2 (r262:71605, Apr 14 2009, 22:40:02) [MSC v.1500 32 bit (Intel)]
sqlalchemy: 0.7.8
Value stored as Boolean:
SqlAlchemy ORM: Total time for 40000 records 8.84400010109 secs
SqlAlchemy Core: Total time for 40000 records 0.725000143051 secs
Value stored as Integer:
SqlAlchemy ORM: Total time for 40000 records 8.0680000782 secs
SqlAlchemy Core: Total time for 40000 records 0.443000078201 secs

2)

python: 2.7.2 (default, Jun 12 2011, 15:08:59) [MSC v.1500 32 bit (Intel)]
sqlalchemy: 0.8.1
Value stored as Boolean:
SqlAlchemy ORM: Total time for 40000 records 9.69299983978 secs
SqlAlchemy Core: Total time for 40000 records 0.572000026703 secs
Value stored as Integer:
SqlAlchemy ORM: Total time for 40000 records 9.35899996758 secs
SqlAlchemy Core: Total time for 40000 records 0.40700006485 secs

3)

python: 3.2.3 (default, Apr 11 2012, 07:15:24) [MSC v.1500 32 bit (Intel)]
sqlalchemy: 0.8.1
Value stored as Boolean:
SqlAlchemy ORM: Total time for 40000 records 8.531000137329102 secs
SqlAlchemy Core: Total time for 40000 records 0.7139999866485596 secs
Value stored as Integer:
SqlAlchemy ORM: Total time for 40000 records 8.023000001907349 secs
SqlAlchemy Core: Total time for 40000 records 0.44099998474121094 secs