Mih*_*hin 9 python orm sqlalchemy eager-loading
Suppose we have this originally generated query:
SELECT company.x AS company_x, ...
FROM company
LEFT OUTER JOIN acc ON acc.id = company.acc
LEFT OUTER JOIN usercomp_links ON company.id = usercomp_links.pid
LEFT OUTER JOIN usergro_links ON acc.id = usergro_links.pid
WHERE usergro_links.eid = %s OR usercomp_links.eid = %s
If we add .options(subqueryload(Company.childs)) to it, we get:
SELECT company.x AS company_x, ..., anon_1.company_id AS anon_1_company_id
FROM (
SELECT company.id AS company_id
FROM company
LEFT OUTER JOIN acc ON acc.id = company.acc
LEFT OUTER JOIN usercomp_links ON company.id = usercomp_links.pid
LEFT OUTER JOIN usergro_links ON acc.id = usergro_links.pid
WHERE usergro_links.eid = %s OR usercomp_links.eid = %s) AS anon_1
INNER JOIN acel_links AS acel_links_1 ON anon_1.company_id = acel_links_1.eid
INNER JOIN company ON company.id = acel_links_1.pid ORDER BY anon_1.company_id
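This is slow. If I took the company IDs from the first query and loaded all the child companies manually, it would be far faster than what we get here.
I have read the documentation and looked at the code, but I can't tell whether I can make sqlalchemy simply take the IDs from the first query's result and load the children in a separate, comparatively simple query. I'm not relying on this sample alone: I have harder cases where sqlalchemy can't load the constructed query at all. And why redo all the work of the first query a second time?
So does anyone know how to eager load without the automatically constructed "join from join from join" style?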
Update: the "select in" strategy is now implemented in SQLAlchemy (as of v1.2): see Select IN loading in the documentation.
TLDR:
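I think the joinedload strategy should be used where possible, because it is more efficient than the other strategies, including the one suggested in the question: loading the related data with an "IN" statement.
The "IN" strategy is easy enough to implement "outside" of SQLAlchemy (see the code below), and implementing it as a new loading strategy probably wouldn't be complex, since it is logically similar to the existing subqueryload strategy.
Full version:
I started with a simple experiment to see the queries produced by the different strategies.
The full source code of the experiment is on Github.
My models look like this: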
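from sqlalchemy import Column, ForeignKey, Integer, String
from sqlalchemy.orm import backref, relationship

# Assumption: ModelBase is a declarative base class; its definition is not shown in the post.
class Author(ModelBase):
    __tablename__ = 'authors'
    id = Column(Integer, primary_key=True, nullable=False)
    name = Column(String(255))

class Book(ModelBase):
    __tablename__ = 'books'
    id = Column(Integer, primary_key=True)
    name = Column(String)
    author_id = Column(Integer, ForeignKey('authors.id'))
    # Many-to-one relationship; the backref also adds Author.books
    author = relationship(
        'Author', backref=backref('books'))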
Now the tests. First, lazy loading:
books = session.query(Book).all()
print(books[0].author.name)
session.commit()
Output (cleaned up):
-------------Lazy--------------
sqlalchemy.engine.base.Engine:
SELECT
books.id AS books_id, books.name AS books_name, books.author_id AS books_author_id
FROM books
SELECT
authors.id AS authors_id, authors.name AS authors_name
FROM authors
WHERE authors.id = ?
INFO:sqlalchemy.engine.base.Engine:(1,)
author1
As expected, lazy loading runs one query to fetch the books, plus another query each time we access an author that has not been loaded yet.
Subquery loading:
books = session.query(Book).options(subqueryload(Book.author)).all()
print(books[0].author.name)
session.commit()
-------------Subquery----------
SELECT
books.id AS books_id, books.name AS books_name, books.author_id AS books_author_id
FROM books
SELECT
authors.id AS authors_id, authors.name AS authors_name,
anon_1.books_author_id AS anon_1_books_author_id
FROM (
SELECT DISTINCT books.author_id AS books_author_id
FROM books) AS anon_1
JOIN authors
ON authors.id = anon_1.books_author_id
ORDER BY anon_1.books_author_id
author1
With subquery loading we get two queries: the first fetches the books, and the second fetches the authors using a subquery.
Joined loading:
books = session.query(Book).options(joinedload(Book.author)).all()
print(books[0].author.name)
session.commit()
-------------Joined------------
SELECT
books.id AS books_id, books.name AS books_name,
books.author_id AS books_author_id,
authors_1.id AS authors_1_id, authors_1.name AS authors_1_name
FROM books
LEFT OUTER JOIN authors AS authors_1 ON authors_1.id = books.author_id
author1
The joined strategy runs just one query to fetch both the books and the authors.
Immediate loading:
books = session.query(Book).options(immediateload(Book.author)).all()
print(books[0].author.name)
session.commit()
-------------Immediate---------
SELECT
books.id AS books_id, books.name AS books_name, books.author_id AS books_author_id
FROM books
SELECT
authors.id AS authors_id, authors.name AS authors_name
FROM authors
WHERE authors.id = ?
INFO:sqlalchemy.engine.base.Engine:(1,)
SELECT authors.id AS authors_id, authors.name AS authors_name
FROM authors
WHERE authors.id = ?
INFO:sqlalchemy.engine.base.Engine:(2,)
author1
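The immediate strategy loads the books with the first query and then fetches the related data with a separate query for each related record. As the log shows, both author queries run while the books are being loaded, before any relationship is accessed.
It looks like joinedload() should be the most efficient in most cases (more efficient than the "IN" strategy): we get all the data with a single query.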
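The comparison above can be reproduced by counting the statements each strategy sends to the database. The following is only a minimal sketch, assuming SQLAlchemy 1.4 or newer and an in-memory SQLite database; the `count_queries` helper, the `record_statement` listener and the seed data are hypothetical and not taken from the original experiment.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, event
from sqlalchemy.orm import (declarative_base, immediateload, joinedload,
                            relationship, sessionmaker, subqueryload)

Base = declarative_base()


class Author(Base):
    __tablename__ = "authors"
    id = Column(Integer, primary_key=True)
    name = Column(String(255))


class Book(Base):
    __tablename__ = "books"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    author_id = Column(Integer, ForeignKey("authors.id"))
    author = relationship("Author", backref="books")


engine = create_engine("sqlite://")  # assumption: in-memory SQLite
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

# Seed data (hypothetical): four books shared by two authors.
seed = Session()
a1, a2 = Author(name="author1"), Author(name="author2")
seed.add_all([Book(name="book1", author=a1), Book(name="book2", author=a1),
              Book(name="book3", author=a2), Book(name="book4", author=a2)])
seed.commit()
seed.close()

statements = []


@event.listens_for(engine, "before_cursor_execute")
def record_statement(conn, cursor, statement, parameters, context, executemany):
    # Record every SQL statement sent to the DBAPI cursor.
    statements.append(statement)


def count_queries(*options):
    """Load all books in a fresh session, touch every author, and
    return how many SQL statements were emitted along the way."""
    del statements[:]
    session = Session()
    try:
        query = session.query(Book)
        if options:
            query = query.options(*options)
        for book in query.all():
            _ = book.author.name  # triggers a lazy load if not loaded yet
    finally:
        session.close()
    return len(statements)


counts = {
    "lazy": count_queries(),
    "subqueryload": count_queries(subqueryload(Book.author)),
    "joinedload": count_queries(joinedload(Book.author)),
    "immediateload": count_queries(immediateload(Book.author)),
}
for strategy, n in counts.items():
    print(strategy, n)
```

The counts depend on the data: strategies that issue one query per related record scale with the number of distinct authors, while joinedload and subqueryload stay constant.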
Now, let's try to implement the IN strategy outside of SQLAlchemy:
print('-------------IN----------------')
books = session.query(Book).all()
# Collect the distinct author IDs referenced by the loaded books
ids = set()
for b in books:
    ids.add(b.author_id)
# Keep a reference to the result: it keeps the authors in the session's identity map
authors = session.query(Author).filter(Author.id.in_(ids)).all()
print(books[0].author.name)
print(books[1].author.name)
print(books[2].author.name)
print(books[3].author.name)
Output:
-------------IN----------------
SELECT
books.id AS books_id, books.name AS books_name, books.author_id AS books_author_id
FROM books
SELECT authors.id AS authors_id, authors.name AS authors_name
FROM authors
WHERE authors.id IN (?, ?)
INFO:sqlalchemy.engine.base.Engine:(1, 2)
author1
author1
author2
author2
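As we can see, it runs two queries, and then we can access all the authors.
Note that we never explicitly joined the authors to the books, yet accessing an author through a book still works without further database queries, because SQLAlchemy finds the author records in its internal identity map.
Code like the "IN" strategy above can be generalized into a function that works with any model/relationship. The "IN" strategy should probably also be relatively easy to implement as a new SQLAlchemy strategy: it is similar to the existing subqueryload strategy, which likewise runs a second query to fetch the related data.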
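One way such a generalization might look for many-to-one relationships is sketched below. `preload_many_to_one` is a hypothetical helper, not part of SQLAlchemy; the sketch assumes SQLAlchemy 1.4 or newer, an in-memory SQLite database, made-up seed data, and a target model with a single-column primary key that the foreign key points to.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, event
from sqlalchemy import inspect as sa_inspect
from sqlalchemy.orm import declarative_base, relationship, sessionmaker

Base = declarative_base()


class Author(Base):
    __tablename__ = "authors"
    id = Column(Integer, primary_key=True)
    name = Column(String(255))


class Book(Base):
    __tablename__ = "books"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    author_id = Column(Integer, ForeignKey("authors.id"))
    author = relationship("Author", backref="books")


def preload_many_to_one(session, parents, fk_attr, target_cls):
    """Hypothetical helper: fetch the targets of a many-to-one
    relationship with a single IN query.

    Assumes target_cls has a single-column primary key referenced by fk_attr.
    The caller should keep the returned list alive, because the session's
    identity map only holds weak references to unmodified objects.
    """
    ids = {getattr(parent, fk_attr) for parent in parents}
    ids.discard(None)  # parents without a related row need no lookup
    if not ids:
        return []
    pk_column = sa_inspect(target_cls).primary_key[0]
    return session.query(target_cls).filter(pk_column.in_(sorted(ids))).all()


engine = create_engine("sqlite://")  # assumption: in-memory SQLite
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

# Seed data (hypothetical): four books shared by two authors.
seed = Session()
a1, a2 = Author(name="author1"), Author(name="author2")
seed.add_all([Book(name="book1", author=a1), Book(name="book2", author=a1),
              Book(name="book3", author=a2), Book(name="book4", author=a2)])
seed.commit()
seed.close()

statements = []


@event.listens_for(engine, "before_cursor_execute")
def record_statement(conn, cursor, statement, parameters, context, executemany):
    # Record every SQL statement sent to the DBAPI cursor.
    statements.append(statement)


session = Session()
books = session.query(Book).order_by(Book.id).all()                  # first query
authors = preload_many_to_one(session, books, "author_id", Author)   # second query
# The authors are now resolved from the identity map, without further SQL.
names = [book.author.name for book in books]
print(len(statements), names)
```

Passing the foreign key attribute by name keeps the sketch short; a fuller version could derive it from the relationship's metadata instead.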
http://docs.sqlalchemy.org/en/latest/orm/loading_relationships.html#sqlalchemy.orm.selectinload
It has been added to SQLAlchemy, so now you can just use the selectinload strategy.
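For the one-to-many case from the question (parents with a collection of children), a minimal sketch of selectinload could look like this. It assumes SQLAlchemy 1.4 or newer and an in-memory SQLite database, and it reuses the Author/Book models from the answer above in place of the Company model, whose definition is not shown; the seed data and the `record_statement` listener are hypothetical.

```python
from sqlalchemy import Column, ForeignKey, Integer, String, create_engine, event
from sqlalchemy.orm import declarative_base, relationship, selectinload, sessionmaker

Base = declarative_base()


class Author(Base):
    __tablename__ = "authors"
    id = Column(Integer, primary_key=True)
    name = Column(String(255))


class Book(Base):
    __tablename__ = "books"
    id = Column(Integer, primary_key=True)
    name = Column(String)
    author_id = Column(Integer, ForeignKey("authors.id"))
    author = relationship("Author", backref="books")


engine = create_engine("sqlite://")  # assumption: in-memory SQLite
Base.metadata.create_all(engine)
Session = sessionmaker(bind=engine)

# Seed data (hypothetical): four books shared by two authors.
seed = Session()
a1, a2 = Author(name="author1"), Author(name="author2")
seed.add_all([Book(name="book1", author=a1), Book(name="book2", author=a1),
              Book(name="book3", author=a2), Book(name="book4", author=a2)])
seed.commit()
seed.close()

statements = []


@event.listens_for(engine, "before_cursor_execute")
def record_statement(conn, cursor, statement, parameters, context, executemany):
    # Record every SQL statement sent to the DBAPI cursor.
    statements.append(statement)


session = Session()
# One query for the authors, then one "WHERE ... IN (...)" query for all their
# books, using the primary keys collected from the first result.
authors = (
    session.query(Author)
    .options(selectinload(Author.books))
    .order_by(Author.id)
    .all()
)
titles = {a.name: sorted(b.name for b in a.books) for a in authors}
session.close()

for statement in statements:
    print(statement)
```

Unlike subqueryload, the second statement does not embed the original query again, which is the behavior the question asks for.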