Dav*_*man 4 python postgresql sqlalchemy
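I have a query set up in SQLAlchemy that was running a bit slowly, so I tried to optimize it. For reasons I can't work out, the result uses an implicit cross join, which is not only significantly slower but also produces entirely wrong results. I've anonymized the table names and arguments, but otherwise made no changes. Does anyone know where this is coming from?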
To make it easier to spot: the difference between the new and old emitted SQL is that the new one has a longer SELECT and lists all three tables in the FROM clause before any JOIN.
Original code:

cust_name = u'Bob'
proj_name = u'job1'
item_color = u'blue'
query = (db.session.query(Item.name)
         .join(Project, Customer)
         .filter(Customer.name == cust_name,
                 Project.name == proj_name)
         .distinct(Item.name))

# some conditionals determining last filter, resolving to this one:
query = query.filter(Item.color == item_color)

result = query.all()

Original emitted SQL, as logged by flask_sqlalchemy.get_debug_queries:

QUERY: SELECT DISTINCT ON (items.name) items.name AS items_name
FROM items JOIN projects ON projects.id = items._project_id JOIN customers ON customers.id = projects._customer_id
WHERE customers.name = %(name_1)s AND projects.name = %(name_2)s AND items.color = %(color_1)s
Parameters: {'name_2': u'job1', 'state_1': u'blue', 'name_1': u'Bob'}

New code:

cust_name = u'Bob'
proj_name = u'job1'
item_color = u'blue'
query = (db.session.query(Item)
         .options(Load(Item).load_only('name', 'color'),
                  joinedload(Item.project, innerjoin=True).load_only('name').
                  joinedload(Project.customer, innerjoin=True).load_only('name'))
         .filter(Customer.name == cust_name,
                 Project.name == proj_name)
         .distinct(Item.name))

# some conditionals determining last filter, resolving to this one:
query = query.filter(Item.color == item_color)

result = query.all()

New emitted SQL, as logged by flask_sqlalchemy.get_debug_queries:

QUERY: SELECT DISTINCT ON (items.nygc_id) items.id AS items_id, items.name AS items_name, items.color AS items_color, items._project_id AS items__project_id, customers_1.id AS customers_1_id, customers_1.name AS customers_1_name, projects_1.id AS projects_1_id, projects_1.name AS projects_1_name
FROM customers, projects, items JOIN projects AS projects_1 ON projects_1.id = items._project_id JOIN customers AS customers_1 ON customers_1.id = projects_1._customer_id
WHERE customers.name = %(name_1)s AND projects.name = %(name_2)s AND items.color = %(color_1)s
Parameters: {'state_1': u'blue', 'name_2': u'job1', 'name_1': u'Bob'}

In case it matters, the underlying database is PostgreSQL.
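The original intent of the query only needs Item.name. The longer I think about it, the less likely it seems that this optimization attempt would actually help, but I'd still like to know where the cross join comes from, in case it happens again somewhere that adding joinedload, load_only, etc. really would help.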
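This is because a joinedload is different from a join. The joinedload-ed entities are effectively anonymous: they are joined under aliases (projects_1 and customers_1 in the emitted SQL). The filters you applied afterwards refer to different instances of the same tables, so customers and projects end up in the query twice, once as the aliased eager-load joins and once as unjoined tables in FROM.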
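The effect on the results can be reproduced without SQLAlchemy. The following is a minimal sketch using Python's built-in sqlite3 module, not the asker's setup: the sample rows are invented for illustration, and plain DISTINCT stands in for PostgreSQL's DISTINCT ON, which SQLite does not support. It runs the two FROM/WHERE shapes from the emitted SQL side by side.

```python
import sqlite3

# Hypothetical sample data: Bob owns job1 (with "widget"),
# Alice owns job2 (with "gadget"). Both items are blue.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE projects (id INTEGER PRIMARY KEY, name TEXT, _customer_id INTEGER);
    CREATE TABLE items (id INTEGER PRIMARY KEY, name TEXT, color TEXT, _project_id INTEGER);
    INSERT INTO customers VALUES (1, 'Bob'), (2, 'Alice');
    INSERT INTO projects VALUES (1, 'job1', 1), (2, 'job2', 2);
    INSERT INTO items VALUES (1, 'widget', 'blue', 1), (2, 'gadget', 'blue', 2);
""")

params = {"cust": "Bob", "proj": "job1", "color": "blue"}

# Shape of the joinedload query: the eager joins use aliases, while the
# WHERE clause filters the unaliased customers/projects tables, which are
# cross joined with items instead of being tied to them.
cross_join_sql = """
    SELECT DISTINCT items.name
    FROM customers, projects, items
    JOIN projects AS projects_1 ON projects_1.id = items._project_id
    JOIN customers AS customers_1 ON customers_1.id = projects_1._customer_id
    WHERE customers.name = :cust AND projects.name = :proj AND items.color = :color
    ORDER BY items.name
"""

# Shape of the explicit join query: the filtered tables are the joined tables.
explicit_join_sql = """
    SELECT DISTINCT items.name
    FROM items
    JOIN projects ON projects.id = items._project_id
    JOIN customers ON customers.id = projects._customer_id
    WHERE customers.name = :cust AND projects.name = :proj AND items.color = :color
    ORDER BY items.name
"""

cross_join_rows = [row[0] for row in conn.execute(cross_join_sql, params)]
explicit_join_rows = [row[0] for row in conn.execute(explicit_join_sql, params)]

print(cross_join_rows)     # ['gadget', 'widget']: Alice's item leaks in
print(explicit_join_rows)  # ['widget']: only Bob's job1 item
```

In the cross join shape, the WHERE clause only checks that some customer named Bob and some project named job1 exist; it never constrains which project or customer each item belongs to, so every blue item is returned.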
What you should do is perform a join as before, but use contains_eager to make your join behave like a joinedload.
query = (session.query(Item)
         .join(Item.project)
         .join(Project.customer)
         .options(Load(Item).load_only('name', 'color'),
                  Load(Item).contains_eager("project").load_only('name'),
                  Load(Item).contains_eager("project").contains_eager("customer").load_only('name'))
         .filter(Customer.name == cust_name,
                 Project.name == proj_name)
         .distinct(Item.name))
This gives you the query
SELECT DISTINCT ON (items.name) customers.id AS customers_id, customers.name AS customers_name, projects.id AS projects_id, projects.name AS projects_name, items.id AS items_id, items.name AS items_name, items.color AS items_color
FROM items JOIN projects ON projects.id = items._project_id JOIN customers ON customers.id = projects._customer_id
WHERE customers.name = %(name_1)s AND projects.name = %(name_2)s AND items.color = %(color_1)s