Nic*_*yer 8 mysql sql orm sqlalchemy
I have a query generated by SQLAlchemy ORM. It is supposed to retrieve stream_items for a specific course, along with all of their parts - resources, content text blocks, etc., and the users who posted them. However, this query appears to be extremely slow, taking minutes on our production database with 20,000 or so users in the database, 25 or so stream_items for the course, and a couple content text blocks per stream_item. Note that there are very few of any other records besides users in the database because we imported a bunch of users but very little content.
Edit: Note that every object id is a foreign key into the franklin_object table.
I've tried looking at the query, and have identified several troubling bits (looking at the EXPLAIN output)
However, I really don't know what to do about them, especially the last two issues.
Here is the query:
SELECT stream_item.id AS stream_item_id,
franklin_object.id AS franklin_object_id,
franklin_object.type AS franklin_object_type,
franklin_object.uuid AS franklin_object_uuid,
stream_item.parent_id AS stream_item_parent_id,
stream_item.shown_at AS stream_item_shown_at,
stream_item.author_id AS stream_item_author_id,
stream_item.stream_sort_at AS stream_item_stream_sort_at,
stream_item.PUBLIC AS stream_item_public,
stream_item.created_at AS stream_item_created_at,
stream_item.updated_at AS stream_item_updated_at,
anon_1.content_text_block_text AS anon_1_content_text_block_text,
anon_2.resource_id AS anon_2_resource_id,
anon_2.franklin_object_id AS anon_2_franklin_object_id,
anon_2.franklin_object_type AS anon_2_franklin_object_type,
anon_2.franklin_object_uuid AS anon_2_franklin_object_uuid,
anon_2.resource_top_parent_resource AS anon_2_resource_top_parent_resource,
anon_2.resource_top_parent_id AS anon_2_resource_top_parent_id,
anon_2.resource_title AS anon_2_resource_title,
anon_2.resource_url AS anon_2_resource_url,
anon_2.resource_image AS anon_2_resource_image,
anon_2.resource_created_at AS anon_2_resource_created_at,
anon_2.resource_updated_at AS anon_2_resource_updated_at,
franklin_object_1.id AS franklin_object_1_id,
franklin_object_1.type AS franklin_object_1_type,
franklin_object_1.uuid AS franklin_object_1_uuid,
anon_1.content_text_block_id AS anon_1_content_text_block_id,
anon_1.franklin_object_id AS anon_1_franklin_object_id,
anon_1.franklin_object_type AS anon_1_franklin_object_type,
anon_1.franklin_object_uuid AS anon_1_franklin_object_uuid,
anon_1.content_text_block_position AS anon_1_content_text_block_position,
anon_1.content_text_block_franklin_object_id AS anon_1_content_text_block_franklin_object_id,
anon_1.content_text_block_created_at AS anon_1_content_text_block_created_at,
anon_1.content_text_block_updated_at AS anon_1_content_text_block_updated_at,
anon_3.user_password AS anon_3_user_password,
anon_3.user_auth_token AS anon_3_user_auth_token,
anon_3.user_id AS anon_3_user_id,
anon_3.franklin_object_id AS anon_3_franklin_object_id,
anon_3.franklin_object_type AS anon_3_franklin_object_type,
anon_3.franklin_object_uuid AS anon_3_franklin_object_uuid,
anon_3.user_email AS anon_3_user_email,
anon_3.user_auth_token_expiration AS anon_3_user_auth_token_expiration,
anon_3.user_active AS anon_3_user_active,
anon_3.user_activation_token AS anon_3_user_activation_token,
anon_3.user_first_name AS anon_3_user_first_name,
anon_3.user_last_name AS anon_3_user_last_name,
anon_3.user_image AS anon_3_user_image,
anon_3.user_bio AS anon_3_user_bio,
anon_3.user_aspirations AS anon_3_user_aspirations,
anon_3.user_website AS anon_3_user_website,
anon_3.user_resume AS anon_3_user_resume,
anon_3.user_resume_name AS anon_3_user_resume_name,
anon_3.user_primary_role AS anon_3_user_primary_role,
anon_3.user_institution_id AS anon_3_user_institution_id,
anon_3.user_birth_date AS anon_3_user_birth_date,
anon_3.user_gender AS anon_3_user_gender,
anon_3.user_graduation_year AS anon_3_user_graduation_year,
anon_3.user_complete AS anon_3_user_complete,
anon_3.user_masthead_y_position AS anon_3_user_masthead_y_position,
anon_3.user_masthead AS anon_3_user_masthead,
anon_3.user_fb_access_token AS anon_3_user_fb_access_token,
anon_3.user_fb_user_id AS anon_3_user_fb_user_id,
anon_3.user_location AS anon_3_user_location,
anon_3.user_created_at AS anon_3_user_created_at,
anon_3.user_updated_at AS anon_3_user_updated_at,
anon_4.content_text_block_text AS anon_4_content_text_block_text,
anon_4.content_text_block_id AS anon_4_content_text_block_id,
anon_4.franklin_object_id AS anon_4_franklin_object_id,
anon_4.franklin_object_type AS anon_4_franklin_object_type,
anon_4.franklin_object_uuid AS anon_4_franklin_object_uuid,
anon_4.content_text_block_position AS anon_4_content_text_block_position,
anon_4.content_text_block_franklin_object_id AS anon_4_content_text_block_franklin_object_id,
anon_4.content_text_block_created_at AS anon_4_content_text_block_created_at,
anon_4.content_text_block_updated_at AS anon_4_content_text_block_updated_at,
anon_5.user_password AS anon_5_user_password,
anon_5.user_auth_token AS anon_5_user_auth_token,
anon_5.user_id AS anon_5_user_id,
anon_5.franklin_object_id AS anon_5_franklin_object_id,
anon_5.franklin_object_type AS anon_5_franklin_object_type,
anon_5.franklin_object_uuid AS anon_5_franklin_object_uuid,
anon_5.user_email AS anon_5_user_email,
anon_5.user_auth_token_expiration AS anon_5_user_auth_token_expiration,
anon_5.user_active AS anon_5_user_active,
anon_5.user_activation_token AS anon_5_user_activation_token,
anon_5.user_first_name AS anon_5_user_first_name,
anon_5.user_last_name AS anon_5_user_last_name,
anon_5.user_image AS anon_5_user_image,
anon_5.user_bio AS anon_5_user_bio,
anon_5.user_aspirations AS anon_5_user_aspirations,
anon_5.user_website AS anon_5_user_website,
anon_5.user_resume AS anon_5_user_resume,
anon_5.user_resume_name AS anon_5_user_resume_name,
anon_5.user_primary_role AS anon_5_user_primary_role,
anon_5.user_institution_id AS anon_5_user_institution_id,
anon_5.user_birth_date AS anon_5_user_birth_date,
anon_5.user_gender AS anon_5_user_gender,
anon_5.user_graduation_year AS anon_5_user_graduation_year,
anon_5.user_complete AS anon_5_user_complete,
anon_5.user_masthead_y_position AS anon_5_user_masthead_y_position,
anon_5.user_masthead AS anon_5_user_masthead,
anon_5.user_fb_access_token AS anon_5_user_fb_access_token,
anon_5.user_fb_user_id AS anon_5_user_fb_user_id,
anon_5.user_location AS anon_5_user_location,
anon_5.user_created_at AS anon_5_user_created_at,
anon_5.user_updated_at AS anon_5_user_updated_at,
anon_6.stream_item_id AS anon_6_stream_item_id,
anon_6.franklin_object_id AS anon_6_franklin_object_id,
anon_6.franklin_object_type AS anon_6_franklin_object_type,
anon_6.franklin_object_uuid AS anon_6_franklin_object_uuid,
anon_6.stream_item_parent_id AS anon_6_stream_item_parent_id,
anon_6.stream_item_shown_at AS anon_6_stream_item_shown_at,
anon_6.stream_item_author_id AS anon_6_stream_item_author_id,
anon_6.stream_item_stream_sort_at AS anon_6_stream_item_stream_sort_at,
anon_6.stream_item_public AS anon_6_stream_item_public,
anon_6.stream_item_created_at AS anon_6_stream_item_created_at,
anon_6.stream_item_updated_at AS anon_6_stream_item_updated_at
FROM franklin_object
INNER JOIN stream_item
ON franklin_object.id = stream_item.id
INNER JOIN (SELECT franklin_object.id AS franklin_object_id,
franklin_object.type AS franklin_object_type,
franklin_object.uuid AS franklin_object_uuid,
content_text_block.id AS content_text_block_id,
content_text_block.text AS content_text_block_text,
content_text_block.position AS content_text_block_position,
content_text_block.franklin_object_id AS content_text_block_franklin_object_id,
content_text_block.created_at AS content_text_block_created_at,
content_text_block.updated_at AS content_text_block_updated_at
FROM franklin_object
INNER JOIN content_text_block
ON franklin_object.id = content_text_block.id) AS anon_1
ON stream_item.id = anon_1.content_text_block_franklin_object_id
LEFT OUTER JOIN contents_resources AS contents_resources_1
ON anon_1.content_text_block_id = contents_resources_1.content_id
LEFT OUTER JOIN (SELECT franklin_object.id AS franklin_object_id,
franklin_object.type AS franklin_object_type,
franklin_object.uuid AS franklin_object_uuid,
resource.id AS resource_id,
resource.top_parent_resource AS resource_top_parent_resource,
resource.top_parent_id AS resource_top_parent_id,
resource.title AS resource_title,
resource.url AS resource_url,
resource.image AS resource_image,
resource.created_at AS resource_created_at,
resource.updated_at AS resource_updated_at
FROM franklin_object
INNER JOIN resource
ON franklin_object.id = resource.id) AS anon_2
ON anon_2.resource_id = contents_resources_1.resource_id
LEFT OUTER JOIN contents_franklin_objects AS contents_franklin_objects_1
ON anon_1.content_text_block_id = contents_franklin_objects_1.content_id
LEFT OUTER JOIN franklin_object AS franklin_object_1
ON franklin_object_1.id = contents_franklin_objects_1.franklin_object_id
LEFT OUTER JOIN likers AS likers_1
ON stream_item.id = likers_1.post_id
LEFT OUTER JOIN (SELECT franklin_object.id AS franklin_object_id,
franklin_object.type AS franklin_object_type,
franklin_object.uuid AS franklin_object_uuid,
USER.id AS user_id,
USER.email AS user_email,
USER.password AS user_password,
USER.auth_token AS user_auth_token,
USER.auth_token_expiration AS user_auth_token_expiration,
USER.active AS user_active,
USER.activation_token AS user_activation_token,
USER.first_name AS user_first_name,
USER.last_name AS user_last_name,
USER.image AS user_image,
USER.bio AS user_bio,
USER.aspirations AS user_aspirations,
USER.website AS user_website,
USER.resume AS user_resume,
USER.resume_name AS user_resume_name,
USER.primary_role AS user_primary_role,
USER.institution_id AS user_institution_id,
USER.birth_date AS user_birth_date,
USER.gender AS user_gender,
USER.graduation_year AS user_graduation_year,
USER.complete AS user_complete,
USER.masthead_y_position AS user_masthead_y_position,
USER.masthead AS user_masthead,
USER.fb_access_token AS user_fb_access_token,
USER.fb_user_id AS user_fb_user_id,
USER.location AS user_location,
USER.created_at AS user_created_at,
USER.updated_at AS user_updated_at
FROM franklin_object
INNER JOIN USER
ON franklin_object.id = USER.id) AS anon_3
ON anon_3.user_id = likers_1.user_id
LEFT OUTER JOIN contents_franklin_objects AS contents_franklin_objects_2
ON franklin_object.id = contents_franklin_objects_2.franklin_object_id
LEFT OUTER JOIN (SELECT franklin_object.id AS franklin_object_id,
franklin_object.type AS franklin_object_type,
franklin_object.uuid AS franklin_object_uuid,
content_text_block.id AS content_text_block_id,
content_text_block.text AS content_text_block_text,
content_text_block.position AS content_text_block_position,
content_text_block.franklin_object_id AS content_text_block_franklin_object_id,
content_text_block.created_at AS content_text_block_created_at,
content_text_block.updated_at AS content_text_block_updated_at
FROM franklin_object
INNER JOIN content_text_block
ON franklin_object.id = content_text_block.id) AS anon_4
ON anon_4.content_text_block_id = contents_franklin_objects_2.content_id
LEFT OUTER JOIN (SELECT franklin_object.id AS franklin_object_id,
franklin_object.type AS franklin_object_type,
franklin_object.uuid AS franklin_object_uuid,
stream_item.id AS stream_item_id,
stream_item.parent_id AS stream_item_parent_id,
stream_item.shown_at AS stream_item_shown_at,
stream_item.author_id AS stream_item_author_id,
stream_item.stream_sort_at AS stream_item_stream_sort_at,
stream_item.PUBLIC AS stream_item_public,
stream_item.created_at AS stream_item_created_at,
stream_item.updated_at AS stream_item_updated_at
FROM franklin_object
INNER JOIN stream_item
ON franklin_object.id = stream_item.id) AS anon_6
ON anon_6.stream_item_parent_id = franklin_object.id
LEFT OUTER JOIN likers AS likers_2
ON anon_6.stream_item_id = likers_2.post_id
LEFT OUTER JOIN (SELECT franklin_object.id AS franklin_object_id,
franklin_object.type AS franklin_object_type,
franklin_object.uuid AS franklin_object_uuid,
USER.id AS user_id,
USER.email AS user_email,
USER.password AS user_password,
USER.auth_token AS user_auth_token,
USER.auth_token_expiration AS user_auth_token_expiration,
USER.active AS user_active,
USER.activation_token AS user_activation_token,
USER.first_name AS user_first_name,
USER.last_name AS user_last_name,
USER.image AS user_image,
USER.bio AS user_bio,
USER.aspirations AS user_aspirations,
USER.website AS user_website,
USER.resume AS user_resume,
USER.resume_name AS user_resume_name,
USER.primary_role AS user_primary_role,
USER.institution_id AS user_institution_id,
USER.birth_date AS user_birth_date,
USER.gender AS user_gender,
USER.graduation_year AS user_graduation_year,
USER.complete AS user_complete,
USER.masthead_y_position AS user_masthead_y_position,
USER.masthead AS user_masthead,
USER.fb_access_token AS user_fb_access_token,
USER.fb_user_id AS user_fb_user_id,
USER.location AS user_location,
USER.created_at AS user_created_at,
USER.updated_at AS user_updated_at
FROM franklin_object
INNER JOIN USER
ON franklin_object.id = USER.id) AS anon_5
ON anon_5.user_id = likers_2.user_id
WHERE stream_item.parent_id = 11
ORDER BY stream_item.stream_sort_at DESC,
anon_1.content_text_block_position,
anon_6.stream_item_stream_sort_at DESC
And the EXPLAIN output:
ID SELECT_TYPE TABLE TYPE POSSIBLE_KEYS KEY KEY_LEN REF ROWS EXTRA
1 PRIMARY <derived2> ALL NULL NULL NULL NULL 599 Using temporary; Using filesort
1 PRIMARY stream_item eq_ref PRIMARY,parent_id PRIMARY 4 anon_1.content_text_block_franklin_object_id 1 Using where
1 PRIMARY contents_resources_1 ref content_id content_id 5 anon_1.content_text_block_id 2
1 PRIMARY <derived3> ALL NULL NULL NULL NULL 7
1 PRIMARY contents_franklin_objects_1 ref content_id content_id 5 anon_1.content_text_block_id 1
1 PRIMARY franklin_object eq_ref PRIMARY PRIMARY 4 franklin.stream_item.id 1 Using where
1 PRIMARY franklin_object_1 eq_ref PRIMARY PRIMARY 4 franklin.contents_franklin_objects_1.franklin_object_id 1
1 PRIMARY likers_1 ref post_id post_id 5 franklin.stream_item.id 1
1 PRIMARY <derived4> ALL NULL NULL NULL NULL 136
1 PRIMARY contents_franklin_objects_2 ref franklin_object_id franklin_object_id 5 franklin.stream_item.id 1
1 PRIMARY <derived5> ALL NULL NULL NULL NULL 599
1 PRIMARY <derived6> ALL NULL NULL NULL NULL 608
1 PRIMARY likers_2 ref post_id post_id 5 anon_6.stream_item_id 1
1 PRIMARY <derived7> ALL NULL NULL NULL NULL 136
7 DERIVED user ALL PRIMARY NULL NULL NULL 133
7 DERIVED franklin_object eq_ref PRIMARY PRIMARY 4 franklin.user.id 1
6 DERIVED stream_item ALL PRIMARY NULL NULL NULL 709
6 DERIVED franklin_object eq_ref PRIMARY PRIMARY 4 franklin.stream_item.id 1
5 DERIVED content_text_block ALL PRIMARY NULL NULL NULL 666
5 DERIVED franklin_object eq_ref PRIMARY PRIMARY 4 franklin.content_text_block.id 1
4 DERIVED user ALL PRIMARY NULL NULL NULL 133
4 DERIVED franklin_object eq_ref PRIMARY PRIMARY 4 franklin.user.id 1
3 DERIVED resource ALL PRIMARY NULL NULL NULL 7
3 DERIVED franklin_object eq_ref PRIMARY PRIMARY 4 franklin.resource.id 1
2 DERIVED content_text_block ALL PRIMARY NULL NULL NULL 666
2 DERIVED franklin_object eq_ref PRIMARY PRIMARY 4 franklin.content_text_block.id 1
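How can I reduce all of these queries to something faster? Are there other ways to speed this up?
Is the franklin_object setup an anti-pattern? The way it works is that the franklin_object table has two columns: id and type. Each type then has its own table, whose primary key is a foreign key into franklin_object.
The code that generates the SQL looks like this: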
stream_item_query = StreamItem.query.options(db.joinedload('stream_items'),db.joinedload('contents_included_in'),db.joinedload('contents.resources'),db.joinedload('contents.objects'),db.subqueryload('likers'))
stream_items = stream_item_query.filter(StreamItem.parent_id == community_id).order_by(db.desc(StreamItem.stream_sort_at)).all()
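Wow, this one hurt my brain a bit. Trying to work out what the query is doing, what all the tables are, and how they relate is tedious. If you have had a similar experience, that is your first hint that you may be trying to do too much in this single query.
My suggestion is to rethink your whole approach.
SQLAlchemy is a very good tool, and I'm not knocking it (or your choice of MySQL), but as with most ORM tools, you need to consider the cost of using it. One example is this franklin_object table business. Is it an anti-pattern? Yes and no. From a pure OO perspective it makes sense: you can look up an id in this table to determine which table to query. From a relational query perspective, it serves little purpose. I could remove every instance of franklin_object from the query and lose only the columns that come from franklin_object. If that is a viable option, I would do it immediately.
Let's look further at these franklin_object joins. Looking at the subqueries, they all have the same form: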
(SELECT franklin_object.id AS franklin_object_id,
franklin_object.type AS franklin_object_type,
franklin_object.uuid AS franklin_object_uuid,
linked_table.id AS linked_table_id,
linked_table.col2 AS col2 --and more
FROM franklin_object
INNER JOIN linked_table
ON franklin_object.id = linked_table.id) AS anon_n
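Regardless of statistics, the database doesn't have much information to go on when optimizing this part of the query. Perhaps it would be better if the franklin_object query were restricted by specifying type in the where clause. Perhaps.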
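To make the point about the franklin_object subqueries concrete, here is a minimal sketch comparing the derived-table form with a direct join. It uses SQLite from the Python standard library as a stand-in for MySQL, and the tiny schema and sample rows are assumptions for illustration only, not the real tables. The two forms return the same rows here; the direct join only gives up the franklin_object type and uuid columns.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE franklin_object (id INTEGER PRIMARY KEY, type TEXT, uuid TEXT);
CREATE TABLE "user" (id INTEGER PRIMARY KEY REFERENCES franklin_object(id), email TEXT);
CREATE TABLE likers (post_id INTEGER, user_id INTEGER);
INSERT INTO franklin_object VALUES (1, 'user', 'u-1'), (2, 'user', 'u-2'), (11, 'stream_item', 's-11');
INSERT INTO "user" VALUES (1, 'a@example.com'), (2, 'b@example.com');
INSERT INTO likers VALUES (11, 1);
""")

# Shape of the ORM-generated SQL: join against a derived table that
# itself joins franklin_object to the concrete type table.
derived_sql = """
SELECT likers.post_id, anon.user_email
FROM likers
LEFT OUTER JOIN (
    SELECT franklin_object.id AS franklin_object_id,
           u.id AS user_id,
           u.email AS user_email
    FROM franklin_object
    INNER JOIN "user" AS u ON franklin_object.id = u.id
) AS anon ON anon.user_id = likers.user_id
WHERE likers.post_id = ?
"""

# Same question asked directly of the user table, without franklin_object.
direct_sql = """
SELECT likers.post_id, u.email
FROM likers
LEFT OUTER JOIN "user" AS u ON u.id = likers.user_id
WHERE likers.post_id = ?
"""

rows_derived = conn.execute(derived_sql, (11,)).fetchall()
rows_direct = conn.execute(direct_sql, (11,)).fetchall()
print(rows_derived == rows_direct)
```

Whether the direct form is actually faster on your MySQL version is something to confirm with EXPLAIN; the sketch only shows that the result is equivalent when the franklin_object columns are not needed.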
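This is especially problematic for the USER table, since this table has a lot of records (so you say). Since you are querying most of the columns, and the optimizer can't work out exactly how many rows will be retrieved, it makes sense for it to do a full table scan. In your case, twice.
Another aspect is the sheer number of joins involved. Even if we take out all the franklin_object references, there are still 11 joins. That wouldn't be so bad if your data model were more relational, but it isn't. The generated query doesn't give the database much help in figuring out the best way to execute it, so the database doesn't do a good job. Perhaps you can mitigate this with hints and the like, but I bet that will come back to bite you in the long run.
You are using an ORM tool, so really use it. You gain nothing by doing such a big query all at once. Break it up a bit for the sake of performance. Perform lazy retrieval to avoid large, complex queries. I'd say try doing everything lazily, just to see how it goes. Performance will probably be OK, I'd say better. Not great, maybe not even acceptable, but better than being able to go get a coffee while the database churns.
Then, start piecing things together into leaner chunks. Tie together the objects that logically belong together, such as resource and contents_resources. As another example, the join between stream_item, likers and user is repeated. Do that query once, and let SQLAlchemy do its thing.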
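As a rough illustration of breaking the work into leaner chunks, the sketch below loads the stream items with one small query and then fetches the likers for all of them in a second query, grouping them in Python. It is hand-rolled with the standard-library sqlite3 module rather than SQLAlchemy, and the cut-down schema, the sample rows, and the `load_stream` helper are all hypothetical. SQLAlchemy's `subqueryload`, which the question's code already uses for likers, takes a broadly similar load-it-in-a-second-query approach.

```python
import sqlite3
from collections import defaultdict

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stream_item (id INTEGER PRIMARY KEY, parent_id INTEGER, stream_sort_at TEXT);
CREATE TABLE "user" (id INTEGER PRIMARY KEY, first_name TEXT);
CREATE TABLE likers (post_id INTEGER, user_id INTEGER);
INSERT INTO stream_item VALUES (20, 11, '2013-01-02'), (21, 11, '2013-01-03'), (22, 99, '2013-01-04');
INSERT INTO "user" VALUES (1, 'Ann'), (2, 'Bob');
INSERT INTO likers VALUES (20, 1), (20, 2), (21, 2);
""")


def load_stream(conn, community_id):
    """Hypothetical helper: two small queries instead of one large join."""
    # Query 1: only the stream items for this community, newest first.
    items = conn.execute(
        "SELECT id, stream_sort_at FROM stream_item "
        "WHERE parent_id = ? ORDER BY stream_sort_at DESC",
        (community_id,),
    ).fetchall()
    if not items:
        return []
    ids = [item_id for item_id, _ in items]

    # Query 2: likers for all of those items in a single round trip.
    placeholders = ", ".join("?" for _ in ids)
    liker_rows = conn.execute(
        "SELECT likers.post_id, u.first_name FROM likers "
        'JOIN "user" AS u ON u.id = likers.user_id '
        f"WHERE likers.post_id IN ({placeholders}) "
        "ORDER BY likers.post_id, u.id",
        ids,
    ).fetchall()

    # Group the likers by the stream item they belong to.
    likers_by_post = defaultdict(list)
    for post_id, first_name in liker_rows:
        likers_by_post[post_id].append(first_name)

    return [
        {"id": item_id, "sort_at": sort_at, "likers": likers_by_post.get(item_id, [])}
        for item_id, sort_at in items
    ]


stream = load_stream(conn, 11)
for entry in stream:
    print(entry["id"], entry["likers"])
```

Each query touches only the rows it needs, so the user table is joined against a handful of liker rows instead of being scanned in full inside a derived table.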
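As a last resort, you could implement some kind of caching mechanism. Perhaps denormalize the tables somewhere. On a slowly changing, read-heavy system, you could feed these tables into another structure where the queries are straightforward and fast. That is, do the processing up front and store the result in a single table.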
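A minimal sketch of that denormalization idea follows, again using the standard-library sqlite3 module as a stand-in for MySQL. The `stream_item_summary` table, its columns, and the `rebuild_summary` helper are hypothetical names invented for this example, and the source tables are cut down to a few columns. The processing happens up front in a rebuild step, so reading the feed becomes a single-table lookup.

```python
import sqlite3

# In-memory SQLite database standing in for MySQL (assumption for illustration).
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE stream_item (id INTEGER PRIMARY KEY, parent_id INTEGER, stream_sort_at TEXT);
CREATE TABLE content_text_block (id INTEGER PRIMARY KEY, franklin_object_id INTEGER,
                                 position INTEGER, text TEXT);
CREATE TABLE likers (post_id INTEGER, user_id INTEGER);
INSERT INTO stream_item VALUES (20, 11, '2013-01-02'), (21, 11, '2013-01-03');
INSERT INTO content_text_block VALUES (100, 20, 2, 'world'), (101, 20, 1, 'hello'), (102, 21, 1, 'solo');
INSERT INTO likers VALUES (20, 1), (20, 2);

-- Hypothetical read-optimized table: one row per stream item.
CREATE TABLE stream_item_summary (
    stream_item_id INTEGER PRIMARY KEY,
    parent_id INTEGER,
    stream_sort_at TEXT,
    body TEXT,
    like_count INTEGER
);
CREATE INDEX ix_summary_parent ON stream_item_summary (parent_id, stream_sort_at);
""")


def rebuild_summary(conn):
    """Hypothetical offline job: recompute the summary table from scratch."""
    rows = []
    items = conn.execute("SELECT id, parent_id, stream_sort_at FROM stream_item").fetchall()
    for item_id, parent_id, sort_at in items:
        # Collect this item's text blocks in display order.
        texts = [
            text
            for (text,) in conn.execute(
                "SELECT text FROM content_text_block "
                "WHERE franklin_object_id = ? ORDER BY position",
                (item_id,),
            )
        ]
        (like_count,) = conn.execute(
            "SELECT COUNT(*) FROM likers WHERE post_id = ?", (item_id,)
        ).fetchone()
        rows.append((item_id, parent_id, sort_at, "\n".join(texts), like_count))
    # Replace the old contents in one transaction.
    with conn:
        conn.execute("DELETE FROM stream_item_summary")
        conn.executemany("INSERT INTO stream_item_summary VALUES (?, ?, ?, ?, ?)", rows)


rebuild_summary(conn)

# The page request now reads a single indexed table.
feed = conn.execute(
    "SELECT stream_item_id, body, like_count FROM stream_item_summary "
    "WHERE parent_id = ? ORDER BY stream_sort_at DESC",
    (11,),
).fetchall()
print(feed)
```

The trade-off is staleness: the summary is only as fresh as the last rebuild, which is why this fits slowly changing, read-heavy data.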
Good luck.