Cra*_*erb 5 mysql database optimization
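OK, here is a query I am currently running against a table with 45,000 records, 65MB in size... and it is about to get much bigger (so I also have to think about future performance):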
SELECT count(payment_id) as signup_count, sum(amount) as signup_amount
FROM payments p
WHERE tm_completed BETWEEN '2009-05-01' AND '2009-05-30'
AND completed > 0
AND tm_completed IS NOT NULL
AND member_id NOT IN (SELECT p2.member_id FROM payments p2 WHERE p2.completed=1 AND p2.tm_completed < '2009-05-01' AND p2.tm_completed IS NOT NULL GROUP BY p2.member_id)
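As you can probably imagine, it brings the MySQL server to a standstill...
What it does: it pulls the number of new users who signed up, meaning members with at least one "completed" payment where tm_completed is not null (it is only populated for completed payments) and who, per the embedded select, never had a "completed" payment before that date. That makes them new members. The system also does rebills, so this is the only way to tell an existing member who was just rebilled apart from a new member paying for the first time.
Now, is there any way to optimize this query so it uses fewer resources and stops bringing my MySQL server to its knees...?
Am I missing any information that would clarify this further? Let me know...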
EDIT:
Here are the indexes already on that table:
Key name      Type     Cardinality  Column(s)
PRIMARY       PRIMARY  46757        payment_id
member_id     INDEX    23378        member_id
payer_id      INDEX    11689        payer_id
coupon_id     INDEX    1            coupon_id
tm_added      INDEX    46757        tm_added, product_id
tm_completed  INDEX    46757        tm_completed, product_id
These kinds of IN subqueries are somewhat slow in MySQL. I would rewrite it like this:
SELECT COUNT(1) AS signup_count, SUM(amount) AS signup_amount
FROM payments p
WHERE tm_completed BETWEEN '2009-05-01' AND '2009-05-30'
AND completed > 0
AND NOT EXISTS (
SELECT member_id
FROM payments
WHERE member_id = p.member_id
AND completed = 1
AND tm_completed < '2009-05-01');
There is no need to check tm_completed IS NOT NULL, because your BETWEEN condition already implies it.
Also make sure you have an index on:
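To see why the BETWEEN condition makes the explicit null check redundant, here is a minimal sketch you can run in any MySQL session. The column alias between_result is made up for illustration.

```sql
-- A NULL operand makes BETWEEN evaluate to NULL rather than TRUE,
-- and WHERE only keeps rows whose condition is TRUE.
SELECT NULL BETWEEN '2009-05-01' AND '2009-05-30' AS between_result;
-- between_result is NULL, so a row with a NULL tm_completed
-- can never pass the BETWEEN filter.
```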
(tm_completed, completed)
I had fun putting this solution together; it needs no subquery:
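As a rough sketch, the suggested index could be created as shown below. Both index names are made up. The second index is my own assumption rather than part of the answer: it may help the per-member lookup inside NOT EXISTS, but whether it pays off depends on your data. EXPLAIN is a way to check which indexes MySQL actually picks.

```sql
-- Index suggested in the answer; the name is hypothetical.
ALTER TABLE payments
  ADD INDEX idx_tm_completed_completed (tm_completed, completed);

-- Assumption (not from the answer): a composite index covering the
-- correlated lookup by member_id, completed and tm_completed.
ALTER TABLE payments
  ADD INDEX idx_member_completed_tm (member_id, completed, tm_completed);

-- Inspect the plan; compare the "key" and "rows" columns
-- before and after adding the indexes.
EXPLAIN
SELECT COUNT(1) AS signup_count, SUM(amount) AS signup_amount
FROM payments p
WHERE tm_completed BETWEEN '2009-05-01' AND '2009-05-30'
  AND completed > 0
  AND NOT EXISTS (
    SELECT member_id
    FROM payments
    WHERE member_id = p.member_id
      AND completed = 1
      AND tm_completed < '2009-05-01');
```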
SELECT count(p1.payment_id) as signup_count,
sum(p1.amount) as signup_amount
FROM payments p1
LEFT JOIN payments p2
ON p1.member_id = p2.member_id
AND p2.completed = 1
AND p2.tm_completed < date '2009-05-01'
WHERE p1.completed > 0
AND p1.tm_completed between date '2009-05-01' and date '2009-05-30'
AND p2.member_id IS NULL;
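To sanity-check the LEFT JOIN ... IS NULL anti-join on a small scale, here is a sketch using a made-up table, payments_demo, with invented rows. The real payments table has more columns, and this schema is only an assumption based on the columns the queries use. A regular table is used rather than a TEMPORARY one, because MySQL does not allow a temporary table to be referenced twice in the same query.

```sql
-- Hypothetical minimal schema with only the columns the query touches.
CREATE TABLE payments_demo (
  payment_id   INT PRIMARY KEY,
  member_id    INT NOT NULL,
  amount       DECIMAL(10,2) NOT NULL,
  completed    TINYINT NOT NULL,
  tm_completed DATETIME NULL
);

INSERT INTO payments_demo VALUES
  (1, 10, 20.00, 1, '2009-04-15 10:00:00'),  -- member 10 paid before May
  (2, 10, 20.00, 1, '2009-05-15 10:00:00'),  -- rebill of member 10: excluded
  (3, 11, 30.00, 1, '2009-05-10 09:00:00'),  -- new member: counted
  (4, 12, 25.00, 0, NULL),                   -- never completed: ignored
  (5, 13, 15.00, 1, '2009-05-20 12:00:00');  -- new member: counted

SELECT count(p1.payment_id) AS signup_count,
       sum(p1.amount)       AS signup_amount
FROM payments_demo p1
LEFT JOIN payments_demo p2
       ON p1.member_id = p2.member_id
      AND p2.completed = 1
      AND p2.tm_completed < date '2009-05-01'
WHERE p1.completed > 0
  AND p1.tm_completed BETWEEN date '2009-05-01' AND date '2009-05-30'
  AND p2.member_id IS NULL;
-- signup_count = 2, signup_amount = 45.00 (payments 3 and 5)

DROP TABLE payments_demo;
```

Payment 2 finds a matching earlier payment in p2, so p2.member_id is not null and the row is dropped. Payments 3 and 5 find no match, so they survive the IS NULL filter.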