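My goal is to set up a yearly cronjob that deletes certain data from a database based on its age. I have the power of Bash and MySQL at my disposal. I started out writing a bash script, but then it struck me that I might be able to do everything with a single SQL query.
I'm more of a programmer by nature and don't have much experience with data structures, which is why I'd like some help.
The relevant tables and columns for this query are as follows:
Registration: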
+-----+-------------------+
| Id | Registration_date |
+-----+-------------------+
| 2 | 2011-10-03 |
| 3 | 2011-10-06 |
| 4 | 2011-10-07 |
| 5 | 2011-10-07 |
| 6 | 2011-10-10 |
| 7 | 2011-10-13 |
| 8 | 2011-10-14 |
| 9 | 2011-10-14 |
| 10 | 2011-10-17 |
+-----+-------------------+
AssociatedClient:
+-----------+-----------------+
| Client_id | Registration_id |
+-----------+-----------------+
| 2 | 2 |
| 3 | 2 |
| 3 | 4 |
| 4 | 5 |
| 3 | 6 |
| 5 | 6 |
| 3 | 8 |
| 8 | 9 |
| 7 | 10 |
+-----------+-----------------+
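Client: only the ID is relevant here.
As you can see, this is a simple many-to-many relationship. A client can have multiple registrations to his name, and a registration can have multiple clients.
I need to delete all registrations and client data for clients who have had no new registration in the last 5 years. Sounds simple, right?
However, the data should be kept if any other client on any registration of a given client has a new registration within the last 5 years.
So imagine client A has 4 registrations with only himself on them, and 1 registration with himself and client B. All 5 registrations are older than 5 years. If client B has had no new registration within 5 years, everything should be deleted: client A's registrations and his record. If B did have a new registration within 5 years, all of client A's data should be kept, including his own old registrations.
Building my query, I got this far: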
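To experiment with the queries in this thread, the two tables shown above can be recreated with a short script. This is only a sketch: the question gives just the column names and sample values, so the column types and keys below are assumptions.

```sql
-- Assumed schema: only the column names come from the question.
CREATE TABLE Registration (
    Id                INT  PRIMARY KEY,
    Registration_date DATE NOT NULL
);

CREATE TABLE AssociatedClient (
    Client_id       INT NOT NULL,
    Registration_id INT NOT NULL,
    PRIMARY KEY (Client_id, Registration_id)
);

-- Sample rows copied from the tables in the question.
INSERT INTO Registration (Id, Registration_date) VALUES
    (2, '2011-10-03'), (3, '2011-10-06'), (4, '2011-10-07'),
    (5, '2011-10-07'), (6, '2011-10-10'), (7, '2011-10-13'),
    (8, '2011-10-14'), (9, '2011-10-14'), (10, '2011-10-17');

INSERT INTO AssociatedClient (Client_id, Registration_id) VALUES
    (2, 2), (3, 2), (3, 4), (4, 5), (3, 6),
    (5, 6), (3, 8), (8, 9), (7, 10);
```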
DELETE * FROM `Registration` AS Reg
WHERE TIMESTAMPDIFF(YEAR, Reg.`Registration_date`, NOW()) >= 5
AND
(COUNT(`Id`) FROM `Registration` AS Reg2
WHERE Reg2.`Id` IN (SELECT `Registration_id` FROM `AssociatedClient` AS Clients
WHERE Clients.`Client_id` IN (SELECT `Client_id` FROM `AssociatedClient` AS Clients2
WHERE Clients2.`Registration_id` IN -- stuck
#I need all the registrations from the clients associated with the first
# (outer) registration here, that are newer than 5 years.
) = 0 -- No newer registrations from any associated clients
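Please understand that my experience with SQL is very limited. I realize that even what I have so far could be heavily optimized (using joins, etc.) and may not even be correct.
The reason I got stuck is that the solution I had in mind would work if I could use some kind of loop, and I only just realized that this is not something you can easily do in this kind of SQL query.
Many thanks.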
Jam*_*den 19
Begin by identifying the registrations of the other clients who share a registration with each client. Here's a view:
create view groups as
select a.Client_id
, c.Registration_id
from AssociatedClient as a
join AssociatedClient as b on a.Registration_id = b.Registration_id
join AssociatedClient as c on b.Client_id = c.Client_id;
This gives us:
select Client_id
, min(Registration_id) as first
, max(Registration_id) as last
, count(distinct Registration_id) as regs
, count(*) as pals
from groups
group by Client_id;
Client_id first last regs pals
---------- ---------- ---------- ---------- ----------
2 2 8 4 5
3 2 8 4 18
4 5 5 1 1
5 2 8 4 5
7 10 10 1 1
8 9 9 1 1
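Of course, you don't need a view; it's just for convenience. You could use a virtual table instead. But inspect it carefully to convince yourself that it produces the right range of "pal registrations" for each client. Note that the view does not reference Registration. That matters because it produces the same results even after we use it to delete from Registration, so we can use it for the second delete statement too.
Now we have a list of clients and their "pal registrations". What is the date of each pal's last registration?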
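For readers who prefer not to create a view, a minimal sketch of the same self-join as a derived table (a subquery in the FROM clause) could look like the following. The alias `g` is just an illustrative name; the join itself is copied from the view definition.

```sql
-- Same three-way self-join as the view, inlined as a derived table.
SELECT g.Client_id,
       COUNT(DISTINCT g.Registration_id) AS regs
FROM (
    SELECT a.Client_id, c.Registration_id
    FROM AssociatedClient AS a
    JOIN AssociatedClient AS b ON a.Registration_id = b.Registration_id
    JOIN AssociatedClient AS c ON b.Client_id = c.Client_id
) AS g
GROUP BY g.Client_id;
```

On the sample data this should reproduce the `regs` column of the summary shown earlier, which is a quick way to check that the inlined version and the view agree.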
select g.Client_id, max(Registration_date) as last_reg
from groups as g join Registration as r
on g.Registration_id = r.Id
group by g.Client_id;
g.Client_id last_reg
----------- ----------
2 2011-10-14
3 2011-10-14
4 2011-10-07
5 2011-10-14
7 2011-10-17
8 2011-10-14
Which of them have a latest date before a certain cutoff?
select g.Client_id, max(Registration_date) as last_reg
from groups as g join Registration as r
on g.Registration_id = r.Id
group by g.Client_id
having max(Registration_date) < '2011-10-08';
g.Client_id last_reg
----------- ----------
4 2011-10-07
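The literal date above is only there to make the small sample data produce a result. For the yearly cron job described in the question, the cutoff would presumably be computed at run time. A sketch in MySQL syntax, assuming the view defined earlier exists (the name is backtick-quoted here because, as far as I know, GROUPS is a reserved word in MySQL 8.0 and later):

```sql
-- MySQL-specific: compute the 5-year cutoff instead of hard-coding a date.
SELECT g.Client_id,
       MAX(r.Registration_date) AS last_reg
FROM `groups` AS g
JOIN Registration AS r ON g.Registration_id = r.Id
GROUP BY g.Client_id
HAVING MAX(r.Registration_date) < CURRENT_DATE - INTERVAL 5 YEAR;
```

Comparing against a computed date, rather than wrapping the column in `TIMESTAMPDIFF` as in the question's attempt, keeps the condition easy to read and easy to test with a fixed date.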
IIUC, that means client #4 should be deleted, along with anything he registered for. The registrations would be:
select * from Registration
where Id in (
select Registration_id from groups as g
where Client_id in (
select g.Client_id
from groups as g join Registration as r
on g.Registration_id = r.Id
group by g.Client_id
having max(Registration_date) < '2011-10-08'
)
);
Id Registration_date
---------- -----------------
5 2011-10-07
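And, sure enough, client #4 is in registration #5, and is the only client subject to deletion by this test.
From there you can work out the `delete` statements. I think the rule is "delete the client and anything he registered for". If so, I'd probably write the registration IDs to a temporary table, and write the deletes for both Registration and AssociatedClient by joining to it.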
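One way the temporary-table approach might look is sketched below. The table name `doomed_registration` is made up for illustration, the cutoff is the sample date used above, and `IN` subqueries stand in for explicit joins so the statements stay portable. Treat it as a starting point to test on a copy of the data, not as a finished script.

```sql
-- Hypothetical temp table holding the registration IDs selected above.
CREATE TEMPORARY TABLE doomed_registration AS
SELECT DISTINCT Registration_id
FROM `groups`
WHERE Client_id IN (
    SELECT g.Client_id
    FROM `groups` AS g
    JOIN Registration AS r ON g.Registration_id = r.Id
    GROUP BY g.Client_id
    HAVING MAX(r.Registration_date) < '2011-10-08'
);

-- Remove the link rows first, in case a foreign key points at Registration.
DELETE FROM AssociatedClient
WHERE Registration_id IN (SELECT Registration_id FROM doomed_registration);

DELETE FROM Registration
WHERE Id IN (SELECT Registration_id FROM doomed_registration);

DROP TEMPORARY TABLE doomed_registration;
```

Because the IDs are materialized before anything is deleted, the two deletes do not depend on each other's results. On the sample data the temporary table holds only registration #5, so one row is removed from each table.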