Chr*_*ong 14 mysql join database-design
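We have a table that stores answers to questions. We need to be able to find users who gave particular answers to particular questions. So, if our table contains the following data: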
user_id  question_id  answer_value
Sally    1            Pooch
Sally    2            Peach
John     1            Pooch
John     2            Duke
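and we want to find users who answered 'Pooch' for question 1 and 'Peach' for question 2, the following SQL will (obviously) not work, because no single row can have question_id equal to both 1 and 2: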
select user_id
from answers
where question_id=1
and answer_value = 'Pooch'
and question_id=2
and answer_value='Peach'
My first thought was to self-join the table once for each answer we are looking for:
select a.user_id
from answers a, answers b
where a.user_id = b.user_id
and a.question_id=1
and a.answer_value = 'Pooch'
and b.question_id=2
and b.answer_value='Peach'
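This works, but since we allow an arbitrary number of search filters, we need to find something more efficient. My next solution was something like this: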
select user_id, count(question_id)
from answers
where (
(question_id=2 and answer_value = 'Peach')
or (question_id=1 and answer_value = 'Pooch')
)
group by user_id
having count(question_id)>1
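However, we want users to be able to take the same questionnaire twice, so they could have two answers to question 1 in the answers table.
So now I am at a loss. What is the best way to approach this? Thanks!
I found a clever way to do this query without a self-join.
I ran these commands in MySQL 5.5.8 for Windows and got the following results: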
use test
DROP TABLE IF EXISTS answers;
CREATE TABLE answers (user_id VARCHAR(10),question_id INT,answer_value VARCHAR(20));
INSERT INTO answers VALUES
('Sally',1,'Pouch'),
('Sally',2,'Peach'),
('John',1,'Pooch'),
('John',2,'Duke');
INSERT INTO answers VALUES
('Sally',1,'Pooch'),
('Sally',2,'Peach'),
('John',1,'Pooch'),
('John',2,'Duck');
SELECT user_id,question_id,GROUP_CONCAT(DISTINCT answer_value) given_answers
FROM answers GROUP BY user_id,question_id;
+---------+-------------+---------------+
| user_id | question_id | given_answers |
+---------+-------------+---------------+
| John    |           1 | Pooch         |
| John    |           2 | Duke,Duck     |
| Sally   |           1 | Pouch,Pooch   |
| Sally   |           2 | Peach         |
+---------+-------------+---------------+
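This output shows that John gave two different answers to question 2 and Sally gave two different answers to question 1.
To see which questions each user answered differently, put the query above in a subquery and count the commas in the list of given answers to get the number of distinct answers, like this: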
SELECT user_id,question_id,given_answers,
(LENGTH(given_answers) - LENGTH(REPLACE(given_answers,',','')))+1 multianswer_count
FROM (SELECT user_id,question_id,GROUP_CONCAT(DISTINCT answer_value) given_answers
FROM answers GROUP BY user_id,question_id) A;
I got this:
+---------+-------------+---------------+-------------------+
| user_id | question_id | given_answers | multianswer_count |
+---------+-------------+---------------+-------------------+
| John    |           1 | Pooch         |                 1 |
| John    |           2 | Duke,Duck     |                 2 |
| Sally   |           1 | Pouch,Pooch   |                 2 |
| Sally   |           2 | Peach         |                 1 |
+---------+-------------+---------------+-------------------+
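The comma-counting expression can be checked on a single literal. This is only a small worked example of the arithmetic, using one of the concatenated values from the output above; the column aliases are made up for illustration.

```sql
-- 'Duke,Duck' is 9 characters long; with the comma removed it is 8.
-- The difference is the number of commas, and commas + 1 is the number of items.
SELECT LENGTH('Duke,Duck')                           AS with_commas,
       LENGTH(REPLACE('Duke,Duck', ',', ''))         AS without_commas,
       LENGTH('Duke,Duck')
         - LENGTH(REPLACE('Duke,Duck', ',', '')) + 1 AS item_count;  -- 9 - 8 + 1 = 2
```

Note that this count is only reliable when no answer_value itself contains a comma.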
Now just use another subquery to filter out the rows where multianswer_count = 1, keeping only those with more than one distinct answer:
SELECT * FROM (SELECT user_id,question_id,given_answers,
(LENGTH(given_answers) - LENGTH(REPLACE(given_answers,',','')))+1 multianswer_count
FROM (SELECT user_id,question_id,GROUP_CONCAT(DISTINCT answer_value) given_answers
FROM answers GROUP BY user_id,question_id) A) AA WHERE multianswer_count > 1;
This is what I got:
+---------+-------------+---------------+-------------------+
| user_id | question_id | given_answers | multianswer_count |
+---------+-------------+---------------+-------------------+
| John    |           2 | Duke,Duck     |                 2 |
| Sally   |           1 | Pouch,Pooch   |                 2 |
+---------+-------------+---------------+-------------------+
Basically, I performed three table scans: one on the main table and two on the small subqueries. NO JOINS!!!
Give it a try!!!
I like the join method myself:
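As a possible simplification, the same rows can probably be produced without nesting subqueries or counting commas, by letting MySQL count the distinct answers directly. This is a sketch against the same answers table, not the method used above.

```sql
-- One pass with GROUP BY: COUNT(DISTINCT ...) replaces the comma arithmetic,
-- and HAVING replaces the outer "WHERE multianswer_count > 1" subquery.
SELECT user_id,
       question_id,
       GROUP_CONCAT(DISTINCT answer_value) AS given_answers,
       COUNT(DISTINCT answer_value)        AS multianswer_count
FROM answers
GROUP BY user_id, question_id
HAVING COUNT(DISTINCT answer_value) > 1;
```

Counting with COUNT(DISTINCT answer_value) also sidesteps two limits of the string-based count: answers that contain commas, and truncation of long lists by the group_concat_max_len setting.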
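-- Each join must also match on user_id; otherwise rows from different users are paired.
SELECT a.user_id FROM answers a
INNER JOIN answers a1 ON a1.user_id=a.user_id AND a1.question_id=1 AND a1.answer_value='Pooch'
INNER JOIN answers a2 ON a2.user_id=a.user_id AND a2.question_id=2 AND a2.answer_value='Peach'
GROUP BY a.user_id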
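Update
After testing with a larger table (about 1 million rows), this method took significantly longer than the simple OR method mentioned in the original question.
We were joining on user_id from the answers table in a chain of joins to reference data from other tables, but isolating the answers table SQL and writing it in such simple terms helped me spot the solution: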
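To compare the two approaches on a large table, one option is to look at their execution plans with EXPLAIN. The sketch below also adds a composite index; the index name and its column order are assumptions for illustration, not something tested in this thread, and whether the index helps depends on the data.

```sql
-- Hypothetical index: name and column order are assumptions.
CREATE INDEX idx_answers_question_answer_user
    ON answers (question_id, answer_value, user_id);

-- Plan for the self-join approach (two filters).
EXPLAIN
SELECT a1.user_id
FROM answers a1
INNER JOIN answers a2 ON a2.user_id = a1.user_id
WHERE a1.question_id = 1 AND a1.answer_value = 'Pooch'
  AND a2.question_id = 2 AND a2.answer_value = 'Peach';

-- Plan for the OR / GROUP BY approach.
EXPLAIN
SELECT user_id
FROM answers
WHERE (question_id = 1 AND answer_value = 'Pooch')
   OR (question_id = 2 AND answer_value = 'Peach')
GROUP BY user_id
HAVING COUNT(question_id) > 1;
```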
SELECT user_id, COUNT(question_id)
FROM answers
WHERE
(question_id = 2 AND answer_value = 'Peach')
OR (question_id = 1 AND answer_value = 'Pooch')
GROUP by user_id
HAVING COUNT(question_id) > 1
We were unnecessarily using a second subquery.
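One caveat from the question still applies: if a user can fill in the questionnaire twice, two matching rows for the same question would pass a plain COUNT(question_id) > 1 check. A minimal sketch of one way around this, assuming every filter targets a different question_id, is to count distinct questions and compare against the number of filters:

```sql
SELECT user_id
FROM answers
WHERE (question_id = 1 AND answer_value = 'Pooch')
   OR (question_id = 2 AND answer_value = 'Peach')
GROUP BY user_id
-- A repeated answer to the same question is counted once.
-- 2 is the number of (question_id, answer_value) filters in the WHERE clause.
HAVING COUNT(DISTINCT question_id) = 2;
```

With the sample rows from the question, only Sally satisfies both filters. This treats a user as a match if any of their submissions satisfies each filter; if answers must come from the same submission, the table would need a column identifying the submission, which the schema shown here does not have.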