How to get data from three different tables

use*_*060 1 mysql

This question is a follow-up to: How to get data from two different tables?

Now I need to join the Sessions table with two tables:

  • Input
  • Downloads

When I run this query against real data:

SELECT sessions.ip
    , COUNT(sessions.id)
    , COUNT(input.input) as TotalInputs
    , COUNT(DISTINCT input.input) as UniqInputs
    , COUNT(downloads.shasum) as files
    , COUNT(DISTINCT downloads.shasum) as Uniqfiles
FROM sessions, input, downloads 
WHERE sessions.id = input.session 
    AND sessions.id = downloads.session
    AND date_format(sessions.starttime, '%Y-%m-%d') > "2015-01-01" 
GROUP BY sessions.ip 
ORDER BY COUNT(sessions.id) DESC LIMIT 5;

I get this output:

ip  | COUNT(sessions.id)    | TotalInputs   | UniqInputs    | files | Uniqfiles
IP1 |              11145    |       11145   | 15            | 11145 |         8
IP2 |               9125    |        9125   | 71            |     0 |         0
IP3 |               7882    |        7882   | 56            |  7882 |        19

However, the values of COUNT(sessions.id), TotalInputs, and files are not accurate. For example, if I run this query:

SELECT downloads.shasum 
FROM sessions, downloads 
WHERE sessions.id = downloads.session 
    AND date_format(sessions.starttime, '%Y-%m-%d') > "2015-01-01" 
    AND sessions.ip = "IP3";

I find that the correct files count for IP3 is 752 (not 7882). The real value of TotalInputs is also lower than COUNT(sessions.id).

How can I fix my query?


Sample data is available on SQL Fiddle.

Using the query above with the sample data below, I get this output:

ip  | COUNT(sessions.id)    | TotalInputs   | UniqInputs    | files | Uniqfiles
IP2 | 3                     | 3             | 2             | 3     | 1
IP3 | 8                     | 8             | 4             | 8     | 2

I need this output:

ip  | COUNT(sessions.id)    | TotalInputs   | UniqInputs    | files | Uniqfiles
IP2 | 1                     | 3             | 2             | 1     | 1
IP3 | 3                     | 5             | 4             | 3     | 2

How should I update my query?


Sample sessions data:

id      | starttime             | endtime               | sensor    | ip    | termsize  | client 
id1     | 2015-05-07 11:01:20   | 2015-05-07 18:01:32   | 10        | IP3   | 80x50     | 3
id2     | 2015-05-07 18:03:20   | 2015-03-07 18:11:32   | 2         | IP2   | 80x50     | 1
id3     | 2015-05-07 23:05:20   | 2015-06-07 18:10:32   | 10        | IP3   | 80x70     | 3
id4     | 2015-05-07 13:05:20   | 2015-05-09 20:05:32   | 7         | IP3   | 60x30     | 5

Sample input data:

id  | session   | timestamp             | realm | success   | input
1   | id1       | 2015-07-13 10:29:18   | NULL  | 1         | date
2   | id3       | 2015-08-13 10:11:18   | NULL  | 0         | aaa
3   | id1       | 2015-03-13 10:11:18   | NULL  | 0         | aaa
4   | id1       | 2015-07-14 10:33:15   | NULL  | 1         | uname
5   | id3       | 2015-05-19 20:33:11   | NULL  | 1         | netstat
6   | id2       | 2015-09-22 10:53:21   | NULL  | 1         | pwd
7   | id2       | 2015-09-22 10:58:11   | NULL  | 1         | pwd
8   | id2       | 2015-11-03 09:53:07   | NULL  | 0         | bbb

Sample downloads data:

id  | session   | timestamp             | url           | outfile   | shasum
1   | id1       | 2014-07-13 12:15:47   | http://xxx    | xxx       | SHA1
2   | id2       | 2014-09-13 12:18:50   | http://xxx2   | xxx2      | SHA2
3   | id1       | 2015-09-11 13:20:50   | http://xxx3   | xxx3      | SHA1
4   | id3       | 2016-01-19 18:21:30   | http://xxx4   | xxx4      | SHA3

Jul*_*eur 5

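This query:

  • uses ANSI JOIN syntax (LEFT, INNER, ...)
  • uses a LEFT JOIN for each table, so that sessions of an IP are still counted when they have no rows in Input or Downloads
  • uses DISTINCT in every COUNT to remove the duplicates introduced by the JOINs between the tables
  • counts ids for the totals and counts values for the unique counts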
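To see where the duplicates mentioned in the list come from, here is a minimal sketch that loads the question's sample rows into an in-memory SQLite database through Python's sqlite3 module. SQLite standing in for MySQL, and the reduced column set, are assumptions made only so the example runs on its own.

```python
import sqlite3

# Assumption: an in-memory SQLite database stands in for the MySQL schema.
# Only the columns used by the joins are kept; rows come from the sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sessions  (id TEXT PRIMARY KEY, ip TEXT);
CREATE TABLE input     (id INTEGER PRIMARY KEY, session TEXT, input TEXT);
CREATE TABLE downloads (id INTEGER PRIMARY KEY, session TEXT, shasum TEXT);
INSERT INTO sessions VALUES
  ('id1', 'IP3'), ('id2', 'IP2'), ('id3', 'IP3'), ('id4', 'IP3');
INSERT INTO input VALUES
  (1, 'id1', 'date'), (2, 'id3', 'aaa'), (3, 'id1', 'aaa'), (4, 'id1', 'uname'),
  (5, 'id3', 'netstat'), (6, 'id2', 'pwd'), (7, 'id2', 'pwd'), (8, 'id2', 'bbb');
INSERT INTO downloads VALUES
  (1, 'id1', 'SHA1'), (2, 'id2', 'SHA2'), (3, 'id1', 'SHA1'), (4, 'id3', 'SHA3');
""")

JOINS = """
FROM sessions s
LEFT JOIN input i     ON s.id = i.session
LEFT JOIN downloads d ON s.id = d.session
"""

# Number of joined rows per session: (inputs x downloads), or 1 when
# a session has no child rows at all (the LEFT JOIN keeps it with NULLs).
joined_rows = conn.execute(
    "SELECT s.id, COUNT(*) " + JOINS + " GROUP BY s.id ORDER BY s.id"
).fetchall()
print(joined_rows)  # [('id1', 6), ('id2', 3), ('id3', 2), ('id4', 1)]

# Plain COUNT sees every joined row; COUNT(DISTINCT id) sees each input once.
plain, distinct = conn.execute(
    "SELECT COUNT(i.id), COUNT(DISTINCT i.id) " + JOINS + " WHERE s.ip = 'IP3'"
).fetchone()
print(plain, distinct)  # 8 5
```

Session id1 has 3 inputs and 2 downloads, so the join produces 3 × 2 = 6 rows for it. That multiplication is what inflates a plain COUNT, and counting distinct primary keys undoes it.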

Query:

SELECT s.ip
    , COUNT(DISTINCT s.id)
    , COUNT(DISTINCT i.id) as TotalInputs
    , COUNT(DISTINCT i.input) as UniqInputs
    , COUNT(DISTINCT d.id) as files
    , COUNT(DISTINCT d.shasum) as Uniqfiles 
FROM sessions s
LEFT JOIN input i
    ON s.id = i.session
LEFT JOIN downloads d
    ON s.id = d.session 
GROUP BY s.ip;

SQL Fiddle

Output:

ip  | COUNT(DISTINCT s.id)  | TotalInputs   | UniqInputs    | files | Uniqfiles
IP2 |       1               |       3       |   2           | 1     |   1
IP3 |       3               |       5       |   4           | 3     |   2
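The query above leaves out the date filter, ORDER BY, and LIMIT from the question. The following is a sketch of how they might be put back, checked against the sample data in an in-memory SQLite database through Python's sqlite3 module. Several things here are assumptions and not part of the original answer: SQLite standing in for MySQL, the TotalSessions alias, and the rewrite of `date_format(starttime, '%Y-%m-%d') > "2015-01-01"` as a plain comparison `starttime >= '2015-01-02'`. That comparison should select the same rows for datetime values and avoids wrapping the column in a function.

```python
import sqlite3

# Assumption: SQLite stands in for MySQL; columns are trimmed to those used.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE sessions  (id TEXT PRIMARY KEY, starttime TEXT, ip TEXT);
CREATE TABLE input     (id INTEGER PRIMARY KEY, session TEXT, input TEXT);
CREATE TABLE downloads (id INTEGER PRIMARY KEY, session TEXT, shasum TEXT);
INSERT INTO sessions VALUES
  ('id1', '2015-05-07 11:01:20', 'IP3'),
  ('id2', '2015-05-07 18:03:20', 'IP2'),
  ('id3', '2015-05-07 23:05:20', 'IP3'),
  ('id4', '2015-05-07 13:05:20', 'IP3');
INSERT INTO input VALUES
  (1, 'id1', 'date'), (2, 'id3', 'aaa'), (3, 'id1', 'aaa'), (4, 'id1', 'uname'),
  (5, 'id3', 'netstat'), (6, 'id2', 'pwd'), (7, 'id2', 'pwd'), (8, 'id2', 'bbb');
INSERT INTO downloads VALUES
  (1, 'id1', 'SHA1'), (2, 'id2', 'SHA2'), (3, 'id1', 'SHA1'), (4, 'id3', 'SHA3');
""")

# The answer's query plus the question's filter, ordering, and limit.
# The filter is on sessions only, so it does not cancel the LEFT JOINs.
QUERY = """
SELECT s.ip
    , COUNT(DISTINCT s.id)     AS TotalSessions  -- hypothetical alias
    , COUNT(DISTINCT i.id)     AS TotalInputs
    , COUNT(DISTINCT i.input)  AS UniqInputs
    , COUNT(DISTINCT d.id)     AS files
    , COUNT(DISTINCT d.shasum) AS Uniqfiles
FROM sessions s
LEFT JOIN input i
    ON s.id = i.session
LEFT JOIN downloads d
    ON s.id = d.session
WHERE s.starttime >= '2015-01-02'
GROUP BY s.ip
ORDER BY TotalSessions DESC
LIMIT 5
"""

rows = conn.execute(QUERY).fetchall()
for row in rows:
    print(row)
# ('IP3', 3, 5, 4, 3, 2)
# ('IP2', 1, 3, 2, 1, 1)
```

The counts match the output requested in the question, with IP3 listed first because of the descending sort on the session count.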