How do I select items from a table where a single column must contain two (or more) values?

use*_*182 6 mysql select

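I have a MySQL table that references different words and their positions in documents. I want to return the IDs of the documents that contain all of the words.

Here is a sample table.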

docid     wordid
1         4
2         4
1         2
1         5

OK, now suppose someone queries the database for the words with WORDID 4, 2 and 5.

My incorrect SQL SELECT statement looks something like this:

Select docid from table where wordid = 4 and wordid = 2 and wordid = 5

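This gives me 0 results.

Elsewhere I have seen people suggest using the where in clause.

If I understand correctly, this is another way of writing an OR clause. I tried this: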

select docid from table where wordid in (4,2,5)

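However, this gives me all of the results. It should exclude docid 2, because that document does not contain the other words. I expect to get only docid 1.

That said, I may be using the where in clause incorrectly, since I have very little database experience.

How do I return the docids that contain all of the words?

Also note that my where clause will be generated dynamically in a FOR loop. The query could be as simple as one or two words, or as long as 10 or 12 words. I am looking for a query structure that takes speed into account. Let me know if you need more information.

For reference, I am trying to convert this code to PHP/MySQL, but I do not understand the SQL statement there or what its MySQL equivalent would be: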

http://my.safaribooksonline.com/book/web-development/9780596529321/4dot-searching-and-ranking/querying

ype*_*eᵀᴹ 9

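This is a relational division problem. There is a question on SO with many ways to write this query, plus a performance analysis for PostgreSQL: How to filter SQL results in a has-many-through relation

Shamelessly copying the code from there, and removing or changing the answers that rely on features MySQL lacks (CTEs, EXCEPT, INTERSECT, etc.), here are some ways to do this.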

Assumptions:

  • The table is called factors
  • There is a UNIQUE constraint on (wordid, docid)
  • There is a documents table and a words table

Easy to write, medium efficiency:

-- Query 1 -- by Martin
SELECT d.docid, d.docname
FROM   documents d
JOIN   factors f USING (docid)
WHERE  f.wordid IN (2, 4, 5)
GROUP  BY d.docid
HAVING COUNT(*) = 3 ;           -- number of words
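
Since the question says the word list is built in a loop, here is a minimal sketch of generating Query 1 for any number of words. It is only an illustration under stated assumptions: Python's built-in sqlite3 module stands in for PHP/MySQL, the function name docs_with_all_words is made up for this example, and the join to documents is left out to keep it short. The same idea (one placeholder per word, and the word count bound into HAVING) should carry over to prepared statements in other drivers.

```python
import sqlite3

def docs_with_all_words(conn, word_ids):
    """Return the docids that contain every wordid in word_ids (Query 1 pattern)."""
    # Deduplicate: a repeated wordid would make the expected count too high.
    wanted = sorted(set(word_ids))
    if not wanted:
        return []
    # One "?" placeholder per word, so values are bound, never concatenated.
    placeholders = ", ".join("?" for _ in wanted)
    sql = (
        "SELECT docid FROM factors "
        f"WHERE wordid IN ({placeholders}) "
        "GROUP BY docid "
        "HAVING COUNT(*) = ? "   # must equal the number of distinct words
        "ORDER BY docid"
    )
    rows = conn.execute(sql, (*wanted, len(wanted))).fetchall()
    return [docid for (docid,) in rows]

# Sample data from the question (in-memory database, assumption for the demo).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE factors (docid INTEGER, wordid INTEGER, UNIQUE (wordid, docid))"
)
conn.executemany(
    "INSERT INTO factors VALUES (?, ?)", [(1, 4), (2, 4), (1, 2), (1, 5)]
)

print(docs_with_all_words(conn, [4, 2, 5]))  # [1]
print(docs_with_all_words(conn, [4]))        # [1, 2]
```

COUNT(*) is only a safe test here because of the UNIQUE constraint on (wordid, docid); without it, a duplicated row could let a document reach the count while still missing a word.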

Easy to write, medium efficiency:

-- Query 2 -- by Erwin
SELECT d.docid, d.docname
FROM   documents d
JOIN   (
   SELECT docid
   FROM   factors
   WHERE  wordid IN (2, 4, 5)
   GROUP  BY docid
   HAVING COUNT(*) = 3
   ) f USING (docid) ;

More complicated to write, very efficient in Postgres, but probably terrible in MySQL:

-- Query 4 -- by Derek
SELECT d.docid, d.docname
FROM   documents d
WHERE  d.docid IN (SELECT docid FROM factors WHERE wordid = 2)
AND    d.docid IN (SELECT docid FROM factors WHERE wordid = 4)
AND    d.docid IN (SELECT docid FROM factors WHERE wordid = 5);

More complicated to write, very efficient in Postgres, and probably in MySQL too:

-- Query 5 -- by Erwin
SELECT d.docid, d.docname
FROM   documents d
WHERE  EXISTS (SELECT * FROM factors 
               WHERE  docid = d.docid AND wordid = 2)
AND    EXISTS (SELECT * FROM factors 
               WHERE  docid = d.docid AND wordid = 4)
AND    EXISTS (SELECT * FROM factors 
               WHERE  docid = d.docid AND wordid = 5) ;
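
The one-subquery-per-word forms are the ones that actually need the loop mentioned in the question. As a rough sketch, Query 5 could be assembled like this. The assumptions are the same as before: Python's sqlite3 stands in for PHP/MySQL, build_exists_query is a hypothetical helper name, and the docname values are invented sample data.

```python
import sqlite3

def build_exists_query(word_ids):
    """Build Query 5 for any number of words: one EXISTS clause per wordid."""
    word_ids = list(word_ids)
    if not word_ids:
        raise ValueError("at least one wordid is required")
    clause = (
        "EXISTS (SELECT 1 FROM factors f "
        "WHERE f.docid = d.docid AND f.wordid = ?)"
    )
    # Repeat the clause once per word and AND the copies together.
    where = " AND ".join(clause for _ in word_ids)
    sql = (
        "SELECT d.docid, d.docname FROM documents d "
        f"WHERE {where} ORDER BY d.docid"
    )
    return sql, word_ids

# Sample data from the question; docname values are made up for the demo.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE documents (docid INTEGER PRIMARY KEY, docname TEXT)")
conn.execute(
    "CREATE TABLE factors (docid INTEGER, wordid INTEGER, UNIQUE (wordid, docid))"
)
conn.executemany(
    "INSERT INTO documents VALUES (?, ?)", [(1, "doc one"), (2, "doc two")]
)
conn.executemany(
    "INSERT INTO factors VALUES (?, ?)", [(1, 4), (2, 4), (1, 2), (1, 5)]
)

sql, params = build_exists_query([2, 4, 5])
rows = conn.execute(sql, params).fetchall()
print(rows)  # [(1, 'doc one')]
```

Unlike the COUNT(*) approach, this form does not depend on the UNIQUE constraint, because each EXISTS only asks whether at least one matching row is present.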

More complicated to write, very efficient in Postgres, and probably in MySQL too:

-- Query 6 -- by Sean
SELECT d.docid, d.docname
FROM   documents d
JOIN   factors x ON d.docid = x.docid
JOIN   factors y ON d.docid = y.docid
JOIN   factors z ON d.docid = z.docid
WHERE  x.wordid = 2
AND    y.wordid = 4
AND    z.wordid = 5 ;

Easy to write and to extend to an arbitrary set of words, but not as efficient as the JOIN and EXISTS solutions:

-- Query 7 -- by ypercube
SELECT d.docid, d.docname
FROM   documents d
WHERE  NOT EXISTS (
   SELECT *
   FROM   words AS w 
   WHERE  w.wordid IN (2, 4, 5)
   AND    NOT EXISTS (
      SELECT *
      FROM   factors AS f 
      WHERE  f.docid = d.docid 
      AND    f.wordid = w.wordid 
      )
   );

Easy to write, not very efficient:

-- Query 8 -- by ypercube
SELECT d.docid, d.docname
FROM   documents d
WHERE  NOT EXISTS (
   SELECT *
   FROM  (
      SELECT 2 AS wordid UNION ALL
      SELECT 4 UNION ALL
      SELECT 5
      ) AS w
   WHERE NOT EXISTS (
      SELECT *
      FROM   factors AS f 
      WHERE  f.docid = d.docid 
      AND    f.wordid = w.wordid 
      )
   );

Have fun testing them :)
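
For a quick correctness check before measuring speed, something like the sketch below runs Queries 1, 6 and 7 against the sample table from the question. It uses an in-memory SQLite database through Python's sqlite3 module purely as a convenient stand-in, and the docname values are invented. It says nothing about MySQL performance; for that, the queries need to be timed on the real data.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE documents (docid INTEGER PRIMARY KEY, docname TEXT);
    CREATE TABLE words     (wordid INTEGER PRIMARY KEY);
    CREATE TABLE factors   (docid INTEGER, wordid INTEGER,
                            UNIQUE (wordid, docid));
    INSERT INTO documents VALUES (1, 'doc one'), (2, 'doc two');
    INSERT INTO words     VALUES (2), (4), (5);
    INSERT INTO factors   VALUES (1, 4), (2, 4), (1, 2), (1, 5);
""")

queries = {
    "Query 1 (GROUP BY / HAVING)": """
        SELECT d.docid, d.docname
        FROM   documents d
        JOIN   factors f USING (docid)
        WHERE  f.wordid IN (2, 4, 5)
        GROUP  BY d.docid
        HAVING COUNT(*) = 3
    """,
    "Query 6 (one join per word)": """
        SELECT d.docid, d.docname
        FROM   documents d
        JOIN   factors x ON d.docid = x.docid
        JOIN   factors y ON d.docid = y.docid
        JOIN   factors z ON d.docid = z.docid
        WHERE  x.wordid = 2 AND y.wordid = 4 AND z.wordid = 5
    """,
    "Query 7 (double NOT EXISTS)": """
        SELECT d.docid, d.docname
        FROM   documents d
        WHERE  NOT EXISTS (
           SELECT * FROM words AS w
           WHERE  w.wordid IN (2, 4, 5)
           AND    NOT EXISTS (
              SELECT * FROM factors AS f
              WHERE  f.docid = d.docid AND f.wordid = w.wordid
           )
        )
    """,
}

# Collect only the docids so the result sets are easy to compare.
results = {
    name: [row[0] for row in conn.execute(sql)]
    for name, sql in queries.items()
}
for name, docids in results.items():
    print(name, docids)
```

Each query should report only docid 1, which is the document that contains words 2, 4 and 5.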