For example, if I have data in a column like this:
data
I love book
I love apple
I love book
I hate apple
I hate apple
How can I get a result like this?
I = 5
love = 3
hate = 2
book = 2
apple = 3
Can this be done with MySQL?
Here is a solution that uses only a query:
-- Replace table_name and sentence with your own table and column names.
SELECT SUM(total_count) AS total, value
FROM (
    -- Count each raw word, then strip '?', '.' and '!' so that
    -- variants such as "hello!" and "hello" merge in the outer GROUP BY.
    SELECT COUNT(*) AS total_count,
           REPLACE(REPLACE(REPLACE(x.value, '?', ''), '.', ''), '!', '') AS value
    FROM (
        -- Extract the n-th space-separated word of each sentence.
        SELECT SUBSTRING_INDEX(SUBSTRING_INDEX(t.sentence, ' ', n.n), ' ', -1) AS value
        FROM table_name t
        CROSS JOIN (
            -- Numbers 1..100, so up to 100 words per sentence are handled.
            SELECT a.N + b.N * 10 + 1 AS n
            FROM (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) a,
                 (SELECT 0 AS N UNION ALL SELECT 1 UNION ALL SELECT 2 UNION ALL SELECT 3 UNION ALL SELECT 4 UNION ALL SELECT 5 UNION ALL SELECT 6 UNION ALL SELECT 7 UNION ALL SELECT 8 UNION ALL SELECT 9) b
            ORDER BY n
        ) n
        -- Word count of a sentence = number of spaces + 1.
        WHERE n.n <= 1 + (LENGTH(t.sentence) - LENGTH(REPLACE(t.sentence, ' ', '')))
        ORDER BY value
    ) AS x
    GROUP BY x.value
) AS y
GROUP BY value
Here is the complete working fiddle: http://sqlfiddle.com/#!2/17481a/1
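First, we write a query that extracts all the words, as explained here by @peterm (follow his instructions if you want to change the maximum number of words handled). We then turn it into a subquery and COUNT and GROUP BY the value of each word. Finally, we wrap another query around that and GROUP BY again, so that words left ungrouped because of attached punctuation are merged, i.e. hello = hello!, using REPLACE.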
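
To make the steps above easier to follow, here is a minimal Python sketch that mirrors the logic of the query; it is not the MySQL solution itself. The helper names `substring_index`, `nth_word` and `word_counts` are made up for this illustration, and `substring_index` is only a rough stand-in for MySQL's `SUBSTRING_INDEX`.

```python
from collections import Counter


def substring_index(text, delim, count):
    """Rough stand-in for MySQL's SUBSTRING_INDEX(text, delim, count)."""
    parts = text.split(delim)
    if count > 0:
        # Everything to the left of the count-th delimiter.
        return delim.join(parts[:count])
    if count < 0:
        # Everything to the right of the count-th delimiter from the end.
        return delim.join(parts[count:])
    return ""


def nth_word(sentence, n):
    # Same nesting as the query: keep the first n words, then take the last one.
    return substring_index(substring_index(sentence, " ", n), " ", -1)


def word_counts(sentences):
    counts = Counter()
    for sentence in sentences:
        # Mirrors the WHERE clause: number of words = number of spaces + 1.
        word_total = 1 + len(sentence) - len(sentence.replace(" ", ""))
        for n in range(1, word_total + 1):
            word = nth_word(sentence, n)
            # Mirrors the nested REPLACE calls that strip '?', '.' and '!'.
            for sign in "?.!":
                word = word.replace(sign, "")
            counts[word] += 1
    return counts


# Sample rows taken from the question.
rows = [
    "I love book",
    "I love apple",
    "I love book",
    "I hate apple",
    "I hate apple",
]

for word, total in word_counts(rows).items():
    print(f"{word} = {total}")
```

On the sample rows this gives the same totals as the expected result in the question (I = 5, love = 3, hate = 2, book = 2, apple = 3), though the order of the printed lines may differ.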