I am experimenting with PostgreSQL lateral joins, specifically joining onto a subquery that uses GROUP BY / LIMIT.
The query works very well when I look up a single record, but performance degrades quickly once we query many records. That makes sense, since each row ends up running its own subquery that gathers, filters/aggregates and sorts. The question is: which Postgres strategies should we consider, or how can we restructure the query below so that it stays performant at scale?
We have three main tables, with a join table between two of them:
|Manager| >- |Store| >- |Store_Product| -< Product
We keep every historical manager for a given store record, and we keep the store's full product catalog (a product may be sold by more than one store).
Goal: given a store ID, fetch the most recent manager and the most recently sold product.
This is an inner join from Store to Manager and Product. Manager and Product each need to be ordered by date descending and limited to 1 (at least I believe that is how to get the most recent row).
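One rewrite worth benchmarking for this kind of latest-row-per-store lookup (a sketch only: the Managers table, its store_id and started_at columns, and a sold_at column on Stores_Products are all assumptions, since the full schema is not shown) swaps the per-row LATERAL ... ORDER BY ... LIMIT 1 subqueries for DISTINCT ON, which Postgres can answer with one sorted pass per table instead of one sort per store:

SELECT store.id AS store_id,
       m.manager_id,
       p.product_id
FROM Stores AS store
JOIN (
    -- latest manager per store, resolved in a single pass over Managers
    SELECT DISTINCT ON (store_id) store_id, id AS manager_id
    FROM Managers
    ORDER BY store_id, started_at DESC
) m ON m.store_id = store.id
JOIN (
    -- most recently sold product per store, via the join table
    SELECT DISTINCT ON (store_id) store_id, product_id
    FROM Stores_Products
    ORDER BY store_id, sold_at DESC
) p ON p.store_id = store.id;

Indexes such as Stores_Products (store_id, sold_at DESC) let the planner satisfy the DISTINCT ON branches without an explicit sort, which is usually where the gain shows up once many stores are queried at once.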
SELECT
store.id as store_id,
manager.id as manager_id,
*
FROM
Stores as store,
LATERAL (
SELECT
*
FROM
Products as product
INNER JOIN Stores_Products store_product on store_product.product_id = product.id
WHERE
store_product.store_id = store.id
ORDER BY
store.date desc
LIMIT 1
) p,
LATERAL (
SELECT
* …
I would like to improve the performance of a query that selects just a few columns from a table, and I am wondering whether limiting the number of columns makes any difference to query performance.
I have two sets of data coming from external sources - customers' purchase dates and the dates of each customer's last email click/open. They are stored in two tables, PURCHASE_INTER and ACTIVITY_INTER, respectively. There are multiple purchase rows per customer and I need to pick the latest purchase date, while the activity data is unique per customer. The two data sets are independent of each other, and either one may be missing. We wrote the query below: it combines the two tables, groups them by person_id (the customer's ID from the external source) and takes the latest dates, joins our customer table to get the customer email, and then joins one more table where this data is ultimately stored, so that we know whether the operation will be an insert or an update. Can you suggest how to improve the performance of this query? It is very slow, taking more than 10 hours. PURCHASE_INTER and ACTIVITY_INTER contain millions of records.
SELECT INTER.*, C.ID AS CUSTOMER_ID, C.EMAIL AS CUSTOMER_EMAIL, LSI.ID AS INTERACTION_ID, ROW_NUMBER() OVER (ORDER BY PERSON_ID ASC) AS RN FROM (
SELECT PERSON_ID AS PERSON_ID,
MAX(LAST_CLICK_DATE) AS LAST_CLICK_DATE,
MAX(LAST_OPEN_DATE) AS LAST_OPEN_DATE,
MAX(LAST_PURCHASE_DATE) AS LAST_PURCHASE_DATE
FROM (
SELECT ACT.PERSON_ID AS PERSON_ID,
ACT.LAST_CLICK_DATE AS LAST_CLICK_DATE,
ACT.LAST_OPEN_DATE AS LAST_OPEN_DATE,
NULL AS LAST_PURCHASE_DATE
FROM ACTIVITY_INTER ACT
WHERE ACT.JOB_ID = 77318317
UNION
SELECT PUR.PERSON_ID AS PERSON_ID,
NULL AS LAST_CLICK_DATE,
NULL AS LAST_OPEN_DATE,
PUR.LAST_PURCHASE_DATE AS LAST_PURCHASE_DATE
FROM PURCHASE_INTER PUR
WHERE PUR.JOB_ID = 77318317 …
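A rewrite that is often suggested for this shape of query (a sketch only: it keeps the table and column names from the fragment above but leaves out the final joins, which are cut off in the post) is to reduce each source to one row per person first and then combine the two small result sets, instead of pushing millions of combined rows through UNION's implicit de-duplication:

SELECT COALESCE(a.PERSON_ID, p.PERSON_ID) AS PERSON_ID,
       a.LAST_CLICK_DATE,
       a.LAST_OPEN_DATE,
       p.LAST_PURCHASE_DATE
FROM (
    -- activity is already unique per customer, so no aggregation is needed here
    SELECT PERSON_ID, LAST_CLICK_DATE, LAST_OPEN_DATE
    FROM ACTIVITY_INTER
    WHERE JOB_ID = 77318317
) a
FULL OUTER JOIN (
    -- collapse the purchase rows to one row per customer before joining
    SELECT PERSON_ID, MAX(LAST_PURCHASE_DATE) AS LAST_PURCHASE_DATE
    FROM PURCHASE_INTER
    WHERE JOB_ID = 77318317
    GROUP BY PERSON_ID
) p ON p.PERSON_ID = a.PERSON_ID;

The FULL OUTER JOIN preserves the "either data set may be missing" behaviour of the original UNION. An index on (JOB_ID, PERSON_ID) on each staging table is also worth checking, so the JOB_ID filter does not force a full scan of millions of rows.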
I wrote this query a few months ago. It worked fine, but day after day it keeps getting slower and slower.
Apart from hitting the same table repeatedly, this query also checks billing history across several tables.
Here is the query -
SELECT * FROM (
SELECT *,
(SELECT username FROM users WHERE id = u_bills.UserId) AS username,
(SELECT first_name FROM users WHERE id = u_bills.UserId) AS first_name,
(SELECT last_name FROM users WHERE id = u_bills.UserId) AS last_name,
(SELECT phone FROM users WHERE id = u_bills.UserId) AS phone,
(SELECT email FROM users WHERE id = u_bills.UserId) AS email,
(SELECT CPRate FROM cpt WHERE UserId = u_bills.UserId ORDER BY AddedDate DESC LIMIT 0,1) AS cprate,
(SELECT (SELECT PopName FROM …
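The gradual slowdown usually comes from the correlated scalar subqueries: the same users row is looked up five times for every bill, and the cpt lookup adds an ORDER BY ... LIMIT 1 per row. A sketch of the usual rewrite (assuming the billing table behind the u_bills alias; the truncated subqueries at the end would need the same treatment) fetches those columns with joins instead:

SELECT b.*,
       u.username, u.first_name, u.last_name, u.phone, u.email,
       c.CPRate AS cprate
FROM u_bills b
JOIN users u ON u.id = b.UserId
LEFT JOIN (
    -- latest AddedDate per user, computed once instead of once per bill row
    SELECT UserId, MAX(AddedDate) AS AddedDate
    FROM cpt
    GROUP BY UserId
) latest ON latest.UserId = b.UserId
LEFT JOIN cpt c ON c.UserId = latest.UserId
               AND c.AddedDate = latest.AddedDate;

An index on cpt (UserId, AddedDate) keeps the "latest" lookup cheap; users.id is presumably already the primary key.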
CREATE TABLE `products` (
`id` int(11) NOT NULL AUTO_INCREMENT,
`name` varchar(255) DEFAULT NULL,
`brand` int(11) DEFAULT NULL,
`category` int(11) DEFAULT NULL, -- assumed: referenced by fk_products_categoryId below, but missing from the posted definition
`shown` tinyint(4) DEFAULT '1',
PRIMARY KEY (`id`),
KEY `fk_products_brandId_idx` (`brand`),
KEY `pk_products_shown` (`shown`),
CONSTRAINT `fk_products_brandId` FOREIGN KEY (`brand`) REFERENCES `brands` (`id`) ON DELETE NO ACTION ON UPDATE NO ACTION,
CONSTRAINT `fk_products_categoryId` FOREIGN KEY (`category`) REFERENCES `categories` (`id`) ON DELETE SET NULL ON UPDATE CASCADE
) ENGINE=InnoDB AUTO_INCREMENT=8 DEFAULT CHARSET=utf8;
I have this "shown" field, which marks deleted records (when a user deletes a record nothing is actually removed; the field is just set to 0).
So all my queries look more or less like this:
"SELECT * FROM products WHERE brand = 1 AND shown = …Run Code Online (Sandbox Code Playgroud) 简单的SQL Server问题..哪个更快:
A simple SQL Server question.. which is faster:
.... why?
CREATE TABLE dbo.myTable
(
Id int CONSTRAINT PK_myTable_Id PRIMARY KEY,
Name varchar(200) NULL
)
GO
INSERT INTO dbo.myTable(Id) VALUES (1);
INSERT INTO dbo.myTable(Id, Name) VALUES (2, NULL);
GO
Please provide a reference or a benchmark (so that your answer is more than just an opinion).
Thanks.
PS: I could run two big loops and compare the total time, but that still wouldn't tell me why..
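For what it's worth, a rough version of that two-loop comparison could look like the following (a sketch only: the 100,000 iteration count is arbitrary, and a per-statement loop mostly measures statement overhead rather than storage cost, which is part of why the timing alone will not answer the "why"):

SET NOCOUNT ON;
DECLARE @i int, @t0 datetime2, @t1 datetime2;

TRUNCATE TABLE dbo.myTable;  -- start from an empty table so the PK values do not collide

-- variant 1: omit the nullable column entirely
SELECT @i = 1, @t0 = SYSDATETIME();
WHILE @i <= 100000
BEGIN
    INSERT INTO dbo.myTable (Id) VALUES (@i);
    SET @i += 1;
END
SET @t1 = SYSDATETIME();
PRINT CONCAT('Column omitted: ', DATEDIFF(ms, @t0, @t1), ' ms');

TRUNCATE TABLE dbo.myTable;

-- variant 2: insert an explicit NULL
SELECT @i = 1, @t0 = SYSDATETIME();
WHILE @i <= 100000
BEGIN
    INSERT INTO dbo.myTable (Id, Name) VALUES (@i, NULL);
    SET @i += 1;
END
SET @t1 = SYSDATETIME();
PRINT CONCAT('Explicit NULL : ', DATEDIFF(ms, @t0, @t1), ' ms');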
Update using bulk collect, or a regular merge?
I am trying to compare the performance of an update done with bulk collect against a plain merge. What I see is that the simple MERGE inside an anonymous block performs better, while the bulk collect version takes noticeably longer.
If a normal update (merge) is faster than bulk collect, why did Oracle introduce bulk collect at all? Where do we actually see its benefit?
declare
l_start integer;
l_end integer;
begin
l_start := dbms_utility.get_time;
merge into test111 t1
using test112 t2
on (t1.col1 = t2.col3)
when matched then update
set t1.col2 = t1.col2*5;
l_end := dbms_utility.get_time;
dbms_output.put_line(l_end - l_start);
end;
declare
type nt_test is table of test112.col3%TYPE;
nt_val nt_test := nt_test();
cursor c is select col3 from test112;
c_limit integer := 100;
l_start integer;
l_end integer;
begin
l_start := DBMS_UTILITY.get_time;
open c;
loop
fetch c
bulk collect into nt_val limit c_limit;
exit …
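For reference, the pattern where BULK COLLECT usually pays off pairs it with FORALL, so per-row PL/SQL logic can still be applied while the DML round-trips are batched (a sketch assuming the same test111/test112 tables and the col1 = col3 relationship used above; the posted block is cut off, so the loop body here is an illustration, not the original). When no row-by-row processing is needed, a single set-based MERGE is still expected to win, which matches the measurement described in the question:

declare
  type t_keys is table of test112.col3%TYPE;
  l_keys  t_keys;
  cursor c is select col3 from test112;
  c_limit constant pls_integer := 100;
  l_start integer;
begin
  l_start := dbms_utility.get_time;
  open c;
  loop
    fetch c bulk collect into l_keys limit c_limit;
    exit when l_keys.count = 0;

    -- one UPDATE per batch of 100 keys instead of one per fetched row
    forall i in 1 .. l_keys.count
      update test111
         set col2 = col2 * 5
       where col1 = l_keys(i);
  end loop;
  close c;
  dbms_output.put_line(dbms_utility.get_time - l_start);
end;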
I want to query a large data set in SQL - is this the right way to do it?
declare @datetime Datetime
set @datetime = GETDATE() -- assumed: the variable is meant to hold the current date
select *
from sales
where salesdate <= @datetime
Or:
select *
from sales
where salesdate < (select GetDate())
Or:
select *
from sales
where salesdate < GetDate()
Or by using NOW()