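Tag: query-performance

Poor performance when using "NOT IN"

In my application, I have a query that performs a search in the "files" table.

The files table is partitioned by f.created (see the table definition) and holds about 100 million rows for client 19 (f.cid = 19).

I am using this query, taken from the answer to my previous question about slow ordering in SQL Server: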

WITH PartitionNumbers AS
(
    -- Each partition of the table
    SELECT P.partition_number
    FROM sys.partitions AS P
    WHERE P.[object_id] = OBJECT_ID(N'dbo.files', N'U')
    AND P.index_id = 1
)
SELECT
    FF.id,
    FF.[name],
    FF.[year],
    FF.cid,
    FF.created,
    vnVE0.keywordValueCol0_numeric
FROM PartitionNumbers AS PN
CROSS APPLY
(
    SELECT
        F100.*
    FROM 
    (
        -- 50 rows in order for year 2013
        SELECT
            F.id,
            F.[name],
            F.[year],
            F.cid,
            F.created
        FROM dbo.files …

performance sql-server t-sql query-performance

RuS*_*SSe

2020 06-15

6 votes · 2 answers · 535 views

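Unusual column comparison and query performance

We have some consultants working on extending our in-house data warehouse. While doing a code review, I came across this pattern in all of the load procedures: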

    MERGE [EDHub].[Customer].[Class] AS TARGET
    USING (
        SELECT <columns>
        FROM [dbo].[vw_CustomerClass]
            WHERE JHAPostingDate = @PostingDate   
        ) AS SOURCE
        ON  TARGET.BankId = SOURCE.BankId       -- This join is on the business keys
            AND TARGET.Code = SOURCE.Code
    WHEN NOT MATCHED BY TARGET  
        THEN
            <INSERT Statement>
    WHEN MATCHED
        AND TARGET.IsLatest = 1
        AND EXISTS (
            SELECT SOURCE.[HASH]   
            EXCEPT          
            SELECT TARGET.[Hash]
            )
        THEN 
            <UPDATE Statement>

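The gist is: if we have a new business key, insert; but if the business key exists and the hash of the attributes does not match our current row, update the old row and insert a new one (later in the code). Everything works, but I paused when I saw this code: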

AND EXISTS (
            SELECT SOURCE.[HASH]   
            EXCEPT          
            SELECT TARGET.[Hash]
            )

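Compared with SOURCE.[HASH] <> TARGET.[Hash], it seems overly complicated. EXCEPT does compare NULLs accurately, but in our case the hash is never NULL (or we have bigger problems). I want our code to be easy to read so that it does not confuse whoever has to maintain it. I asked our consultants about it, and they speculated that it might be faster because it is a set operation, so I decided to write a simple test (test code below).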
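As a side illustration of the NULL-handling difference mentioned above (separate from the asker's own performance test), here is a minimal sketch. It assumes SQLite, via Python's standard sqlite3 module, as a stand-in for SQL Server; the pairs table and its column names are hypothetical. Both engines treat two NULLs as equal inside EXCEPT, while <> evaluates to unknown when either side is NULL.

```python
import sqlite3

# SQLite is used only as a runnable stand-in; the table and column
# names below are hypothetical and not taken from the question.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (label TEXT, source_hash TEXT, target_hash TEXT)")
conn.executemany(
    "INSERT INTO pairs VALUES (?, ?, ?)",
    [
        ("same", "abc", "abc"),
        ("different", "abc", "xyz"),
        ("null_vs_value", None, "xyz"),
        ("both_null", None, None),
    ],
)


def changed_labels(predicate):
    """Return the labels of rows that the given predicate flags as changed."""
    sql = f"SELECT label FROM pairs WHERE {predicate} ORDER BY label"
    return [row[0] for row in conn.execute(sql)]


# Plain inequality: any comparison involving NULL is unknown, so the row is skipped.
by_inequality = changed_labels("source_hash <> target_hash")

# EXCEPT-based test: NULL vs. value counts as a change, NULL vs. NULL does not.
by_except = changed_labels("EXISTS (SELECT source_hash EXCEPT SELECT target_hash)")

print("<>     :", by_inequality)
print("EXCEPT :", by_except)
```

The two predicates only disagree on rows where one side is NULL, so with hashes that are never NULL they should flag the same rows.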

The first thing I noticed is …

performance sql-server sql-server-2016 except query-performance

Bob*_*bst

2020 01-08

6 votes · 1 answer · 415 views

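Why does this query take so long to execute?

I am currently using Microsoft SQL Azure (RTM) - 12.0.2000.8 as my database. The database currently has 10 DTUs.

The idea here is that I want to run queries against a view. The view consists of a simple SELECT statement with joins across 5 tables. It returns about 253K rows in about 6 seconds.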

CREATE VIEW [dbo].[TopAdsDisplaySumaryView]
AS
SELECT  
        client.Id AS ClientId,      -- (PK, int, not null)
        client.PartnerId,           -- (FK, int, not null)
        adsPict.Id AS AdsPictureId, -- (PK, int, not null)
        adsPict.ImageName,          -- (nvarchar(max), null)
        displayAds.DisplayTo,       -- (datetime, not null)
        displayAds.DisplayFrom      -- (datetime, not null)
FROM      
        dbo.Machines AS machine 
        INNER JOIN dbo.MachineGroups AS machineGroups ON machineGroups.Id = machine.MachineGroupId 
        INNER JOIN dbo.Clients AS client ON …

performance sql-server azure-sql-database query-performance

Rey*_*ldi

2020 01-08

6 votes · 1 answer · 418 views

Optimizing a bitmap heap scan

I am trying to understand why my query takes so long even though I have indexed the required columns:

SELECT entity_id,
       id,
       report_date
FROM own_inst_detail
WHERE ( own_inst_detail.id = 'P7M7WC-S' )
  AND ( own_inst_detail.report_date >= '2017-02-01T17:29:49.661Z' )
  AND ( own_inst_detail.report_date <= '2018-08-01T17:29:49.663Z' )

The EXPLAIN ANALYZE output, with the data cached, is as follows:

Bitmap Heap Scan on own_inst_detail (cost=20.18..2353.55 rows=597 width=22) (actual time=1.471..6.955 rows=4227 loops=1)
  Recheck Cond: ((id = 'P7M7WC-S'::bpchar) AND (report_date >= '2017-06-01'::date) AND (report_date <= '2018-08-01'::date))
  Heap Blocks: exact=4182
  ->  Bitmap Index Scan on own_inst_detail  (cost=0.00..20.03 rows=597 width=0) (actual time=0.901..0.901 rows=4227 loops=1)
        Index Cond: ((id = 'P7M7WC-S'::bpchar) AND (report_date >= '2017-06-01'::date) AND …

postgresql performance optimization query-performance postgresql-performance

ins*_*ide

2020 01-08

6 votes · 1 answer · 9102 views

PostgreSQL query is very slow on my large table

My database version is PostgreSQL 9.5.

create table if not exists request_log
(
    id               bigserial not null constraint app_requests_pkey  primary key,
    request_date     timestamp not null,
    ip               varchar(50),
    start_time       timestamp,
    application_name varchar(200),
    request_path     text,
    display_url      text,
    username         varchar(50)
);

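I have a table that holds information about incoming HTTP requests. The id column is the primary key and is indexed. The table has no relationships.

The table contains 72320081 rows. When I run a count query to get the table's row count, select count(id) from request_log;, the query takes 3-5 minutes.

The result of explain(analyze, buffers, format text) for that query is: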

Aggregate  (cost=3447214.71..3447214.72 rows=1 width=0) (actual time=135575.947..135575.947 rows=1 loops=1)
  Buffers: shared hit=96 read=2551303
  ->  Seq Scan on request_log  (cost=0.00..3268051.57 rows=71665257 width=0) (actual time=2.517..129032.408 rows=72320081 loops=1)
        Buffers: shared …

postgresql performance vmware postgresql-9.5 query-performance

bar*_*oma

2020 01-08

6 votes · 1 answer · 20k views

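Is there a better way to handle a multi-level ParentId table structure?

I work for a publisher, and our products are mainly books and journals. Their most common structures are as follows: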

Book > Chapter

Book Series > Book > Chapter

Book > Volume > Chapter

Book Series > Book > Volume > Chapter

Journal > Volume > Issue > Article

Journal > Volume > Article

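We currently store all of these records in the same table, with Id and ParentId columns. For example, a book with TitleId = 1 that has 3 chapters would have the following rows: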

Book: Id = 1, ParentId = 1
Chapter #1: Id = 2, ParentId = 1
Chapter #2: Id = 3, ParentId = 1
Chapter #3: Id = 4, ParentId = 1

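All of these records, whether books, chapters, journals, articles, and so on, can join on their Id to other tables to get information such as author, price, and ownership.

The problem this structure causes us is that the nesting adds a lot of overhead in some cases. For example, if someone tries to access a journal article they purchased, we need to run multiple queries to find out whether they actually have access. We have a table containing owned products …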
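One possible way to collapse the multiple lookups described above into a single query is a recursive common table expression that walks from the requested record up through its ancestors. The sketch below is only an illustration under stated assumptions: it runs on SQLite through Python's sqlite3 module rather than SQL Server, the titles and owned_products tables and their columns are hypothetical, and it assumes that owning any ancestor grants access to everything beneath it. It follows the sample rows in the question, where a top-level record has ParentId equal to its own Id.

```python
import sqlite3

# Hypothetical schema and data; SQLite stands in for SQL Server here.
conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE titles (Id INTEGER PRIMARY KEY, ParentId INTEGER NOT NULL, Kind TEXT);
INSERT INTO titles VALUES
    (10, 10, 'Journal'),   -- top-level rows point at themselves
    (11, 10, 'Volume'),
    (12, 11, 'Issue'),
    (13, 12, 'Article'),
    (20, 20, 'Book'),
    (21, 20, 'Chapter');
CREATE TABLE owned_products (UserId INTEGER, TitleId INTEGER);
INSERT INTO owned_products VALUES (1, 11);  -- user 1 bought the volume
""")

ACCESS_SQL = """
WITH RECURSIVE lineage(Id, ParentId) AS (
    -- Anchor: the record being requested.
    SELECT Id, ParentId FROM titles WHERE Id = :title_id
    UNION ALL
    -- Step: move to the parent; stop at rows that are their own parent.
    SELECT p.Id, p.ParentId
    FROM titles AS p
    JOIN lineage AS c ON p.Id = c.ParentId AND p.Id <> c.Id
)
SELECT EXISTS (
    SELECT 1
    FROM lineage
    JOIN owned_products AS o ON o.TitleId = lineage.Id
    WHERE o.UserId = :user_id
)
"""


def has_access(user_id, title_id):
    """True if the user owns the record itself or any of its ancestors."""
    row = conn.execute(ACCESS_SQL, {"user_id": user_id, "title_id": title_id}).fetchone()
    return bool(row[0])


print(has_access(1, 13))  # article under the owned volume
print(has_access(1, 21))  # chapter of a book the user does not own
```

T-SQL writes recursive CTEs with plain WITH (no RECURSIVE keyword), so the query text would need adapting before running it on SQL Server.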

performance database-design sql-server query-performance

Str*_*ped

lucky-day

6 votes · 2 answers · 488 views

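Is it possible to use the "read uncommitted" isolation level on a read-only availability group secondary?

We use availability groups on SQL Server 2019 Enterprise Edition. We use the Enterprise feature that allows AG secondary nodes to be read-only, and we run reporting queries against the secondary by connecting to the listener with the parameter ApplicationIntent=ReadOnly.

For locking and performance reasons, we have some queries that run on the primary database under the Read Uncommitted isolation level.

It seems that on the secondary, all isolation levels are converted to RCSI regardless of the locking/isolation level specified, possibly because it is essential that no locks can block AG synchronization.

Is it possible to run queries on the secondary database as "read uncommitted", which would presumably also ensure that no locks are taken but could perform better in some cases, or do queries on a read-only secondary always have to use RCSI?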

availability-groups snapshot-isolation sql-server-2019 query-performance

Mar*_*ark

lucky-day

6 votes · 1 answer · 1082 views

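Does pgAdmin add overhead to query times?

I just wrote a very long question about how to optimize a fairly simple query that was taking much longer than I would like. I had been running the queries through pgAdmin. Then I made the query simpler and simpler until I ended up querying the primary key of a newly created table with only 1 row.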

create table perf_test (id bigint primary key);

Then the query:

select
  count(t.id)
from
  perf_test t
where
  t.id = 1
;

The message output is:

Successfully run. Total query runtime: 66 msec.
1 rows affected.

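I need to optimize a query that normally takes about 30-40 ms when issued from my application. If even the simplest query in pgAdmin already takes much longer than that, how can I experiment and measure performance?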
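One way to take the GUI out of the measurement is to time the query from a small script and look at the spread over many runs rather than a single reading. The sketch below is an assumption-laden illustration, not a recommendation from the question: it uses Python's sqlite3 module as a stand-in so that it runs without a server, and the helper name time_query is hypothetical. Against PostgreSQL a database driver connection would take its place, and the exact call pattern may differ.

```python
import sqlite3
import statistics
import time


def time_query(conn, sql, params=(), runs=50):
    """Return per-run wall-clock durations in milliseconds, measured client-side."""
    durations = []
    for _ in range(runs):
        start = time.perf_counter()
        conn.execute(sql, params).fetchall()  # fetch so the full round trip is timed
        durations.append((time.perf_counter() - start) * 1000.0)
    return durations


# Stand-in database mirroring the one-row table from the question.
conn = sqlite3.connect(":memory:")
conn.execute("create table perf_test (id bigint primary key)")
conn.execute("insert into perf_test values (1)")

timings = time_query(conn, "select count(t.id) from perf_test t where t.id = ?", (1,))
print(f"median {statistics.median(timings):.3f} ms, max {max(timings):.3f} ms")
```

The median is reported because the first few runs often include one-off costs such as connection warm-up or cache misses.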

postgresql pgadmin query-performance

Eas*_*her

2022 09-06

6 votes · 2 answers · 2004 views

How can I reduce the size of a query with many repeated UNION subqueries?

I use Postgres 13 and have a table defined with the following DDL:

CREATE TABLE item_codes (
    code    bytea                    NOT NULL,
    item_id bytea                    NOT NULL,
    time    TIMESTAMP WITH TIME ZONE NOT NULL,
    PRIMARY KEY (item_id, code)
);

CREATE INDEX ON item_codes (code, time, item_id);

I use the following query:

SELECT DISTINCT time, item_id
FROM (
      (SELECT time, item_id
       FROM item_codes
       WHERE code = '\x3965623166306238383033393437613338373162313934383034366139653239'
       ORDER BY time, item_id
       LIMIT 100)
       UNION ALL
      (SELECT time, item_id
       FROM item_codes
       WHERE code = '\x3836653432356638366638636338393364373935343938303233343363373561'
       ORDER BY time, item_id
       LIMIT 100)
     ) AS items
ORDER …

postgresql execution-plan union query-performance postgresql-performance

Vit*_*nko

2023 02-22

6 votes · 1 answer · 608 views