Pra*_*ant 7 postgresql insertion
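I have a requirement to store records in a database at a rate of 10,000 records per second (with indexes on a few fields). Each record has 25 columns. I am doing a batch insert of 100,000 records in one transaction block. To improve the insert rate, I moved the tablespace from disk to RAM. Even with that, I can only achieve 5,000 inserts per second.
I have also made the following adjustments in the postgres configuration:
Other information:
I would like to know why a single insert query takes about 0.2 ms on average when the database is not writing anything to disk (since I am using a RAM-based tablespace). Is there something I am doing wrong?
Help appreciated.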
PRASHANT
Dav*_*vis 16
\COPY schema.temp_table FROM /tmp/data.csv WITH CSV
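For the COPY step to work, the data first has to exist as a CSV file. A minimal sketch of producing one with Python's standard csv module follows; the sample rows and the helper name rows_to_csv are made up for illustration, and the column layout is an assumption, not something the question specified.

```python
import csv
import io

def rows_to_csv(rows, out):
    """Write rows as CSV text that COPY ... WITH CSV should be able to read."""
    writer = csv.writer(out, lineterminator="\n")
    for row in rows:
        # None is written as an empty, unquoted field, which COPY's CSV
        # mode treats as NULL by default.
        writer.writerow(["" if value is None else value for value in row])

# Hypothetical sample rows; in practice `out` would be a file such as /tmp/data.csv.
buf = io.StringIO()
rows_to_csv(
    [
        (1, "2010-01-01", 42, "12.50", "A", 1),
        (2, "2010-01-02", 42, None, "B", 1),
    ],
    buf,
)
csv_text = buf.getvalue()
print(csv_text)
```

Writing to a real file instead of io.StringIO only requires opening it with newline="" as the csv module documentation recommends.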
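For large volumes of data:
Insert the data in the order of the columns that the SELECT statement will use. In other words, try to align the physical model with the logical model.
CLUSTER the index (most important columns on the left). For example:
CREATE UNIQUE INDEX measurement_001_stc_index ON climate.measurement_001 USING btree (station_id, taken, category_id);
ALTER TABLE climate.measurement_001 CLUSTER ON measurement_001_stc_index;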
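One way to read "insert in the order the SELECT will use" is to sort the rows by the index key before loading them. A small sketch under that assumption, with made-up sample rows laid out as (taken, station_id, amount, flag, category_id):

```python
from operator import itemgetter

# Hypothetical sample rows: (taken, station_id, amount, flag, category_id)
rows = [
    ("2010-01-02", 7, "1.00", "A", 1),
    ("2010-01-01", 7, "2.50", "A", 1),
    ("2010-01-01", 3, "0.75", "B", 1),
]

# Sort by (station_id, taken, category_id), the same column order as the
# index above. ISO-formatted date strings sort chronologically.
ordered = sorted(rows, key=itemgetter(1, 0, 4))

for row in ordered:
    print(row)
```

Loading pre-sorted rows means the table starts out close to index order, so a later CLUSTER should have less to rearrange.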
On a machine with 4GB of RAM, I did the following...
Tell the kernel that programs may use large amounts of shared memory:
sysctl -w kernel.shmmax=536870912
sysctl -p /etc/sysctl.conf
Then edit /etc/postgresql/8.4/main/postgresql.conf and set:
shared_buffers = 1GB
temp_buffers = 32MB
work_mem = 32MB
maintenance_work_mem = 64MB
seq_page_cost = 1.0
random_page_cost = 2.0
cpu_index_tuple_cost = 0.001
effective_cache_size = 512MB
checkpoint_segments = 10
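It may be worth sanity-checking these numbers against the kernel limit set earlier. As far as I know, PostgreSQL 8.4 takes shared_buffers out of System V shared memory, so kernel.shmmax has to be at least as large as the request (in practice somewhat larger, because of other shared structures). A rough sketch of that check, where to_bytes is a helper written for this example:

```python
# postgresql.conf memory units are powers of 1024.
UNITS = {"kB": 1024, "MB": 1024 ** 2, "GB": 1024 ** 3}

def to_bytes(setting):
    """Convert a postgresql.conf memory value such as '1GB' to bytes."""
    for suffix, factor in UNITS.items():
        if setting.endswith(suffix):
            return int(setting[: -len(suffix)]) * factor
    raise ValueError("unrecognised memory unit in %r" % setting)

shmmax = 536870912                  # value passed to sysctl above
shared_buffers = to_bytes("1GB")    # value from postgresql.conf above

# True only if the shared_buffers request alone fits under shmmax.
fits = shared_buffers <= shmmax

print(shmmax // 1024 ** 2, "MiB shmmax")
print(shared_buffers // 1024 ** 2, "MiB shared_buffers")
print("fits:", fits)
```

With the values quoted here the check fails (512 MiB against 1024 MiB), so either shmmax would need raising or shared_buffers lowering before the server accepts the setting.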
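For example, suppose you have weather-based data, divided into different categories. Rather than having a single monstrous table, divide it into several tables (one per category).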
CREATE TABLE climate.measurement
(
id bigserial NOT NULL,
taken date NOT NULL,
station_id integer NOT NULL,
amount numeric(8,2) NOT NULL,
flag character varying(1) NOT NULL,
category_id smallint NOT NULL,
CONSTRAINT measurement_pkey PRIMARY KEY (id)
)
WITH (
OIDS=FALSE
);
CREATE TABLE climate.measurement_001
(
-- Inherited from table climate.measurement: id bigint NOT NULL DEFAULT nextval('climate.measurement_id_seq'::regclass),
-- Inherited from table climate.measurement: taken date NOT NULL,
-- Inherited from table climate.measurement: station_id integer NOT NULL,
-- Inherited from table climate.measurement: amount numeric(8,2) NOT NULL,
-- Inherited from table climate.measurement: flag character varying(1) NOT NULL,
-- Inherited from table climate.measurement: category_id smallint NOT NULL,
CONSTRAINT measurement_001_pkey PRIMARY KEY (id),
CONSTRAINT measurement_001_category_id_ck CHECK (category_id = 1)
)
INHERITS (climate.measurement)
WITH (
OIDS=FALSE
);
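With one child table per category, the loader can write each batch straight into the matching child rather than going through the parent. A sketch of grouping rows by target table, assuming the children follow the measurement_001 naming pattern (measurement_002 and the sample rows are hypothetical):

```python
from collections import defaultdict

def child_table(category_id):
    """Child table name for a category, following the measurement_001 pattern."""
    return "climate.measurement_%03d" % category_id

def group_by_table(rows, category_index):
    """Group rows by the child table their category_id maps to."""
    groups = defaultdict(list)
    for row in rows:
        groups[child_table(row[category_index])].append(row)
    return dict(groups)

# Hypothetical sample rows: (taken, station_id, amount, flag, category_id)
rows = [
    ("2010-01-01", 3, "0.75", "B", 1),
    ("2010-01-01", 3, "1.10", "B", 2),
    ("2010-01-02", 3, "0.20", "B", 1),
]
batches = group_by_table(rows, category_index=4)

for table, batch in sorted(batches.items()):
    print(table, len(batch))
```

Each batch can then be loaded into its own table, and the CHECK constraint on category_id will reject any row that was routed to the wrong child.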
Bump up the table statistics for the important columns:
ALTER TABLE climate.measurement_001 ALTER COLUMN taken SET STATISTICS 1000;
ALTER TABLE climate.measurement_001 ALTER COLUMN station_id SET STATISTICS 1000;
Don't forget to VACUUM and ANALYSE afterwards.
Are you doing your inserts as a series of separate statements:
INSERT INTO tablename (...) VALUES (...);
INSERT INTO tablename (...) VALUES (...);
...
or as one multi-row insert:
INSERT INTO tablename (...) VALUES (...),(...),(...);
The second one will be faster on 100k rows.
Source: http://kaiv.wordpress.com/2007/07/19/faster-insert-for-multiple-rows/
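If the rows come from application code, the multi-row form can be generated rather than written by hand. A sketch that builds chunked multi-row INSERT statements with %s placeholders, the style used by DB-API drivers such as psycopg2; the driver choice, the helper names and the column names id and label are assumptions for illustration.

```python
def multirow_insert(table, columns, rows):
    """Build one INSERT with a VALUES group per row, plus a flat parameter list.

    Table and column names are interpolated directly, so they must come
    from trusted code, never from user input.
    """
    group = "(" + ", ".join(["%s"] * len(columns)) + ")"
    sql = "INSERT INTO %s (%s) VALUES %s" % (
        table,
        ", ".join(columns),
        ", ".join([group] * len(rows)),
    )
    params = [value for row in rows for value in row]
    return sql, params

def chunks(rows, size):
    """Yield successive slices of at most `size` rows."""
    for start in range(0, len(rows), size):
        yield rows[start:start + size]

# Hypothetical rows and column names.
rows = [(1, "a"), (2, "b"), (3, "c")]
statements = [
    multirow_insert("tablename", ["id", "label"], chunk)
    for chunk in chunks(rows, 2)
]

for sql, params in statements:
    print(sql, params)
```

Each (sql, params) pair would be passed to the driver's cursor.execute inside one transaction. Chunking keeps any single statement from growing without bound; the chunk size of 2 here is only to keep the example readable.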