I am doing bulk inserts by reading from a file. The file looks like:
sampletext1
sampletext2
..........
..........
sampletextN
The file has millions of lines and is about 3 GB in size. Reading every line into a variable and then doing one single INSERT won't work, because I only have about 2 GB of RAM.
So I read the file line by line and build up a MySQL INSERT string. Once 5000 lines have been read, I insert them into the DB, so each INSERT carries 5000 records. The MySQL query in my code (INSERT IGNORE INTO $curr VALUES $string) runs at normal speed until roughly 25,000 lines have been read and inserted, but then it slows down and a single INSERT takes about 5-10 seconds. It seems to keep slowing down linearly as the number of records grows.
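To make the shape of the query concrete, the statement my code builds looks roughly like this (a sketch with the sample values from above; a real batch holds 5000 rows):

```sql
-- sketch of one batched statement; the real batch holds 5000 rows
INSERT IGNORE INTO com_current VALUES
('sampletext1'),
('sampletext2'),
('sampletextN');
```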
Perl code snippet:
sub StoreToDB {
  my $self = shift;
  $self->_doPreliminary();
  my $data_struc = $self->_getDATA();
  my $file = $data_struc->{DOMAIN_FILE};
  my ($count,$cnt,$string,$curr) = (0,0,'',$self->_getTLD() . '_current');
  open FH,$file or ( FullLogger($self->_getTLD(),"Can't open $file from StoreToDB : $!\n") and return );
  $self->_dbConnect();
  while (<FH>) {
    chomp;
    $string .= "('" . $_ . "'),";
    if ( ++$cnt == MAX ) {                # flush a full batch of MAX rows
      $string =~ s/,$/;/;                 # replace the trailing comma with ';'
      $self->_dbExecute("INSERT IGNORE INTO $curr VALUES $string");
      $count += $cnt;
      ($cnt,$string) = (0,'');
      Logger("Inside StoreToDB, count is : $count ***\n");
    }
  }#while
  if ($cnt) {                             # flush the final partial batch
    $string =~ s/,$/;/;
    $self->_dbExecute("INSERT IGNORE INTO $curr VALUES $string");
    $count += $cnt;
  }
  close FH;
  $self->_dbDisconnect();
  return 1;
}#StoreToDB
==============================
DB table details:
mysql> SHOW CREATE TABLE com_current;
+-------------+-------------------------------------------------------------------------------------------------------------------------------+
| Table | Create Table |
+-------------+-------------------------------------------------------------------------------------------------------------------------------+
| com_current | CREATE TABLE `com_current` (
`domain` varchar(60) NOT NULL,
PRIMARY KEY (`domain`)
) ENGINE=MyISAM DEFAULT CHARSET=latin1 |
+-------------+-------------------------------------------------------------------------------------------------------------------------------+
1 row in set (16.60 sec)
mysql>
MySQL status output:
Uptime: 1057 Threads: 2 Questions: 250 Slow queries: 33 Opens: 38 Flush tables: 1 Open tables: 28 Queries per second avg: 0.236
==============================
UPDATE:
So far I have tried the following approaches, but none of them is any better:
1) LOCK TABLES my_table WRITE;
then after inserting, I unlock it,
UNLOCK TABLES;
2) INSERT DELAYED IGNORE INTO $curr VALUES $string
3) LOAD DATA INFILE '$file' IGNORE INTO TABLE $curr
this is currently in progress, but seems worse than the original method.
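For reference, the locking variant (attempt 1) was wired around the batched inserts roughly like this (a sketch using the table name from the schema above):

```sql
-- sketch of attempt 1: hold a write lock across the batched inserts
LOCK TABLES com_current WRITE;
INSERT IGNORE INTO com_current VALUES ('sampletext1'),('sampletext2');
-- ... repeated for each 5000-row batch ...
UNLOCK TABLES;
```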
I don't know whether there is anything wrong with my my.cnf, so I'm pasting it here.
[client]
port = 3306
socket = /tmp/mysql.sock
[mysqld]
datadir = /mnt/mysql/data
port = 3306
socket = /tmp/mysql.sock
skip-external-locking
key_buffer_size = 16M
max_allowed_packet = 1M
table_open_cache = 64
sort_buffer_size = 512K
net_buffer_length = 8K
read_buffer_size = 256K
read_rnd_buffer_size = 512K
myisam_sort_buffer_size = 8M
log-bin=mysql-bin
binlog_format=mixed
server-id = 1
[mysqldump]
quick
max_allowed_packet = 16M
[mysql]
no-auto-rehash
[myisamchk]
key_buffer_size = 20M
sort_buffer_size = 20M
read_buffer = 2M
write_buffer = 2M
[mysqlhotcopy]
interactive-timeout