计算表的行大小和最大行大小

Naw*_*ail 12 mysql innodb database-design information-schema

问题:

有什么方法可以计算表创建占用的字节数,我知道您可以从information_schema.tables 中获取一些信息,但该信息不够准确。

实际需要的只是innodb表定义的字节数,collat​​ion也可以认为是utf-8-general-ci

例如表测试如下

创建表测试

col1 varchar(25),
col2 int,
col3 varchar(3),
col4 char(15),
col5 datetime
);

现在需要知道根据表中列的类型可以在一行中累积的总行大小。

在 MSSQL 中找到了某种类似的解决方案,但需要它的 MySQL 版本

用于估计任何表的行大小的脚本

任何帮助深表感谢。

nai*_*ian 10

这是我根据 MySQL 文档“数据类型存储要求”使用 SQL 并查询 INFORMATION_SCHEMA.COLUMNS 表的两分钱。

注意:这并不完全准确,因为 ENUM 和 SET 列的大小分别取决于枚举值和集合成员的数量。这里我使用了“最坏情况”值(分别为 2 和 8 字节),这已经给出了有关行大小的良好提示。人们可以解析 INFORMATION_SCHEMA.COLUMNS.COLUMN_TYPE 以获取此类列的枚举值/集成员的数量,然后根据文档所述计算字节数。

SELECT col_sizes.TABLE_SCHEMA, col_sizes.TABLE_NAME, SUM(col_sizes.col_size) AS EST_MAX_ROW_SIZE
FROM (
    SELECT 
        cols.TABLE_SCHEMA, 
        cols.TABLE_NAME, 
        cols.COLUMN_NAME,
        CASE cols.DATA_TYPE
            WHEN 'tinyint' THEN 1
            WHEN 'smallint' THEN 2
            WHEN 'mediumint' THEN 3
            WHEN 'int' THEN 4
            WHEN 'bigint' THEN 8
            WHEN 'float' THEN IF(cols.NUMERIC_PRECISION > 24, 8, 4)
            WHEN 'double' THEN 8
            WHEN 'decimal' THEN ((cols.NUMERIC_PRECISION - cols.NUMERIC_SCALE) DIV 9)*4  + (cols.NUMERIC_SCALE DIV 9)*4 + CEIL(MOD(cols.NUMERIC_PRECISION - cols.NUMERIC_SCALE,9)/2) + CEIL(MOD(cols.NUMERIC_SCALE,9)/2)
            WHEN 'bit' THEN (cols.NUMERIC_PRECISION + 7) DIV 8
            WHEN 'year' THEN 1
            WHEN 'date' THEN 3
            WHEN 'time' THEN 3 + CEIL(cols.DATETIME_PRECISION /2)
            WHEN 'datetime' THEN 5 + CEIL(cols.DATETIME_PRECISION /2)
            WHEN 'timestamp' THEN 4 + CEIL(cols.DATETIME_PRECISION /2)
            WHEN 'char' THEN cols.CHARACTER_OCTET_LENGTH
            WHEN 'binary' THEN cols.CHARACTER_OCTET_LENGTH
            WHEN 'varchar' THEN IF(cols.CHARACTER_OCTET_LENGTH > 255, 2, 1) + cols.CHARACTER_OCTET_LENGTH
            WHEN 'varbinary' THEN IF(cols.CHARACTER_OCTET_LENGTH > 255, 2, 1) + cols.CHARACTER_OCTET_LENGTH
            WHEN 'tinyblob' THEN 9
            WHEN 'tinytext' THEN 9
            WHEN 'blob' THEN 10
            WHEN 'text' THEN 10
            WHEN 'mediumblob' THEN 11
            WHEN 'mediumtext' THEN 11
            WHEN 'longblob' THEN 12
            WHEN 'longtext' THEN 12
            WHEN 'enum' THEN 2
            WHEN 'set' THEN 8
            ELSE 0
        END AS col_size
    FROM INFORMATION_SCHEMA.COLUMNS cols
) AS col_sizes
GROUP BY col_sizes.TABLE_SCHEMA, col_sizes.TABLE_NAME
Run Code Online (Sandbox Code Playgroud)


Naw*_*ail 4

经过大量思考和研究后,找到了一个真正有助于实现所需目标的答案。这是一个 perl 脚本,参考链接是

http://dev.mysql.com/doc/refman/5.6/en/storage-requirements.html

#!/usr/bin/perl
use strict;
$| = 1;

my %DataType = (
"TINYINT"=>1, "SMALLINT"=>2, "MEDIUMINT"=>3, "INT"=>4, "INTEGER"=>4, "BIGINT"=>8,
"FLOAT"=>'$M<=24?4:8', "DOUBLE"=>8,
"DECIMAL"=>'int(($M-$D)/9)*4+int(((($M-$D)%9)+1)/2)+int($D/9)*4+int((($D%9)+1)/2)',
"NUMERIC"=>'int(($M-$D)/9)*4+int(((($M-$D)%9)+1)/2)+int($D/9)*4+int((($D%9)+1)/2)',
"BIT"=>'($M+7)>>3',
"DATE"=>3, "TIME"=>3, "DATETIME"=>8, "TIMESTAMP"=>4, "YEAR"=>1,
"BINARY"=>'$M',"CHAR"=>'$M*$CL',
"VARBINARY"=>'$M+($M>255?2:1)', "VARCHAR"=>'$M*$CL+($M>255?2:1)',
"ENUM"=>'$M>255?2:1', "SET"=>'($M+7)>>3',
"TINYBLOB"=>9, "TINYTEXT"=>9,
"BLOB"=>10, "TEXT"=>10,
"MEDIUMBLOB"=>11, "MEDIUMTEXT"=>11,
"LONGBLOB"=>12, "LONGTEXT"=>12
);

my %DataTypeMin = (
"VARBINARY"=>'($M>255?2:1)', "VARCHAR"=>'($M>255?2:1)'
);

my ($D, $M, $S, $C, $L, $dt, $dp ,$bc, $CL);
my $fieldCount = 0;
my $byteCount = 0;
my $byteCountMin = 0;
my @fields = ();
my $fieldName;
my $tableName;
my $defaultDbCL = 1;
my $defaultTableCL = 1;
my %charsetMaxLen;
my %collationMaxLen;

open (CHARSETS, "mysql -B --skip-column-names information_schema -e 'select CHARACTER_SET_NAME,MAXLEN from CHARACTER_SETS;' |");
%charsetMaxLen = map ( ( /^(\w+)/ => /(\d+)$/ ), <CHARSETS>);
close CHARSETS;

open (COLLATIONS, "mysql -B --skip-column-names information_schema -e 'select COLLATION_NAME,MAXLEN from CHARACTER_SETS INNER JOIN COLLATIONS USING(CHARACTER_SET_NAME);' |");
%collationMaxLen = map ( ( /^(\w+)/ => /(\d+)$/ ), <COLLATIONS>);
close COLLATIONS;

open (TABLEINFO, "mysqldump -d --compact ".join(" ",@ARGV)." |");

while (<TABLEINFO>) {
chomp;
if ( ($S,$C) = /create database.*?`([^`]+)`.*default\scharacter\sset\s+(\w+)/i ) {
$defaultDbCL = exists $charsetMaxLen{$C} ? $charsetMaxLen{$C} : 1;
print "Database: $S".($C?" DEFAULT":"").($C?" CHARSET $C":"")." (bytes per char: $defaultDbCL)\n\n";
next;
}
if ( /^create table\s+`([^`]+)`.*/i ) {
$tableName = $1;
@fields = ();
next;
}
if ( $tableName && (($C,$L) = /^\)(?:.*?default\scharset=(\w+))?(?:.*?collate=(\w+))?/i) ) {
$defaultTableCL = exists $charsetMaxLen{$C} ? $charsetMaxLen{$C} : (exists $collationMaxLen{$L} ? $collationMaxLen{$L} : $defaultDbCL);
print "Table: $tableName".($C||$L?" DEFAULT":"").($C?" CHARSET $C":"").($L?" COLLATION $L":"")." (bytes per char: $defaultTableCL)\n";
$tableName = "";
$fieldCount = 0;
$byteCount = 0;
$byteCountMin = 0;
while ($_ = shift @fields) {
if ( ($fieldName,$dt,$dp,$M,$D,$S,$C,$L) = /\s\s`([^`]+)`\s+([a-z]+)(\((\d+)(?:,(\d+))?\)|\((.*)\))?(?:.*?character\sset\s+(\w+))?(?:.*?collate\s+(\w+))?/i ) {
$dt = uc $dt;
if (exists $DataType{$dt}) {
if (length $S) {
$M = ($S =~ s/(\'.*?\'(?!\')(?=,|$))/$1/g);
$dp = "($M : $S)"
}
$D = 0 if !$D;
$CL = exists $charsetMaxLen{$C} ? $charsetMaxLen{$C} : (exists $collationMaxLen{$L} ? $collationMaxLen{$L} : $defaultTableCL);
$bc = eval($DataType{$dt});
$byteCount += $bc;
$byteCountMin += exists $DataTypeMin{$dt} ? $DataTypeMin{$dt} : $bc;
} else {
$bc = "??";
}
$fieldName.="\t" if length($fieldName) < 8;
print "bytes:\t".$bc."\t$fieldName\t$dt$dp".($C?" $C":"").($L?" COLL $L":"")."\n";
++$fieldCount;
}
}
print "total:\t$byteCount".($byteCountMin!=$byteCount?"\tleast: $byteCountMin":"\t\t")."\tcolumns: $fieldCount\n\n";
next;
}
push @fields, $_;
}
close TABLEINFO;
Run Code Online (Sandbox Code Playgroud)

感谢大家的大力帮助。