Determine whether an IP is within an IPv4 CIDR block

RMa*_*his 6 mysql index datatypes range-types

What is the fastest way to determine whether an IP is contained in a CIDR block?

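Currently, whenever I store a CIDR address, I also create two columns for the starting and ending IP addresses. The start and end IP addresses are indexed. If I want to see which network contains an address, I query where ip between start_ip and end_ip, which seems less than ideal.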
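For readers who want to see where the start_ip and end_ip values come from, here is a minimal Python sketch using the standard-library ipaddress module. The helper names cidr_bounds and ip_in_range are made up for illustration and are not part of the schema in this question.

```python
import ipaddress
from typing import Tuple


def cidr_bounds(cidr: str) -> Tuple[int, int]:
    """Return (start_ip, end_ip) as integers for an IPv4 CIDR block."""
    net = ipaddress.IPv4Network(cidr)
    return int(net.network_address), int(net.broadcast_address)


def ip_in_range(ip: str, start_ip: int, end_ip: int) -> bool:
    """Equivalent of: inet_aton(ip) BETWEEN start_ip AND end_ip."""
    return start_ip <= int(ipaddress.IPv4Address(ip)) <= end_ip


start_ip, end_ip = cidr_bounds("1.0.4.0/24")
print(start_ip, end_ip)  # 16778240 16778495
print(ip_in_range("1.0.4.77", start_ip, end_ip))
```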

It occurs to me that I could store the right-shifted network number and match it against a similarly shifted IP address (660510 in the case of @cidr)...

select @cidr, inet_aton(substring_index(@cidr,'/',1))>>(32-substring_index(@cidr,'/',-1));
+---------------+-----------------------------------------------------------------------------+
| @cidr         | inet_aton(substring_index(@cidr,'/',1))>>(32-substring_index(@cidr,'/',-1)) |
+---------------+-----------------------------------------------------------------------------+
| 10.20.30.0/24 |                                                                      660510 |
+---------------+-----------------------------------------------------------------------------+
1 row in set (0.00 sec)

set @ip:='10.20.30.40';
Query OK, 0 rows affected (0.00 sec)

select @ip, inet_aton(@ip)>>(32-substring_index(@cidr,'/',-1));
+-------------+----------------------------------------------------+
| @ip         | inet_aton(@ip)>>(32-substring_index(@cidr,'/',-1)) |
+-------------+----------------------------------------------------+
| 10.20.30.40 |                                             660510 |
+-------------+----------------------------------------------------+
1 row in set (0.00 sec)

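To benefit from this in an indexed way, I would need to know the subnet mask (the number of bits to shift) in advance. Otherwise I have to compare shifted values systematically, i.e. blindly try every possible shift (from 0 to 24 bits).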
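The shift arithmetic can be checked outside MySQL. The following is a small Python sketch, assuming IPv4 only; shifted_prefix and cidr_key are hypothetical helper names, not functions from any library.

```python
import ipaddress
from typing import Tuple


def shifted_prefix(ip: str, prefix_len: int) -> int:
    """Drop the host bits: same as inet_aton(ip) >> (32 - prefix_len)."""
    return int(ipaddress.IPv4Address(ip)) >> (32 - prefix_len)


def cidr_key(cidr: str) -> Tuple[int, int]:
    """Split 'a.b.c.d/n' into (n, shifted network number)."""
    addr, bits = cidr.split("/")
    return int(bits), shifted_prefix(addr, int(bits))


bits, key = cidr_key("10.20.30.0/24")
print(key)  # 660510
# The address matches only when it is shifted by the same number of bits.
print(shifted_prefix("10.20.30.40", bits) == key)
```

The comparison works only because the prefix length of the candidate network is already known, which is exactly the difficulty described above.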

I have other sources to optimize, but optimizing the IP2Location™ LITE IP-ASN database at http://lite.ip2location.com/database/ip-asn would serve as a proof of concept.

The table...

CREATE TABLE `ip2loc_asn` (
  `asn` bigint(20) DEFAULT NULL,
  `cidr` varchar(50) DEFAULT NULL,
  `start_ip` bigint(20) DEFAULT NULL,
  `end_ip` bigint(20) DEFAULT NULL,
  `name` varchar(250) DEFAULT NULL,
  KEY `ip2loc_asn_startip_endip` (`start_ip`,`end_ip`),
  KEY `asn` (`asn`),
  KEY `cidr` (`cidr`)
) ENGINE=MyISAM; -- table is recreated monthly, MyISAM is the perfect engine

Sample data...

select * from ip2loc_asn limit 10;
+-------+--------------+----------+----------+-------------------------------+
| asn   | cidr         | start_ip | end_ip   | name                          |
+-------+--------------+----------+----------+-------------------------------+
| 56203 | 1.0.4.0/24   | 16778240 | 16778495 | Big Red Group                 |
| 56203 | 1.0.5.0/24   | 16778496 | 16778751 | Big Red Group                 |  
| 56203 | 1.0.6.0/24   | 16778752 | 16779007 | Big Red Group                 |  
| 38803 | 1.0.7.0/24   | 16779008 | 16779263 | Goldenit Pty ltd Australia, A |  
| 18144 | 1.0.64.0/18  | 16793600 | 16809983 | Energia Communications,Inc.   |
|  9737 | 1.0.128.0/17 | 16809984 | 16842751 | TOT Public Company Limited    |
|  9737 | 1.0.128.0/18 | 16809984 | 16826367 | TOT Public Company Limited    |
|  9737 | 1.0.128.0/19 | 16809984 | 16818175 | TOT Public Company Limited    |
| 23969 | 1.0.128.0/24 | 16809984 | 16810239 | TOT Public Company Limited    |
| 23969 | 1.0.129.0/24 | 16810240 | 16810495 | TOT Public Company Limited    |
+-------+--------------+----------+----------+-------------------------------+
10 rows in set (0.00 sec)

Netmasks range from 8 to 32 bits...

select min(substring_index(cidr,'/',-1)+0), max(substring_index(cidr,'/',-1)+0) from ip2loc_asn;
+-------------------------------------+-------------------------------------+
| min(substring_index(cidr,'/',-1)+0) | max(substring_index(cidr,'/',-1)+0) |
+-------------------------------------+-------------------------------------+
|                                   8 |                                  32 |
+-------------------------------------+-------------------------------------+
1 row in set (0.33 sec)

select * from ip2loc_asn where cidr like '%/8' limit 1;
+------+-----------+----------+----------+------------------------------+
| asn  | cidr      | start_ip | end_ip   | name                         |
+------+-----------+----------+----------+------------------------------+
| 3356 | 4.0.0.0/8 | 67108864 | 83886079 | Level 3 Communications, Inc. |
+------+-----------+----------+----------+------------------------------+
1 row in set (0.00 sec)

select * from ip2loc_asn where cidr like '%/32' limit 1;
+-------+---------------+-----------+-----------+------+
| asn   | cidr          | start_ip  | end_ip    | name |
+-------+---------------+-----------+-----------+------+
| 51964 | 57.72.27.1/32 | 961026817 | 961026817 |      |
+-------+---------------+-----------+-----------+------+
1 row in set (0.02 sec)

Current execution plan...

explain select * from ip2loc_asn where inet_aton('10.20.30.40') between start_ip and end_ip;
+----+-------------+------------+-------+--------------------------+--------------------------+---------+------+-------+-----------------------+
| id | select_type | table      | type  | possible_keys            | key                      | key_len | ref  | rows  | Extra                 |
+----+-------------+------------+-------+--------------------------+--------------------------+---------+------+-------+-----------------------+
|  1 | SIMPLE      | ip2loc_asn | range | ip2loc_asn_startip_endip | ip2loc_asn_startip_endip | 9       | NULL | 10006 | Using index condition |
+----+-------------+------------+-------+--------------------------+--------------------------+---------+------+-------+-----------------------+
1 row in set (0.00 sec)

My clumsy attempt...

mysql to3_reference> alter table ip2loc_asn add column shifted_netmask int(10) unsigned;
Query OK, 626695 rows affected (4.06 sec)
Records: 626695  Duplicates: 0  Warnings: 0

mysql to3_reference> update ip2loc_asn set shifted_netmask = start_ip>>(32-substring_index(cidr,'/',-1));
Query OK, 626695 rows affected (5.98 sec)
Rows matched: 626695  Changed: 626695  Warnings: 0

mysql to3_reference> alter table ip2loc_asn add key ip2loc_asn_shiftednetmask (shifted_netmask);
Query OK, 626695 rows affected (5.83 sec)
Records: 626695  Duplicates: 0  Warnings: 0

The old way:

select * from ip2loc_asn where inet_aton('8.8.8.0') between start_ip and end_ip;
+-------+------------+-----------------+--------------+-----------+-----------+------------------------------+
| asn   | cidr       | shifted_netmask | netmask_bits | start_ip  | end_ip    | name                         |
+-------+------------+-----------------+--------------+-----------+-----------+------------------------------+
|  3356 | 8.0.0.0/9  |              16 |            9 | 134217728 | 142606335 | Level 3 Communications, Inc. |
|  3356 | 8.0.0.0/8  |               8 |            8 | 134217728 | 150994943 | Level 3 Communications, Inc. |
| 15169 | 8.8.8.0/24 |          526344 |           24 | 134744064 | 134744319 | Google Inc.                  |
+-------+------------+-----------------+--------------+-----------+-----------+------------------------------+
3 rows in set (0.00 sec)

One way using shifted_netmask (undesirable, since I am doing a full table scan to discover the number of bits in the netmask)...

select * from ip2loc_asn where shifted_netmask = inet_aton('8.8.8.0')>>32-netmask_bits;
+-------+------------+-----------------+--------------+-----------+-----------+------------------------------+
| asn   | cidr       | shifted_netmask | netmask_bits | start_ip  | end_ip    | name                         |
+-------+------------+-----------------+--------------+-----------+-----------+------------------------------+
|  3356 | 8.0.0.0/8  |               8 |            8 | 134217728 | 150994943 | Level 3 Communications, Inc. |
|  3356 | 8.0.0.0/9  |              16 |            9 | 134217728 | 142606335 | Level 3 Communications, Inc. |
| 15169 | 8.8.8.0/24 |          526344 |           24 | 134744064 | 134744319 | Google Inc.                  |
+-------+------------+-----------------+--------------+-----------+-----------+------------------------------+
3 rows in set (0.64 sec)

The desired approach would resemble the last query, minus the scan over the netmask bits.
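One way to avoid the scan, sketched here in Python rather than SQL, is to key the lookup on the pair (netmask_bits, shifted_netmask) and probe once per possible prefix length. That is at most 25 exact-match lookups for masks of 8 to 32 bits. This is an illustrative assumption about how the data could be organised, not something tested against the IP2Location table; build_index and lookup are hypothetical names, and the rows are the three from the output above.

```python
import ipaddress
from collections import defaultdict

ROWS = [
    (3356, "8.0.0.0/8", "Level 3 Communications, Inc."),
    (3356, "8.0.0.0/9", "Level 3 Communications, Inc."),
    (15169, "8.8.8.0/24", "Google Inc."),
]


def build_index(rows):
    """Map (prefix length, shifted network number) -> list of rows."""
    index = defaultdict(list)
    for asn, cidr, name in rows:
        addr, bits = cidr.split("/")
        bits = int(bits)
        shifted = int(ipaddress.IPv4Address(addr)) >> (32 - bits)
        index[(bits, shifted)].append((asn, cidr, name))
    return index


def lookup(index, ip, min_bits=8, max_bits=32):
    """Probe every prefix length once; each probe is an exact-match lookup."""
    ip_int = int(ipaddress.IPv4Address(ip))
    matches = []
    for bits in range(min_bits, max_bits + 1):
        matches.extend(index.get((bits, ip_int >> (32 - bits)), []))
    return matches


index = build_index(ROWS)
for row in lookup(index, "8.8.8.0"):
    print(row)
```

In SQL terms this would correspond roughly to a composite index on the two columns plus one equality condition per prefix length; whether the optimizer handles that better than the BETWEEN query would need to be measured.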

Eva*_*oll 5

PostgreSQL

As a side note, PostgreSQL does this with the cidr and inet types. And if you really want the job done first class, there is ip4r.

"It occurs to me that I could store the right-shifted network number and match it against a similarly shifted IP address (660510 in the case of @cidr)..."

Good idea; that is in fact how PostgreSQL stores them internally. Easily done:

CREATE TABLE ip2loc_asn (
  asn    bigint,
  cidr   cidr,
  name   text
);
-- inet_ops is not the default GiST operator class for cidr/inet, so name it explicitly
CREATE INDEX ON ip2loc_asn USING gist (cidr inet_ops);

INSERT INTO ip2loc_asn(asn,cidr,name)
VALUES
    ( 56203,  '1.0.4.0/24'   , 'Big Red Group' ),
    ( 56203,  '1.0.5.0/24'   , 'Big Red Group' ),
    ( 56203,  '1.0.6.0/24'   , 'Big Red Group' ),
    ( 38803,  '1.0.7.0/24'   , 'Goldenit Pty ltd Australia, A' ),
    ( 18144,  '1.0.64.0/18'  , 'Energia Communications,Inc.'   ),
    (  9737,  '1.0.128.0/17' , 'TOT Public Company Limited'    ),
    (  9737,  '1.0.128.0/18' , 'TOT Public Company Limited'    ),
    (  9737,  '1.0.128.0/19' , 'TOT Public Company Limited'    ),
    ( 23969,  '1.0.128.0/24' , 'TOT Public Company Limited'    ),
    ( 23969,  '1.0.129.0/24' , 'TOT Public Company Limited'    );

Now we can query it using the network-type operators:

test=# SELECT * FROM ip2loc_asn WHERE cidr >> '1.0.129.0';
  asn  |     cidr     |            name            
-------+--------------+----------------------------
  9737 | 1.0.128.0/17 | TOT Public Company Limited
  9737 | 1.0.128.0/18 | TOT Public Company Limited
  9737 | 1.0.128.0/19 | TOT Public Company Limited
 23969 | 1.0.129.0/24 | TOT Public Company Limited

That runs on the index, too.
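To sanity-check which blocks should come back from a containment query like the one above, the same test can be reproduced with Python's standard-library ipaddress module. This is only a linear scan for verification (it uses no index), and containing is a hypothetical helper name.

```python
import ipaddress

CIDRS = [
    "1.0.4.0/24", "1.0.5.0/24", "1.0.6.0/24", "1.0.7.0/24",
    "1.0.64.0/18", "1.0.128.0/17", "1.0.128.0/18", "1.0.128.0/19",
    "1.0.128.0/24", "1.0.129.0/24",
]


def containing(cidrs, ip):
    """Return every block in cidrs that contains the single address ip."""
    addr = ipaddress.IPv4Address(ip)
    return [c for c in cidrs if addr in ipaddress.IPv4Network(c)]


print(containing(CIDRS, "1.0.129.0"))
```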


Ric*_*mes 2

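The main problem is that the optimizer does not know whether there is a single matching start/end pair or a whole set of them. So any attempt at optimization is plagued by a table scan, or at least a large range scan.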

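Which one do you have to start from? An IP address? Or a CIDR block? I ask because we may need to rearrange the data you start with so that looking up the other is efficient.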

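In this article, I explain how to build and maintain a table covering all 2^32 (or the IPv6 equivalent) IP addresses. It uses only a start_ip column and infers end_ip from the next row. This means every unassigned IP range must have a row in the table. (That is not much of a burden; at most it doubles the number of rows.) With that in place, nearly all operations are essentially O(1); that is, something like WHERE ip >= start_ip ORDER BY start_ip DESC LIMIT 1 gets the answer "immediately". No table scan, no range scan; nothing worse than what is (effectively) a "point query". Note that it does not even need to test end_ip.

Caveat: overlapping ranges are not handled. Some applications (possibly not yours) can be adjusted so that overlaps are not needed.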
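The lookup described here can be mimicked in Python with a sorted list and bisect, which plays the role of the index on start_ip. This is a sketch under assumptions: the rows are adapted from the sample data in the question, the overlapping TOT blocks are collapsed into their /17, the unowned gaps are only what the ten sample rows imply, and owner_of is a made-up name.

```python
import bisect
import ipaddress

# (start_ip, owner), sorted by start_ip; owner None marks an unowned range.
# end_ip is implied by the next row's start_ip.
RANGES = [
    (0, None),                                    # everything below 1.0.4.0
    (16778240, "Big Red Group"),                  # 1.0.4.0 .. 1.0.6.255
    (16779008, "Goldenit Pty ltd Australia, A"),  # 1.0.7.0/24
    (16779264, None),                             # gap (assumed unowned)
    (16793600, "Energia Communications,Inc."),    # 1.0.64.0/18
    (16809984, "TOT Public Company Limited"),     # 1.0.128.0/17
    (16842752, None),                             # rest of the address space
]
START_IPS = [start for start, _ in RANGES]


def owner_of(ip):
    """Like: WHERE ip >= start_ip ORDER BY start_ip DESC LIMIT 1."""
    ip_int = int(ipaddress.IPv4Address(ip))
    # Rightmost row whose start_ip is <= ip_int; row 0 starts at 0,
    # so the index is never negative.
    i = bisect.bisect_right(START_IPS, ip_int) - 1
    return RANGES[i][1]


print(owner_of("1.0.5.77"))
print(owner_of("1.0.8.0"))
```

Because every address falls in exactly one row, the function never has to look at an end value, which mirrors the point made above about not testing end_ip.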

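How could this be adapted to CIDR? One way is to convert your CIDR table into my variant. You are familiar with how to do that; the main differences are the absence of end_ip and the addition of "unowned" ranges. So, if you are starting "from" CIDRs and need to look up an IP, that is one possible answer.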
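A rough Python sketch of such a conversion follows. It assumes the input blocks do not overlap (it raises an error if they do, in line with the caveat above), and to_start_rows is a hypothetical name rather than anything from the article.

```python
import ipaddress


def to_start_rows(cidr_rows):
    """Turn (cidr, owner) pairs into (start_ip, owner) rows with gap rows.

    Assumes non-overlapping IPv4 blocks. An owner of None marks an
    "unowned" range.
    """
    nets = sorted(
        ((ipaddress.IPv4Network(cidr), owner) for cidr, owner in cidr_rows),
        key=lambda pair: int(pair[0].network_address),
    )
    rows = []
    next_free = 0  # first address not yet covered by an emitted row
    for net, owner in nets:
        start = int(net.network_address)
        if start < next_free:
            raise ValueError(f"overlapping block: {net}")
        if start > next_free:
            rows.append((next_free, None))  # fill the gap with an unowned row
        rows.append((start, owner))
        next_free = int(net.broadcast_address) + 1
    if next_free < 2**32:
        rows.append((next_free, None))  # tail of the address space
    return rows


sample = [
    ("1.0.7.0/24", "Goldenit Pty ltd Australia, A"),
    ("1.0.4.0/24", "Big Red Group"),
    ("1.0.64.0/18", "Energia Communications,Inc."),
]
for row in to_start_rows(sample):
    print(row)
```

Adjacent blocks with the same owner are left as separate rows here; merging them would shrink the table but is not required for the lookup to work.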