How to parse addresses in PostgreSQL?

Eva*_*oll 6 postgresql address

For example, suppose I want to parse these addresses for Chicken Ranch

Chicken Ranch
10511 Homestead Rd
Pahrump, NV 89061

Chicken Ranch
1600 Pennsylvania Avenue
NW Washington, D.C. 20500

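In both cases, I want to get rid of the Rd and Avenue. For example, in the first case I want to get "Homestead", and in the second "Pennsylvania". Not every address has a name like this, though.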

Eva*_*oll 8

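This is a problem of address normalization and parsing. Basically, what you're talking about is handled with a gazetteer (a geographic rule set). There are two ways to do this,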

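  1. address_standardizer, from the PostGIS project, which is certainly the better choice if you only work with US addresses.
  2. pgsql-postal, which is probably the better approach for international addresses.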
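For comparison, the second option might look roughly like the following. This is only a sketch from memory of the pgsql-postal README: the extension name `postal` and the functions `postal_parse` and `postal_normalize` are assumptions to verify against the version you install, and the extension needs libpostal available on the server.

```sql
-- Assumption: pgsql-postal is built and installed, and registers itself as "postal".
CREATE EXTENSION postal;

-- Assumed to return the components libpostal identifies, as a jsonb object.
SELECT postal_parse('10511 Homestead Rd, Pahrump, NV 89061');

-- Assumed to return a text[] of normalized spellings; unnest gives one row each.
SELECT unnest(postal_normalize('10511 Homestead Rd, Pahrump, NV 89061'));
```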

I'll show the address_standardizer version with this address,

Chicken Ranch
10511 Homestead Rd
Pahrump, NV 89061

We use standardize_address from address_standardizer, which returns the composite type stdaddr. First we install it,

CREATE EXTENSION address_standardizer;
CREATE EXTENSION address_standardizer_data_us;
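If either CREATE EXTENSION statement fails, it may be worth checking whether the server ships the extensions at all. A minimal check, assuming nothing beyond the standard pg_available_extensions catalog view:

```sql
-- List the address_standardizer extensions this server knows about.
-- installed_version stays NULL until CREATE EXTENSION has been run in this database.
SELECT name, default_version, installed_version
FROM pg_available_extensions
WHERE name LIKE 'address_standardizer%'
ORDER BY name;
```

An empty result would suggest the PostGIS packages that provide these extensions are not installed on the server.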

Then we can use it like this.

SELECT * FROM standardize_address('us_lex',
   'us_gaz', 'us_rules', '10511 Homestead Rd, Pahrump, NV 89061');
 building | house_num | predir | qual | pretype |   name    | suftype | sufdir | ruralroute | extra |  city   | state  | country | postcode | box | unit 
----------+-----------+--------+------+---------+-----------+---------+--------+------------+-------+---------+--------+---------+----------+-----+------
          | 10511     |        |      |         | HOMESTEAD | ROAD    |        |            |       | PAHRUMP | NEVADA | USA     | 89061    |     | 
(1 row)

So you can see, ROAD was pulled out into suftype.
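To get only the street name the question asks for, the name field can be read directly from the returned stdaddr. A small sketch using the same tables and address as in the query above; the column alias street_name is just an illustrative choice.

```sql
-- The parentheses around the function call are needed to select
-- a single field from a composite-type result.
SELECT (standardize_address('us_lex', 'us_gaz', 'us_rules',
          '10511 Homestead Rd, Pahrump, NV 89061')).name AS street_name;
```

Given the full row shown above, this should come back as HOMESTEAD.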

Similarly, ...

SELECT * FROM standardize_address('us_lex',
   'us_gaz', 'us_rules', '1600 Pennsylvania Avenue, NW Washington, D.C. 20500');
 building | house_num | predir | qual | pretype |     name     | suftype |  sufdir   | ruralroute | extra | city | state | country | postcode  | box |     unit     
----------+-----------+--------+------+---------+--------------+---------+-----------+------------+-------+------+-------+---------+-----------+-----+--------------
          | 1600      |        |      |         | PENNSYLVANIA | AVENUE  | NORTHWEST |            |       |      |       | USA     | D C 20500 |     | # WASHINGTON
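In this last result, name and suftype are what the question asks for, but city and state are empty, postcode reads D C 20500, and WASHINGTON ended up in unit, so the single-line form did not split this address cleanly. address_standardizer also offers a five-argument form of standardize_address that takes the street part (micro) and the city/state/postal part (macro) separately. The sketch below assumes that form; how the address is split into the two strings is my own choice, and no output is shown because I have not run it.

```sql
SELECT name, suftype, sufdir, city, state, postcode
FROM standardize_address('us_lex', 'us_gaz', 'us_rules',
       '1600 Pennsylvania Avenue NW',   -- micro: the street line
       'Washington, DC 20500');         -- macro: city, state, postal code
```

Whether this fills in city and state correctly depends on the rules and gazetteer data, so check the result against your own addresses.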