Extracting an address from a string

4 javascript php jquery

Suppose I have this string:

<div>john doe is nice guy btw 8240 E. Marblehead Way 92808  is also</div>

Or this string:

<div>sky being blue? in the world is true? 024 Brea Mall  Brea, California 92821 jackfroast nipping on the firehead</div>

How would I extract the address from either of these strings? This would involve some kind of regular expression, right?

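I have tried looking online for a solution using JavaScript or PHP, to no avail. No other post on Stack Overflow (as far as I can tell) offers a solution using jQuery and/or JavaScript and/or PHP. (The closest is "Parse usable Street Address, City, State, Zip from a string", and that thread contains no code for extracting the ZIP code from a string.)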

Can someone point me in the right direction? How would I accomplish this in jQuery, JavaScript, or PHP?

Jos*_*ody 22

I tried this on a dozen different strings similar to yours, and it worked just fine:

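function str_to_address($context) {

    // Split on spaces and reverse, so the ZIP code is searched for from the end.
    $context_parts = array_reverse(explode(" ", $context));
    $zipKey = 0; // integer default, since array_slice() expects an int offset
    foreach ($context_parts as $key => $str) {
        // Lazy ZIP check: exactly 5 characters and numeric.
        if (strlen($str) === 5 && is_numeric($str)) {
            $zipKey = $key;
            break;
        }
    }

    // Drop everything after the ZIP, then restore the original word order.
    $context_parts_cleaned = array_slice($context_parts, $zipKey);
    $context_parts_normalized = array_reverse($context_parts_cleaned);
    $houseNumberKey = 0; // integer default, for the same reason as $zipKey
    foreach ($context_parts_normalized as $key => $str) {
        // House number: numeric and 2 to 5 characters long.
        if (strlen($str) > 1 && strlen($str) < 6 && is_numeric($str)) {
            $houseNumberKey = $key;
            break;
        }
    }

    // Keep only the words from the house number through the ZIP.
    $address_parts = array_slice($context_parts_normalized, $houseNumberKey);
    $string = implode(' ', $address_parts);
    return $string;
}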

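This assumes the house number is at least two digits and at most five (the check is strlen($str) < 6). It also assumes the ZIP code is not in the "extended" form (for example 12345-6789). However, it can easily be modified to handle that format (a regular expression would be a good choice here, something like (\d{5}-\d{4})).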

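But using a regular expression to parse the whole of some user-entered data... that is not a good idea, because we simply don't know what the user is going to type, since (one can assume) there is no validation.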
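If the extended ZIP+4 form does need to be recognised, only the per-token check has to change, not the overall approach. Below is a minimal sketch in JavaScript (the question allows either language). The helper name isZipToken is made up for illustration, and the sketch assumes the ZIP+4 arrives as a single space-delimited token:

```javascript
// Hypothetical helper, not part of the PHP function above.
// Accepts a plain 5-digit ZIP ("92808") or the extended ZIP+4 form ("92808-1234").
// The ^ and $ anchors require the whole token to match, not just a part of it.
function isZipToken(token) {
  return /^\d{5}(-\d{4})?$/.test(token);
}

console.log(isZipToken("92808"));      // true
console.log(isZipToken("92808-1234")); // true
console.log(isZipToken("8240"));       // false
console.log(isZipToken("12.34"));      // false
```

Unlike a length-plus-is_numeric() check, this pattern only accepts digits, so five-character numeric strings such as 12.34 are rejected.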

Walking through the code and the logic: first build an array from the context and grab the ZIP:

// split the context (for example, a sentence) into an array, 
// so we can loop through it. 
// we reverse the array, as we're going to grab the zip first. 
// why? we KNOW the zip is 5 characters long*.
$context_parts = array_reverse(explode(" ", $context));  

// we're going to store the array index of the zip code for later use
// (an integer default, since array_slice() expects an int offset)
$zipKey = 0; 

// foreach iterates over an object given the params, 
// in this case it's like doing... 
// for each value of $context_parts ($str), and each index ($key)
foreach($context_parts as $key=>$str) { 

    // if $str is 5 chars long, and numeric... 
    // an incredibly lazy check for a zip code...
    if(strlen($str)===5 && is_numeric($str)) {  
        $zipKey = $key;

        // we have what we want, so we can leave the loop with break
        break; 
    }
}

Do some housekeeping, so we have a cleaner array to pull the house number from:

// remove junk from $context_parts, since we don't 
// need stuff after the zip
$context_parts_cleaned = array_slice($context_parts, $zipKey); 

// since the house number comes first, let's go back to the start
$context_parts_normalized = array_reverse($context_parts_cleaned);

Then let's grab the house number, using the same basic logic we used for the ZIP code:

$houseNumberKey = 0; 
foreach($context_parts_normalized as $key=>$str) { 
    if(strlen($str)>1 && strlen($str)<6 && is_numeric($str)) { 
        $houseNumberKey = $key;
        break; 
    }
}

// we probably have the parts we need for the address.
// let's do some more cleaning 
$address_parts = array_slice($context_parts_normalized, $houseNumberKey);

// and build the string again, from the address
$string = implode(' ', $address_parts);

// and return the string
return $string;
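Since the question also asks about JavaScript, the steps walked through above could be sketched as follows. This is a loose port, not the answerer's code: the name strToAddress is invented here, the checks are digits-only regular expressions rather than is_numeric(), and returning null when no ZIP is found is an assumption of this sketch.

```javascript
// Hypothetical JavaScript port of the PHP walkthrough; names are illustrative.
function strToAddress(context) {
  const parts = context.split(" ");

  // Scan from the end for the last 5-digit token and treat it as the ZIP.
  let zipIndex = -1;
  for (let i = parts.length - 1; i >= 0; i--) {
    if (/^\d{5}$/.test(parts[i])) {
      zipIndex = i;
      break;
    }
  }
  if (zipIndex === -1) return null; // assumption: no ZIP means no address

  // Drop everything after the ZIP.
  const upToZip = parts.slice(0, zipIndex + 1);

  // Scan from the start for the first 2-5 digit token and treat it as the
  // house number. The ZIP itself matches this pattern, so a match always
  // exists; if nothing earlier matches, only the ZIP is returned.
  const houseIndex = upToZip.findIndex((p) => /^\d{2,5}$/.test(p));

  return upToZip.slice(houseIndex).join(" ");
}

console.log(strToAddress("john doe is nice guy btw 8240 E. Marblehead Way 92808  is also"));
// 8240 E. Marblehead Way 92808
```

Like the PHP version, it takes the first house-number-like token in the text, so an unrelated 2-5 digit number appearing before the address would be picked up instead.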

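  • WOW! Thank you for the reply! So thorough! So descriptive! That's GOOD! (By the way, I awarded you the 100-point bounty, so your reputation is now +100 :)) I also marked your answer as correct and upvoted it. It works in every test, whether or not the string contains other numbers! :) (4 upvotes)
  • Thanks again for the great response! (2 upvotes)
  • I'm glad this worked out! You're welcome! (2 upvotes)
  • No, it's just your genius! I couldn't find a script like this anywhere! (2 upvotes)
  • Sometimes the best solution is the simplest one. :) (2 upvotes)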