How to split text into an array of URLs and space-separated phrases?

Question

I want to split text based on the URLs it contains.

So text like this

const text = 'hello world, testing /sf/ this is prefix https://gmail.com final text'

should give

const result = [
    'hello world, testing',
    '/sf/',
    'this is prefix',
    'https://gmail.com',
    'final text'
]

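Basically, any URL should split the text, but the URL itself should also be included in the result.

I did try a few approaches, but I couldn't come up with an algorithm for this.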

/(http|https):\/\/[a-zA-Z0-9\-.]+\.[a-zA-Z]{2,3}(\/\S*)?/

I did try splitting with this regex, but the results are inconsistent.

Answer 1

anu*_*ava 5

You can use .split with this regex, which contains a capturing group:

\s*(https?:\/\/\S+)\s*

RegEx Demo

Code:

const text = 'hello world, testing https://stackoverflow.com/questions/ask this is prefix https://gmail.com final text';

var arr = text.trim().split(/\s*(https?:\/\/\S+)\s*/);

console.log(arr);

/*
['hello world, testing',
'https://stackoverflow.com/questions/ask',
'this is prefix',
'https://gmail.com',
'final text']
*/
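
Note that the sample text in the question also expects '/sf/' to come out as its own element, which a pattern limited to http:// and https:// will not match. The following is only a minimal sketch of one way to cover that case. It assumes that a "URL" is either an http(s) link or a whitespace-delimited token that starts with a slash; that definition, the helper name splitByUrls, and the use of a lookbehind assertion (ES2018) are assumptions, not part of the original answer. It also drops the empty strings that split produces when the text starts or ends with a URL.

```javascript
// Hypothetical helper, not from the original answer.
// Assumption: a "URL" is either http(s)://... or a token that begins with "/"
// at the start of the text or right after whitespace.
function splitByUrls(text) {
  // (?<!\S) is a lookbehind: the "/" must not be preceded by a non-space
  // character, so the slash in something like "and/or" does not split.
  const urlPattern = /\s*(https?:\/\/\S+|(?<!\S)\/\S*)\s*/;

  // filter(Boolean) removes the empty strings that split() returns
  // when the text begins or ends with a URL.
  return text.trim().split(urlPattern).filter(Boolean);
}

const sampleText =
  'hello world, testing /sf/ this is prefix https://gmail.com final text';

console.log(splitByUrls(sampleText));
// ['hello world, testing', '/sf/', 'this is prefix', 'https://gmail.com', 'final text']
```

If relative paths should not count as URLs, removing the second alternative leaves the same http(s) matching as the answer's pattern, with the empty-string filtering added.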

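RegEx breakdown:

\s*: matches 0 or more whitespace characters
(https?:\/\/\S+): matches any URL that starts with http:// or https:// followed by 1 or more non-whitespace characters. This is captured in group 1 so that the URL appears in the result array.
\s*: matches 0 or more whitespace characters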
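
To make the role of the capturing group concrete, here is a small comparison. The sample string and variable names are made up for illustration. It also shows one likely reason the regex from the question behaved inconsistently: it contains two capturing groups, so every match adds two extra entries to the result, and the second one is undefined whenever the URL has no path.

```javascript
const sample = 'see https://gmail.com now';

// No capturing group: the matched URL is just a separator and is dropped.
const withoutGroup = sample.split(/\s*https?:\/\/\S+\s*/);
console.log(withoutGroup); // ['see', 'now']

// One capturing group: the captured URL is inserted into the result.
const withGroup = sample.split(/\s*(https?:\/\/\S+)\s*/);
console.log(withGroup); // ['see', 'https://gmail.com', 'now']

// The regex from the question has two capturing groups: the protocol
// and the optional path. Both are inserted for every match.
const questionRegex = /(http|https):\/\/[a-zA-Z0-9\-.]+\.[a-zA-Z]{2,3}(\/\S*)?/;
const fromQuestion = sample.split(questionRegex);
console.log(fromQuestion); // ['see ', 'https', undefined, ' now']
```

Wrapping the whole URL in a single capturing group, and making any inner groups non-capturing with (?:...), keeps exactly one extra entry per match.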

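This is hilarious: I've been doing web development for over 8 years, and in that time I've gone from a complete newbie to a fairly experienced developer, but I never knew about [using a capturing group to include the separators in the result array](https://i.stack.imgur.com/OQQ09.png). Cool trick! (3 upvotes)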

Archived:	3 years ago
Viewed:	105 times
Last activity:	3 years ago