Tag: regex-lookarounds

Regex negative lookahead

I need to modify this regex

href=\"(.*)\"

which matches...

href="./pothole_locator_map.aspx?lang=en-gb&lat=53.153977&lng=-3.533306"

so that it does not match this...

href="./pothole_locator_map.aspx?lang=en-gb&lat=53.153977&lng=-3.533306&returnurl=AbandonedVehicles.aspx"

I tried this, but with no luck

href=\"(.*)\"(?!&returnurl=AbandonedVehicles.aspx)

Any help would be much appreciated.

Thanks, Al.
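A possible reason the attempt above fails: the lookahead sits after the closing quote, and the text right after that quote is never `&returnurl=...`, so the negative lookahead always succeeds. A minimal sketch in Python (the asker did not name a regex flavor, so Python is an assumption here) moves the lookahead to just after the opening quote and scans up to the closing quote:

```python
import re

# Right after the opening quote, refuse to match if the returnurl parameter
# appears anywhere before the closing quote.
pattern = re.compile(r'href="(?![^"]*&returnurl=AbandonedVehicles\.aspx)([^"]*)"')

wanted = 'href="./pothole_locator_map.aspx?lang=en-gb&lat=53.153977&lng=-3.533306"'
unwanted = ('href="./pothole_locator_map.aspx?lang=en-gb&lat=53.153977'
            '&lng=-3.533306&returnurl=AbandonedVehicles.aspx"')

wanted_match = pattern.search(wanted)      # a match object
unwanted_match = pattern.search(unwanted)  # None
```

Using `[^"]*` instead of `.*` also keeps the match from running past the closing quote when a line holds several attributes.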

regex negative-lookahead regex-lookarounds

gen*_*ion

2012 08-29

5
votes

1
answers

498
views

Regex - confusing lookaround behavior

If I write

(?<=\()\w+(?=\))

for this string: (Test)(Test2)(Test3)

I get: Test Test2 Test3

That makes sense.

If I write

\w+ (?<=\()\w+(?=\))

for this string: LTE(Test)

nothing is returned. What is the problem?

Please explain your regex clearly, because it is hard to read.
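A lookbehind checks the character immediately before the current position and consumes nothing. After a leading `\w+` (with or without a following space), the previous character is a word character or a space, never `(`, so `(?<=\()` cannot succeed. The sketch below uses Python as an assumed flavor and shows one possible fix that consumes the parentheses instead of asserting them:

```python
import re

# The pure lookaround pattern works because each word really is preceded by "(".
all_words = re.findall(r'(?<=\()\w+(?=\))', '(Test)(Test2)(Test3)')

# With a leading \w+ the lookbehind can never see "(", whether or not the
# pattern contains the space shown in the question.
with_space = re.search(r'\w+ (?<=\()\w+(?=\))', 'LTE(Test)')
without_space = re.search(r'\w+(?<=\()\w+(?=\))', 'LTE(Test)')

# One possible fix: match the parentheses literally and capture the inside.
fixed = re.search(r'\w+\((\w+)\)', 'LTE(Test)')
inner = fixed.group(1)
```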

regex regex-lookarounds

ham*_*obi

2013 08-15

5
votes

1
answers

105
views

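Extracting and counting common word pairs from a character vector

How can I find frequent adjacent word pairs in a character vector? For example, using the crude data set, some common pairs are "crude oil", "oil market", and "million barrels".

The code for the small example below tries to identify frequent terms and then, using a positive lookahead assertion, count how many times those frequent terms are immediately followed by another frequent term. But the attempt crashed and burned.

Any guidance would be appreciated on how to create a data frame that shows the common pairs in the first column ("Pairs") and the number of times they appear in the text in the second column ("Count").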

library(qdap)
library(tm)
library(stringr) # provides str_replace_all

# from the crude data set, create a text file from the first three documents, then clean it
data("crude")

text <- c(crude[[1]][1], crude[[2]][1], crude[[3]][1])
text <- tolower(text)
text <- tm::removeNumbers(text)
text <- str_replace_all(text, "  ", " ") # replace double spaces with single space
text <- str_replace_all(text, pattern = "[[:punct:]]", " ")
text <- removeWords(text, stopwords(kind = "SMART"))

# pick the top 10 individual words by frequency, since they will likely form the most common pairs
freq.terms …

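The lookahead idea can be illustrated independently of the R packages above. This is a rough sketch in Python, not the asker's method: wrapping the whole pair in a lookahead makes every match zero-width, so overlapping pairs such as "crude oil" and "oil prices" are both found. The sample sentence is made up for illustration.

```python
import re
from collections import Counter

# Hypothetical sample text; the real input would be the cleaned documents.
text = "crude oil prices rose as the oil market tightened and crude oil stocks fell"

# Each match is zero-width, so the scan restarts at the next word and the
# second word of one pair can be the first word of the next.
pairs = re.findall(r'(?=\b(\w+) (\w+)\b)', text)

# Count each adjacent pair; most_common gives (pair, count) rows that map
# naturally onto a two-column "Pairs" / "Count" table.
counts = Counter(' '.join(pair) for pair in pairs)
top = counts.most_common(1)
```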

r tm regex-lookarounds qdap

law*_*yeR

2017 05-23

5
votes

1
answers

2504
views

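How to detect whitespace that is not inside single or double quotes

I'm trying to create a Java regex that replaces every run of whitespace in a string with a single space, unless the whitespace occurs between quotes (single or double).

If I were only looking for double quotes, I could use a lookahead: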

text.replaceAll("\\s+ (?=(?:[^\"]*\"[^\"]*\")*[^\"]*$)", " ");

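If I were only looking for single quotes, I could use a similar pattern.

The trick is finding both.

I had the idea of running the double-quote pattern followed by the single-quote pattern, but of course that ends up replacing all whitespace regardless of quotes.

So here are some tests and expected results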

a   b   c    d   e   -->  a b c d e
a   b   "c    d"   e -->  a b "c    d" e
a   b   'c    d'   e -->  a b 'c    d' e
a   b   "c    d'   e -->  a b "c d' e    (Can't mix and match quotes)

Is there a way to achieve this with a Java regex?

Assume invalid input has already been validated separately, so none of the following will occur:

a "b c ' d
a 'b " c' d
a 'b c d

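One way to handle both quote styles is to drop the lookahead and use an alternation instead: match a complete quoted string first and put it back unchanged, otherwise match a whitespace run and replace it. The sketch below is in Python rather than Java, and the names `TOKEN` and `collapse_spaces` are made up for illustration; the same pattern should carry over to Java's `Matcher` with a replacement callback.

```python
import re

# Group 1: a complete double- or single-quoted string (kept as-is).
# Otherwise: a run of whitespace (collapsed to one space).
TOKEN = re.compile(r'''("[^"]*"|'[^']*')|\s+''')

def collapse_spaces(text):
    # group(1) is None when the whitespace branch matched.
    return TOKEN.sub(lambda m: m.group(1) or ' ', text)

plain = collapse_spaces("a   b   c    d   e")
double_quoted = collapse_spaces('a   b   "c    d"   e')
single_quoted = collapse_spaces("a   b   'c    d'   e")
mixed = collapse_spaces("a   b   \"c    d'   e")
```

Because the quoted-string branch is tried first at each position, whitespace inside a properly closed pair of quotes is consumed as part of the quoted string and never reaches the `\s+` branch.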

java regex regex-negation regex-lookarounds

Vic*_*azi

2015 12-24

5
votes

1
answers

162
views

How to capture the two lookarounds in a Python regex

Here is a string:

str = "Academy \nADDITIONAL\nAwards and Recognition: Greek Man of the Year 2011 Stanford PanHellenic Community, American Delegate 2010 Global\nEngagement Summit, Honorary Speaker 2010 SELA Convention, Semi-Finalist 2010 Strauss Foundation Scholarship Program\nComputer Skills: Competency: MATLAB, MySQL/PHP, JavaScript, Objective-C, Git Proficiency: Adobe Creative Suite, Excel\n(highly advanced), PowerPoint, HTML5/CSS3\nLanguages: Fluent English, Advanced Spanish\n\x0c"

I want to capture everything from "ADDITIONAL" to "Languages", so I wrote this regex:

regex = r'(?<=\n(ADDITIONAL|Additional)\n)[\s\S]+?(?=\n(Languages|LANGUAGES)\n*)'

However, it only captures everything in between ([\s\S]+). It does not capture ADDITIONAL and Languages. What am I missing here?
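Lookarounds assert that text is present but never add it to the match, which would explain the result described above. A minimal sketch, using a shortened, made-up sample string, compares the lookaround version with one that consumes the delimiters through ordinary non-capturing groups:

```python
import re

# Shortened stand-in for the resume text in the question.
sample = "Academy \nADDITIONAL\nAwards: X\nComputer Skills: Y\nLanguages: Fluent English\n"

# Lookaround version: the delimiters are checked but not part of the match.
between = re.search(
    r'(?<=\n(ADDITIONAL|Additional)\n)[\s\S]+?(?=\n(Languages|LANGUAGES)\n*)',
    sample,
).group(0)

# Consuming version: the delimiters are matched, so they appear in the result.
inclusive = re.search(
    r'(?:ADDITIONAL|Additional)\n[\s\S]+?\n(?:Languages|LANGUAGES)',
    sample,
).group(0)
```

The lookbehind compiles in Python only because both alternatives have the same length; Python's `re` module requires fixed-width lookbehind.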

python regex regex-lookarounds

Aer*_*rin

2016 04-26

5
votes

1
answers

990
views

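Matching unescaped quotes in a quoted CSV

I have seen several Stack Overflow posts with similar titles, but none of the accepted answers worked for me.

I have a CSV file in which every "cell" of data is comma-separated and quoted (numbers included). Each line ends with a newline.

Some of the text "cells" contain quotes, and I want to use a regex to find them so that I can escape them properly.

Example line: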

"0","0.23432","234.232342","data here dsfsd hfsdf","3/1/2016",,"etc","E 60"","AD"8"\n

I want to match just the " in E 60" and in AD"8, and none of the other " characters.

What regex (preferably Python-friendly) can I use to do this?
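One possible approach, sketched here under the assumption that a stray quote is never directly next to a comma or a line boundary: a delimiter quote always touches a comma, the start of a line, or the end of a line, so any quote that touches none of those is treated as unescaped. The name `STRAY_QUOTE` is illustrative.

```python
import re

# A quote that is not at the start of a line, not right after a comma,
# and not right before a comma or the end of a line.
STRAY_QUOTE = re.compile(r'(?<!^)(?<!,)"(?!,|$)', re.MULTILINE)

line = '"0","0.23432","234.232342","data here dsfsd hfsdf","3/1/2016",,"etc","E 60"","AD"8"\n'

stray_positions = [m.start() for m in STRAY_QUOTE.finditer(line)]

# Escape in the usual CSV style by doubling the quote.
escaped = STRAY_QUOTE.sub('""', line)
```

The heuristic cannot distinguish a stray quote that happens to sit directly before a comma inside a cell from a real closing quote, so data of that kind would need a different strategy.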

python regex csv regex-lookarounds

sun*_*nce

2017 04-26

5
votes

1
answers

685
views

Regex negative lookahead to match Markdown links

We are stuck on a regex problem.

Here is the problem. Consider the following two patterns:

1) [hello] [world]

2) [hello [world]]

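We need to write a regex that matches only [world] in the first pattern and the whole pattern ([hello [world]]) in the second.

Using a negative lookahead, I wrote the following regex, which solves part of the problem: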

\[[^\[\]]+\](?!.*\[[^\[\]]+\])

This regex matches pattern 1) as we want, but it does not work for pattern 2).
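A sketch of one way to cover both patterns, assuming brackets nest at most one level deep: let the body of the outer brackets contain either non-bracket characters or one complete inner `[...]`, and keep the negative lookahead so that only the last bracket group on the line matches. It is written in Python for illustration; .NET also offers balancing groups for arbitrary nesting depth, which this sketch does not attempt.

```python
import re

# Outer brackets whose body is made of non-bracket characters or complete
# inner [..] groups (one nesting level only), with no "[" later on the line.
LINK = re.compile(r'\[(?:[^\[\]]|\[[^\[\]]*\])*\](?!.*\[)')

first = LINK.search('[hello] [world]').group(0)
second = LINK.search('[hello [world]]').group(0)
```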

.net regex regex-lookarounds

Enr*_*one

2017 10-20

5
votes

1
answers

144
views

Python regex lookbehind and lookahead

I need to match the string "foo" from a string with the following format:

string = "/foo/boo/poo"

I tried this code:

poo = "poo"
foo = re.match('.*(?=/' + re.escape(poo) + ')', string).group(0)

It gives me /foo/boo as the content of the foo variable (instead of just foo/boo).

I also tried this code:

poo = "poo"
foo = re.match('(?=/).*(?=/' + re.escape(poo) + ')', string).group(0)

I get the same output (/foo/boo instead of foo/boo).

How do I match only the foo/boo part?
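In the second attempt, `(?=/)` is a lookahead: it checks that the next character is `/` without consuming it, so `.*` still starts at that slash. A lookbehind `(?<=/)` asserts the slash before the match instead. Since the match then has to start at index 1, `re.search` is needed, because `re.match` only tries index 0. A minimal sketch:

```python
import re

string = "/foo/boo/poo"
poo = "poo"

# Lookahead at the start: the leading slash is still part of the match.
with_lookahead = re.match('(?=/).*(?=/' + re.escape(poo) + ')', string).group(0)

# Lookbehind at the start: the slash is required but excluded from the match.
foo = re.search('(?<=/).*(?=/' + re.escape(poo) + ')', string).group(0)
```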

regex lookbehind regex-lookarounds

Joh*_*lis

lucky-day

5
votes

2
answers

5462
views

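Regex to match words built only from a list of letters

Given a set of words, I need to know which words are formed only from a given set of letters. A word cannot use a letter more times than allowed, even if that letter is part of the validation set.

Example: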

Char set: a, a, ã, c, e, l, m, m, m, o, o, o, o, t (fixed set)

Words set: mom, ace, to, toooo, ten, all, aaa (variable set)

Result:

mom = true
ace = true
to = true
toooo = true
ten = false (n is not in the set)
all = false (there is only 1 L in the set)
aaa = false (there are only 2 A in the set)

How can I generate this regex in JavaScript? (Case sensitivity is not an issue.)

I tried the following code, without success:

var str = …

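One way to express the per-letter limits is a chain of negative lookaheads, one per distinct letter, each rejecting words that contain that letter one time too many, followed by a character class of the allowed letters. The sketch is in Python rather than JavaScript, and `build_pattern` is a made-up helper name; the generated pattern uses only syntax that JavaScript regexes also support.

```python
import re
from collections import Counter

def build_pattern(letters):
    """Build a regex accepting words that respect each letter's count."""
    counts = Counter(letters)
    # For a letter allowed n times, forbid n + 1 occurrences anywhere.
    limits = ''.join(
        '(?!(?:.*%s){%d})' % (re.escape(ch), n + 1) for ch, n in counts.items()
    )
    allowed = ''.join(re.escape(ch) for ch in counts)
    return re.compile('^' + limits + '[' + allowed + ']+$', re.IGNORECASE)

char_set = ['a', 'a', 'ã', 'c', 'e', 'l', 'm', 'm', 'm', 'o', 'o', 'o', 'o', 't']
words = ['mom', 'ace', 'to', 'toooo', 'ten', 'all', 'aaa']

pattern = build_pattern(char_set)
results = {word: bool(pattern.match(word)) for word in words}
```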

javascript regex regex-group regex-greedy regex-lookarounds

Edu*_*tel

2019 05-08

5
votes

1
answers

113
views

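Regex to match dates (Month Day, Year or m/d/yy)

I'm trying to write a regex that can be used to find dates in a string, where the date may be preceded (or followed) by spaces, numbers, text, end of line, etc. The expression should handle US date formats, either

1) Month Name Day, Year - i.e. January 10, 2019 or
2) mm/dd/yy - i.e. 11/30/19

I found this for Month Name Day, Year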

(Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4}

(Thanks to Veverke here: Regex to match date like month name day comma and year)

and this for mm/dd/yy (and various combinations of m/d/y)

(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2}

(Thanks here to Steven Levithan and Jan Goyvaerts https://www.oreilly.com/library/view/regular-expressions-cookbook/9781449327453/ch04s04.html)

I tried to combine them like this

((Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4})|((1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2})

When I search for "on [regex above]" in the input string "Paid on 1/1/2019", it does find the date, but not the word "on". The whole string is found if I just use

(1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2}

Can anyone see what I'm doing wrong?

Edit

I'm using the C# .NET code below:

    string stringToSearch = "Paid on 1/1/2019";
    string searchPattern = @"on ((Jan(uary)?|Feb(ruary)?|Mar(ch)?|Apr(il)?|May|Jun(e)?|Jul(y)?|Aug(ust)?|Sep(tember)?|Oct(ober)?|Nov(ember)?|Dec(ember)?)\s+\d{1,2},\s+\d{4})|((1[0-2]|0?[1-9])/(3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2})";
    var match = Regex.Match(stringToSearch, searchPattern, RegexOptions.IgnoreCase);


    string foundString;
    if (match.Success)
        foundString= stringToSearch.Substring(match.Index, match.Length);

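A likely cause is alternation precedence: `|` binds more loosely than concatenation, so `on (A)|(B)` means "on followed by A" or "B alone", and the numeric date matches without "on". Wrapping the two alternatives in one group fixes that. The sketch below reproduces the effect in Python (the same rule applies in .NET); the constant names are illustrative, and the inner groups are rewritten as non-capturing.

```python
import re

MONTH_DATE = (
    r'(?:Jan(?:uary)?|Feb(?:ruary)?|Mar(?:ch)?|Apr(?:il)?|May|Jun(?:e)?|Jul(?:y)?'
    r'|Aug(?:ust)?|Sep(?:tember)?|Oct(?:ober)?|Nov(?:ember)?|Dec(?:ember)?)'
    r'\s+\d{1,2},\s+\d{4}'
)
NUMERIC_DATE = r'(?:1[0-2]|0?[1-9])/(?:3[01]|[12][0-9]|0?[1-9])/(?:[0-9]{2})?[0-9]{2}'

text = 'Paid on 1/1/2019'

# "on " applies only to the first alternative here.
ungrouped = re.search('on ' + MONTH_DATE + '|' + NUMERIC_DATE, text, re.IGNORECASE)

# Grouping the alternatives makes "on " apply to both.
grouped_pattern = 'on (?:' + MONTH_DATE + '|' + NUMERIC_DATE + ')'
grouped = re.search(grouped_pattern, text, re.IGNORECASE)
named = re.search(grouped_pattern, 'Paid on January 10, 2019', re.IGNORECASE)
```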

For example