Related questions and solutions (0)

How do I match any character across multiple lines in a regular expression?

For example, this regular expression

(.*)<FooBar>

will match:

abcde<FooBar>

But how do I get it to match across multiple lines?

abcde
fghij<FooBar>
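
A minimal sketch of one common approach, assuming Python's `re` module (the question itself does not name a language). By default `.` does not match a newline; the `re.DOTALL` flag lifts that restriction. The variable names are illustrative only.

```python
import re

text = "abcde\nfghij<FooBar>"

# Without DOTALL, "." stops at the newline, so only the last line is captured.
single_line = re.search(r"(.*)<FooBar>", text)

# With DOTALL, "." also matches "\n", so the capture spans both lines.
multi_line = re.search(r"(.*)<FooBar>", text, re.DOTALL)

print(repr(single_line.group(1)))
print(repr(multi_line.group(1)))
```

Other regex flavors spell this differently, for example with an inline `(?s)` flag or a `[\s\S]` character class, so the flag name here is specific to Python.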

regex multiline

and*_*yuk

2015-04-20

315 votes

15 answers

530k views

Find a string between two substrings

How do I find a string between two substrings ('123STRINGabc' -> 'STRING')?

My current approach looks like this:

>>> start = 'asdf=5;'
>>> end = '123jasd'
>>> s = 'asdf=5;iwantthis123jasd'
>>> print((s.split(start))[1].split(end)[0])
iwantthis

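However, this seems very inefficient and un-Pythonic. What is a better way to do something like this?

Forgot to mention: the string might not start with start and end with end. There may be more characters before and after them.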
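
One possible alternative, sketched with `str.partition`; the helper name `between` is hypothetical and not from the question. It tolerates extra characters before `start` and after `end`, and returns `None` when either marker is missing instead of raising `IndexError`.

```python
def between(s, start, end):
    """Return the text between the first `start` and the next `end`, or None."""
    _, found_start, rest = s.partition(start)
    if not found_start:
        return None
    middle, found_end, _ = rest.partition(end)
    return middle if found_end else None


# Extra characters on both sides, as described in the question.
result = between("xxasdf=5;iwantthis123jasdyy", "asdf=5;", "123jasd")
print(result)
```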

python string substring

Joh*_*ard

2010-07-30

206 votes

11 answers

330k views

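Python regex - r prefix

Can anyone explain why example 1 below works even though the r prefix is not used? I thought the r prefix had to be used whenever escape sequences are used. Examples 2 and 3 demonstrate this.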

# example 1
import re
print (re.sub('\s+', ' ', 'hello     there      there'))
# prints 'hello there there' - not expected as r prefix is not used

# example 2
import re
print (re.sub(r'(\b\w+)(\s+\1\b)+', r'\1', 'hello     there      there'))
# prints 'hello     there' - as expected as r prefix is used

# example 3
import re
print (re.sub('(\b\w+)(\s+\1\b)+', '\1', 'hello     there      there'))
# prints 'hello     there      there' - as expected as r prefix is not used
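
A short sketch of what seems to be going on, offered as an illustration rather than a definitive answer. `\b` and `\1` are escape sequences that Python itself interprets in an ordinary string literal, while `\s` is not, so its backslash survives and reaches the regex engine unchanged (recent Python versions emit a warning for such unrecognized escapes). The variable names below are illustrative.

```python
import re

# "\b" is a recognized string escape, so without r it is one backspace character.
plain_b, raw_b = "\b", r"\b"
print(len(plain_b), len(raw_b))

# "\1" is an octal escape, so without r it is the single character chr(1).
plain_one, raw_one = "\1", r"\1"
print(plain_one == chr(1), raw_one == "\\1")

# A raw string and a doubled backslash spell the same two-character pattern.
print(r"\s+" == "\\s+")

# With raw strings, the regex engine receives \b, \w, \s and \1 intact.
collapsed = re.sub(r"(\b\w+)(\s+\1\b)+", r"\1", "hello     there      there")
print(repr(collapsed))
```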

python regex string literals prefix

JT.*_*JT.

2018-12-19

69 votes

3 answers

70k views

Which is the faster operation, re.match/search or str.find?

For a one-off string search, is it faster to use str.find/rfind than re.match/search?

That is, for a given string s, should I use:

if s.find('lookforme') > -1:
    pass  # do something

or

import re
if re.match('lookforme', s):
    pass  # do something else

?
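
A rough way to check this yourself, sketched with `timeit`. The test string is made up, and absolute timings depend on the machine and Python version, so no numbers are claimed here. Note that the two snippets above are not equivalent: `re.match` only matches at the start of the string, whereas `str.find` and `re.search` scan the whole string.

```python
import re
import timeit

s = "x" * 1000 + "lookforme"

# For a plain substring test, the `in` operator is the idiomatic spelling.
found_in = "lookforme" in s
found_find = s.find("lookforme") > -1
# re.match anchors at position 0; re.search looks anywhere in the string.
found_match = re.match("lookforme", s) is not None
found_search = re.search("lookforme", s) is not None
print(found_in, found_find, found_match, found_search)

# Time each approach; compare the relative values on your own machine.
candidates = [
    ("in", lambda: "lookforme" in s),
    ("str.find", lambda: s.find("lookforme") > -1),
    ("re.search", lambda: re.search("lookforme", s)),
]
for label, stmt in candidates:
    print(label, timeit.timeit(stmt, number=10000))
```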

python performance

Mik*_*ron

60 votes

6 answers

40k views

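Get the username field in Facebook Graph API 2.0

The "old" Facebook Graph API had a "username" field that could be used to create a human-readable profile URL. My username, for example, is "sebastian.trug", which yields the Facebook profile URL http://www.facebook.com/sebastian.trug.

With Graph API 2.0, Facebook has removed the "username" field from the user data retrieved from "/me".

Is there any way to get this data through the 2.0 API, or is "username" now considered a deprecated field?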

facebook facebook-graph-api

tru*_*ueg

57 votes

5 answers

60k views

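Extract data using a Python regular expression

I'm having trouble with Python regular expressions and cannot come up with a pattern that extracts a specific value.

The page I'm trying to parse has many productIds, which appear in the following format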

\"productId\":\"111111\"

I need to extract all of these values, 111111 in this case.
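
A minimal sketch using `re.findall`. The `page` string is a made-up fragment in the escaped format shown above, and the pattern assumes that product IDs consist only of digits, which the question does not confirm.

```python
import re

# Hypothetical page fragment in the escaped format shown above.
page = r'{\"productId\":\"111111\",\"name\":\"x\"},{\"productId\":\"222222\"}'

# \\? makes each literal backslash optional, so unescaped JSON matches too.
# Assumption: product IDs consist only of digits.
pattern = re.compile(r'\\?"productId\\?":\\?"(\d+)\\?"')

# findall returns the contents of the single capture group for every match.
product_ids = pattern.findall(page)
print(product_ids)
```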

python regex parsing

gre*_*fox

2015-11-18

12 votes

2 answers

60k views

Python - most elegant way to extract a substring, given left and right borders

I have a string in Python:

string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"

The expected output is:

"Atlantis-GPS-coordinates"

I know the expected output is always surrounded by "/bar/" on the left and "/" on the right:

"/bar/Atlantis-GPS-coordinates/"

A suggested solution looks like this:

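a = string.find("/bar/")       # index where "/bar/" starts
b = string.find("/", a + 5)    # next "/" after the 5-character "/bar/"
output = string[a + 5:b]       # slice with ":" (a "," here raises TypeError)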

This works, but I don't like it. Does anyone know a nicer function or trick?
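
Two possible alternatives, sketched under the stated assumption that the target is always enclosed by "/bar/" on the left and "/" on the right. Neither is presented as the accepted answer; which one reads better is a matter of taste.

```python
import re

string = "/foo13546897/bar/Atlantis-GPS-coordinates/bar457822368/foo/"

# Option 1: a regex that captures everything up to the next "/".
match = re.search(r"/bar/([^/]+)/", string)
via_regex = match.group(1) if match else None

# Option 2: split once on the left border, then once on the right border.
via_split = string.split("/bar/", 1)[1].split("/", 1)[0]

print(via_regex)
print(via_split)
```

The regex variant yields `None` when the left border is missing, while the split variant raises `IndexError` in that case.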

python string find

Vin*_*ent

9 votes

1 answer

20k views

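Extract data from a JavaScript var inside <script> using Python

I'm new to Python, BeautifulSoup, and the rest, but I want to extract JSON data that sits inside a JavaScript variable in a "script" tag on a website.

Here is my code so far: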

import re
from bs4 import BeautifulSoup
import json
import requests
url = 'myUrl'
page = requests.get(url).content
soup = BeautifulSoup(page, "html.parser")
pattern = re.compile(r"var hours = .")
script = soup.find("script",text=pattern)
print(script)

Right now I can extract the data in the following format:

<script>
var hours = [{...dataIwant...}];
</script>

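But I only want the data, without "script" or "var hours =". I want to convert it to JSON and feed it into Apache NiFi.

I have tried almost everything I found here and on Google. But most of the time, when I try to extract the variable and convert it to JSON, I get "None" or some other error.

So if you have any tips to help me get the data in JSON format, that would be great!

Thanks!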
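
A hedged sketch of one way to go from the script text to Python data, using only `re` and `json`. The `script_text` value is a made-up stand-in for the text of the tag found above (with BeautifulSoup that would typically be `script.string`). It assumes the JavaScript array literal is also valid JSON and that "];" does not occur inside the data.

```python
import json
import re

# Hypothetical stand-in for the text content of the matched <script> tag.
script_text = """
var hours = [{"day": "Mon", "open": "09:00"}];
"""

# Capture the array literal between "var hours = " and the closing ";".
# DOTALL lets the literal span several lines.
match = re.search(r"var hours = (\[.*?\]);", script_text, re.DOTALL)

# json.loads only works if the captured JavaScript literal is valid JSON.
hours = json.loads(match.group(1)) if match else None
print(hours)
```

If `match` is `None`, the pattern did not find the variable, which may explain a "None" result; `json.dumps(hours)` would turn the data back into a JSON string for a downstream system.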

javascript python json apache-nifi

scu*_*-gm

2017-11-28

5 votes

1 answer

3041 views

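Python function to find a string between two markers

I'm looking for a string function that extracts the content between two markers. It returns a list of extractions.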

def extract(raw_string, start_marker, end_marker):
    ... function ...
    return extraction_list

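I know this can be done with regular expressions, but is that fast? This will be called billions of times in my process. What is the fastest way to do it?

What happens if the markers are the same and appear an odd number of times?

If the start and end markers appear multiple times, the function should return multiple strings.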
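
A sketch of one possible body for `extract`, built on `str.find` instead of regular expressions. Whether it is actually faster than a compiled regex is not claimed here and would need measuring. One design choice, which is an assumption rather than something the question specifies: a start marker without a matching end marker is ignored, which also settles the case of identical markers occurring an odd number of times.

```python
def extract(raw_string, start_marker, end_marker):
    """Collect every substring enclosed by start_marker ... end_marker."""
    if not start_marker or not end_marker:
        raise ValueError("markers must be non-empty")
    extraction_list = []
    pos = 0
    while True:
        start = raw_string.find(start_marker, pos)
        if start == -1:
            break
        start += len(start_marker)
        end = raw_string.find(end_marker, start)
        if end == -1:
            break  # unmatched start marker: ignore the trailing remainder
        extraction_list.append(raw_string[start:end])
        pos = end + len(end_marker)
    return extraction_list


print(extract("a[1]b[2]c", "[", "]"))
print(extract("|a|b|", "|", "|"))
```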

python regex string

Mat*_*ock

2011-10-13

3 votes

1 answer

4243 views

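How do I extract multiple substrings from a string in Python?

I'm referring to the question "How to extract a substring from a string in Python?" and have a follow-up question.

What if my string looks like this: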

gfgfdAAA1234ZZZsddgAAA4567ZZZuijjk

I want to extract 1234 and 4567. Can they be stored as a list?
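
A minimal sketch with `re.findall`, which returns every non-overlapping match as a list. It assumes the values always sit between the literal markers AAA and ZZZ, as in the sample string.

```python
import re

s = "gfgfdAAA1234ZZZsddgAAA4567ZZZuijjk"

# The non-greedy (.*?) stops at the first ZZZ after each AAA.
values = re.findall(r"AAA(.*?)ZZZ", s)
print(values)
```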

python regex

lok*_*art

2017-05-23

3 votes

1 answer

3985 views

How to slice a string starting from a specific string in Python

OK, so let's say I have

s = 'ABC Here DEF GHI toHere JKL'

and I want to get a new string containing only the text between Here and toHere

new_str = 'DEF GHI'

(I don't know how many or which characters come before Here or anywhere else.) I only know that Here and toHere are in the string. How can I get new_str?
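
A minimal sketch using `str.index` and slicing. Because "Here" is also a substring of "toHere", the search for the end marker starts after the first "Here". Stripping the surrounding spaces is an assumption based on the expected value of new_str.

```python
s = 'ABC Here DEF GHI toHere JKL'

# Position just past the first "Here".
start = s.index('Here') + len('Here')
# Search for "toHere" only after that position.
end = s.index('toHere', start)

# Slice between the markers and drop the surrounding spaces.
new_str = s[start:end].strip()
print(new_str)
```

`str.index` raises `ValueError` if a marker is missing; `str.find` returns -1 instead and would need an explicit check.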

python

Dan*_*gal

2018-10-11

-1 votes

1 answer

154 views

Correct regex for finding bold/underlined strings (Python)

So I want to find two sets of criteria in a string. For example:

import re
bold_pattern = re.compile() #pattern for finding all words in between ** **
underline_pattern = re.compile() # pattern for finding all words in between __ __
a = "__Hello__ **This** __is__ **Lego**"

How would I do this with a regular expression?
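
One way the two empty `re.compile()` calls could be filled in, offered as a sketch. It assumes the delimiters are always paired and never nested, which the question does not state.

```python
import re

# Non-greedy capture between a pair of literal "**" (asterisks must be escaped).
bold_pattern = re.compile(r"\*\*(.+?)\*\*")
# Non-greedy capture between a pair of literal "__".
underline_pattern = re.compile(r"__(.+?)__")

a = "__Hello__ **This** __is__ **Lego**"

bold_words = bold_pattern.findall(a)
underlined_words = underline_pattern.findall(a)
print(bold_words)
print(underlined_words)
```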

python regex

Leg*_*490

-2 votes

1 answer

1001 views