mie*_*nik 288 python string substring
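Let's say I have a string 'gfgfdAAA1234ZZZuijjk' and I want to extract just the '1234' part.
I only know the few characters directly before (AAA) and after (ZZZ) the part I am interested in (1234).
With sed it is possible to do something like this with a string: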
echo "$STRING" | sed -e "s|.*AAA\(.*\)ZZZ.*|\1|"
This gives me 1234 as a result.
How can I do the same thing in Python?
eum*_*iro 511
Use regular expressions (see the documentation for further reference):
import re
text = 'gfgfdAAA1234ZZZuijjk'
m = re.search('AAA(.+?)ZZZ', text)
if m:
    found = m.group(1)
# found: 1234
or:
import re
text = 'gfgfdAAA1234ZZZuijjk'
try:
    found = re.search('AAA(.+?)ZZZ', text).group(1)
except AttributeError:
    # AAA, ZZZ not found in the original string
    found = '' # apply your error handling
# found: 1234
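The same pattern can be wrapped in a small reusable helper. This is only a minimal sketch, not part of the original answer: the name `extract_between` and its `default` parameter are hypothetical, and `re.escape` is added on the assumption that the markers should be matched literally rather than as regex syntax.

```python
import re

def extract_between(text, left, right, default=None):
    """Return the text captured between left and right, or default if no match."""
    # re.escape keeps characters such as '[' or '.' in the markers literal.
    pattern = re.escape(left) + r'(.+?)' + re.escape(right)
    m = re.search(pattern, text)
    return m.group(1) if m else default

print(extract_between('gfgfdAAA1234ZZZuijjk', 'AAA', 'ZZZ'))  # 1234
print(extract_between('no markers here', 'AAA', 'ZZZ'))       # None
```

Returning a default instead of raising keeps the caller free to decide how a missing marker should be handled.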
Len*_*bro 105
>>> s = 'gfgfdAAA1234ZZZuijjk'
>>> start = s.find('AAA') + 3
>>> end = s.find('ZZZ', start)
>>> s[start:end]
'1234'
You can also use regexps with the re module if you want, but that is not necessary in your case.
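Note that `str.find` returns -1 when the marker is absent, so the slicing snippet above would silently start from the wrong position in that case. A minimal sketch of the same approach with explicit checks; the helper name `between` and the choice to return `None` are assumptions made for illustration.

```python
def between(s, left, right):
    """Slice out the text between left and the next right, or return None."""
    i = s.find(left)
    if i == -1:
        return None              # left marker missing
    start = i + len(left)
    end = s.find(right, start)   # search only after the left marker
    if end == -1:
        return None              # right marker missing after left
    return s[start:end]

print(between('gfgfdAAA1234ZZZuijjk', 'AAA', 'ZZZ'))  # 1234
```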
tzo*_*zot 52
import re
re.search(r"(?<=AAA).*?(?=ZZZ)", your_text).group(0)
The above, as is, fails with an AttributeError if "AAA" and "ZZZ" are not both present in your_text.
your_text.partition("AAA")[2].partition("ZZZ")[0]
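The above returns an empty string if "AAA" does not exist in your_text; if only "ZZZ" is missing, it returns everything after "AAA".
PS Python Challenge?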
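To make the behaviour of the `partition` chain concrete, here is a small sketch of the three cases. The wrapper name `via_partition` is hypothetical, and the inputs other than the string from the question are made up. Note in particular that a missing "ZZZ" alone does not produce an empty string.

```python
def via_partition(your_text):
    # Same expression as in the answer, wrapped in a function for testing.
    return your_text.partition("AAA")[2].partition("ZZZ")[0]

print(via_partition("gfgfdAAA1234ZZZuijjk"))  # 1234
print(repr(via_partition("gfgfd1234ZZZ")))    # '' because "AAA" is missing
print(via_partition("gfgfdAAA1234"))          # 1234: "ZZZ" is missing, the rest is kept
```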
inf*_*red 14
import re
print(re.search('AAA(.*?)ZZZ', 'gfgfdAAA1234ZZZuijjk').group(1))
小智 9
You can do it with just one line of code:
>>> import re
>>> re.findall(r'\d{1,5}','gfgfdAAA1234ZZZuijjk')
['1234']
The result is returned as a list...
Surprised that nobody has mentioned this; it is my quick version for one-off scripts:
>>> x = 'gfgfdAAA1234ZZZuijjk'
>>> x.split('AAA')[1].split('ZZZ')[0]
'1234'
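One caveat with the `split` version: if 'AAA' is absent, `split('AAA')` returns a one-element list and indexing `[1]` raises an IndexError. A hedged sketch that guards against this; the helper name `split_between` and the `None` return value are assumptions, not part of the answer.

```python
def split_between(x, left, right):
    """Text between the first left and the next right, or None if left is absent."""
    parts = x.split(left, 1)          # at most one split, on the first left
    if len(parts) < 2:
        return None                   # left never occurred
    # If right is absent, everything after left is returned unchanged.
    return parts[1].split(right, 1)[0]

print(split_between('gfgfdAAA1234ZZZuijjk', 'AAA', 'ZZZ'))  # 1234
print(split_between('gfgfd1234ZZZuijjk', 'AAA', 'ZZZ'))     # None
```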
text = 'I want to find a string between two substrings'
left = 'find a '
right = 'between two'
print(text[text.index(left)+len(left):text.index(right)])
gives
string
You can use the re module for that:
>>> import re
>>> re.compile(".*AAA(.*)ZZZ.*").match("gfgfdAAA1234ZZZuijjk").groups()
('1234',)
小智 6
>>> s = '/tmp/10508.constantstring'
>>> s.split('/tmp/')[1].split('constantstring')[0].strip('.')
In Python, you can extract a substring from a string using the findall method of the regular expression (re) module.
>>> import re
>>> s = 'gfgfdAAA1234ZZZuijjk'
>>> ss = re.findall('AAA(.+)ZZZ', s)
>>> print(ss)
['1234']
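When the string contains more than one marked section, the greedy `(.+)` and the non-greedy `(.+?)` give different results with `findall`. A short sketch on a made-up input (the string below is an assumption for illustration, not from the question):

```python
import re

s = 'AAA12ZZZ--AAA34ZZZ'                 # made-up input with two marked sections
greedy = re.findall('AAA(.+)ZZZ', s)     # .+ runs on to the last ZZZ
lazy = re.findall('AAA(.+?)ZZZ', s)      # .+? stops at the nearest ZZZ

print(greedy)  # ['12ZZZ--AAA34']
print(lazy)    # ['12', '34']
```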
With sed it is possible to do something like this with a string:
echo "$STRING" | sed -e "s|.*AAA\(.*\)ZZZ.*|\1|"
And this will give me 1234 as a result.
You can do the same with the re.sub function, using the same regex.
>>> re.sub(r'.*AAA(.*)ZZZ.*', r'\1', 'gfgfdAAA1234ZZZuijjk')
'1234'
In basic sed, a capturing group is written \(..\), but in Python it is written (..).
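One thing to keep in mind with the `re.sub` approach: when the pattern does not match, `re.sub` returns the input unchanged instead of signalling a failure. A small sketch; the second input string is made up for illustration.

```python
import re

pattern = r'.*AAA(.*)ZZZ.*'

hit = re.sub(pattern, r'\1', 'gfgfdAAA1234ZZZuijjk')
miss = re.sub(pattern, r'\1', 'no markers')   # no match, so nothing is replaced

print(hit)   # 1234
print(miss)  # no markers
```

If a missing marker has to be detected, checking the result of `re.search` first is one option.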
You can find the first substring (by character index) with this function in your code. You can also find what comes after a substring.
def FindSubString(strText, strSubString, Offset=None):
    try:
        Start = strText.find(strSubString)
        if Start == -1:
            return -1 # Not Found
        else:
            if Offset is None:
                # No offset: return everything after the substring
                Result = strText[Start+len(strSubString):]
            elif Offset == 0:
                # Offset 0: return the index where the substring starts
                return Start
            else:
                # Otherwise: return Offset characters after the substring
                AfterSubString = Start+len(strSubString)
                Result = strText[AfterSubString:AfterSubString + int(Offset)]
            return Result
    except Exception:
        return -1
# Example:
Text = "Thanks for contributing an answer to Stack Overflow!"
subText = "to"
print("Start of first substring in a text:")
start = FindSubString(Text, subText, 0)
print(start); print("")
print("Exact substring in a text:")
print(Text[start:start+len(subText)]); print("")
print("What is after substring \"%s\"?" %(subText))
print(FindSubString(Text, subText))
# Your answer:
Text = "gfgfdAAA1234ZZZuijjk"
subText1 = "AAA"
subText2 = "ZZZ"
AfterText1 = FindSubString(Text, subText1, 0) + len(subText1)
BeforText2 = FindSubString(Text, subText2, 0)
print("\nYour answer:\n%s" %(Text[AfterText1:BeforText2]))
Using PyParsing
import pyparsing as pp
word = pp.Word(pp.alphanums)
s = 'gfgfdAAA1234ZZZuijjk'
rule = pp.nestedExpr('AAA', 'ZZZ')
for match in rule.searchString(s):
    print(match)
which yields:
[['1234']]
A one-liner for Python 3.8, if text is guaranteed to contain the substring:
text[text.find(start:='AAA')+len(start):text.find('ZZZ')]