Mik*_*ron 60 python performance
对于一次性字符串搜索,使用str.find/rfind比使用re.match/search更快吗?
也就是说,对于给定的字符串s,我应该使用:
if s.find('lookforme') > -1:
do something
Run Code Online (Sandbox Code Playgroud)
要么
if re.match('lookforme',s):
do something else
Run Code Online (Sandbox Code Playgroud)
?
use*_*312 136
问题:哪个更快,最好通过使用来回答timeit.
from timeit import timeit
import re
def find(string, text):
if string.find(text) > -1:
pass
def re_find(string, text):
if re.match(text, string):
pass
def best_find(string, text):
if text in string:
pass
print timeit("find(string, text)", "from __main__ import find; string='lookforme'; text='look'")
print timeit("re_find(string, text)", "from __main__ import re_find; string='lookforme'; text='look'")
print timeit("best_find(string, text)", "from __main__ import best_find; string='lookforme'; text='look'")
Run Code Online (Sandbox Code Playgroud)
输出是:
0.441393852234
2.12302494049
0.251421928406
Run Code Online (Sandbox Code Playgroud)
因此,您不仅应该使用in运算符,因为它更容易阅读,但因为它也更快.
Joc*_*zel 16
用这个:
if 'lookforme' in s:
do something
Run Code Online (Sandbox Code Playgroud)
需要首先编译正则表达式,这会增加一些开销.无论如何,Python的普通字符串搜索非常有效.
如果你经常搜索相同的术语,或者你做一些更复杂的事情,那么正则表达式会变得更有用.
小智 12
也许有人仍然感兴趣。\n给出的答案看起来不错,但只查看一个非常短的字符串。\n事实上,如果您采用一个长字符串并且您正在寻找的模式大致位于末尾,那么性能会发生变化,有利于正则表达式!
\n\nimport re\n\ndef find(string, text):\n if string.find(text) > -1:\n pass\n\ndef re_find(string, text):\n if re.match(text, string):\n pass\n\ndef best_find(string, text):\n if text in string:\n pass\n\nvery_long_string = \'sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd sasgda;dlaskjgasdlj sadlf;jsaf lkjvasdfa dsadkfldsfhsa svnsa;df adsfkj;ljkasdf asdf;lkjafd\'\npattern = \'look\'\nprint(\'pattern at the end of string\')\nprint(\'find:\', end=\' \')\n%timeit find(very_long_string + pattern, pattern)\nprint(\'regex:\', end=\' \')\n%timeit re_find(very_long_string + pattern, pattern)\nprint(\'in:\', end=\' \')\n%timeit best_find(very_long_string + pattern, pattern)\nprint(\'pattern in front of string\')\nprint(\'find:\', end=\' \')\n%timeit find(pattern + very_long_string, pattern)\nprint(\'regex:\', end=\' \')\n%timeit re_find(pattern + very_long_string, pattern)\nprint(\'in:\', end=\' \')\n%timeit best_find(pattern + very_long_string, pattern)\nRun Code Online (Sandbox Code Playgroud)\n\n给出输出:
\n\npattern at the end of string\nfind: 3.41 \xc2\xb5s \xc2\xb1 74.2 ns per loop (mean \xc2\xb1 std. dev. of 7 runs, 100000 loops each)\nregex: 1.93 \xc2\xb5s \xc2\xb1 23.8 ns per loop (mean \xc2\xb1 std. dev. of 7 runs, 1000000 loops each)\nin: 3.32 \xc2\xb5s \xc2\xb1 74.6 ns per loop (mean \xc2\xb1 std. dev. of 7 runs, 100000 loops each)\npattern in front of string\nfind: 748 ns \xc2\xb1 15.6 ns per loop (mean \xc2\xb1 std. dev. of 7 runs, 1000000 loops each)\nregex: 2.03 \xc2\xb5s \xc2\xb1 21.1 ns per loop (mean \xc2\xb1 std. dev. of 7 runs, 100000 loops each)\nin: 589 ns \xc2\xb1 6.75 ns per loop (mean \xc2\xb1 std. dev. of 7 runs, 1000000 loops each)\nRun Code Online (Sandbox Code Playgroud)\n\n摘要:findandin取决于字符串长度和模式在字符串中的位置,而在regex某种程度上与字符串长度无关,并且对于末尾带有模式的非常长的字符串来说速度更快。
只是为了完成有关正则表达式编译时间的最高投票答案问题,这里有一个带有预编译模式的版本:
from timeit import timeit
import re
def find(string, text):
if string.find(text) > -1:
pass
def re_find(string, text_re):
if text_re.match(string):
pass
def best_find(string, text):
if text in string:
pass
print timeit("find(string, text)", "from __main__ import find; string='lookforme'; text='look'")
print timeit("re_find(string, text_re)", "from __main__ import re_find; string='lookforme'; import re; text_re=re.compile('look')")
print timeit("best_find(string, text)", "from __main__ import best_find; string='lookforme'; text='look'")
Run Code Online (Sandbox Code Playgroud)
还有我的数字:
0.189274072647
0.239935874939
0.0820939540863
Run Code Online (Sandbox Code Playgroud)
预编译模式提高了数字,但仍然in更快。
小智 5
我遇到了同样的问题.我使用Jupyter的%timeit来检查:
import re
sent = "a sentence for measuring a find function"
sent_list = sent.split()
print("x in sentence")
%timeit "function" in sent
print("x in token list")
%timeit "function" in sent_list
print("regex search")
%timeit bool(re.match(".*function.*", sent))
print("compiled regex search")
regex = re.compile(".*function.*")
%timeit bool(regex.match(sent))
Run Code Online (Sandbox Code Playgroud)
句子中的x为每循环61.3 ns±3 ns(平均值±标准偏差,7次运行,每次10000000次循环)
令牌列表中的x每循环93.3 ns±1.26 ns(平均值±标准偏差,7次运行,每次10000000次循环)
正则表达式搜索每循环772 ns±8.42 ns(平均值±标准偏差,7次运行,每次1000000次循环)
编译正则表达式搜索每循环420 ns±7.68 ns(平均值±标准偏差,7次运行,每次1000000次循环)
编译速度很快但是简单的更好.