Duncan 16
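It doesn't split the string directly, but the re module has re.finditer() (and a corresponding finditer() method on any compiled regular expression).
@Zero asked for an example: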
>>> import re
>>> s = "The quick    brown\nfox"
>>> for m in re.finditer(r'\S+', s):
...     print(m.span(), m.group(0))
...
(0, 3) The
(4, 9) quick
(13, 18) brown
(19, 22) fox
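As a minimal sketch of the compiled-pattern form mentioned above, the following wraps a precompiled pattern's finditer() method in a generator, so tokens are produced one at a time. The names WORD and iter_tokens are hypothetical and chosen only for illustration. Only re.compile() and the pattern's finditer() method come from the standard library.

```python
import re

# Hypothetical names for illustration: WORD and iter_tokens are not part of the answer.
WORD = re.compile(r'\S+')  # one or more non-whitespace characters

def iter_tokens(text):
    """Lazily yield (start, end, token) for each whitespace-delimited token."""
    for m in WORD.finditer(text):
        yield m.start(), m.end(), m.group(0)

tokens = list(iter_tokens("The quick    brown\nfox"))
print(tokens)
```

Compiling once is mostly a convenience when the same pattern is reused across many strings. For a one-off scan, re.finditer() as shown above does the same job.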
Like S.Lott, I'm not quite sure what you want. Here is some code that may be useful:
s = "This is a string."
for character in s:
    print(character)
for word in s.split(' '):
    print(word)
There are also s.index() and s.find() for finding the next occurrence of a character.
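A short sketch of how the two differ, in case it helps: both accept an optional start offset, so a search can resume after the previous hit, but they report a miss differently. The variable names here are only illustrative.

```python
s = "This is a string."

# Both methods take an optional start offset, so a search can resume
# just past the previous match.
first = s.find(' ')               # position of the first space
second = s.find(' ', first + 1)   # position of the next space after it

# find() reports "not found" by returning -1 ...
missing = s.find('z')

# ... while index() raises ValueError instead.
try:
    s.index('z')
    raised = False
except ValueError:
    raised = True

print(first, second, missing, raised)
```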
Later: OK, something like this.
>>> def tokenizer(s, c):
...     i = 0
...     while True:
...         try:
...             j = s.index(c, i)  # position of the next separator
...         except ValueError:     # no separator left: emit the tail and stop
...             yield s[i:]
...             return
...         yield s[i:j]
...         i = j + 1
...
>>> for w in tokenizer(s, ' '):
...     print(w)
...
This
is
a
string.
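For comparison, here is a sketch of the same idea written with s.find() instead of s.index(), which swaps the try/except for a check against -1. The name tokenizer_find is hypothetical. Like the tokenizer above, it assumes c is a single character, because the scan resumes at j + 1.

```python
def tokenizer_find(s, c):
    """Lazily yield the pieces of s separated by the single character c."""
    i = 0
    while True:
        j = s.find(c, i)   # -1 when no further separator exists
        if j == -1:
            yield s[i:]    # emit the tail and stop
            return
        yield s[i:j]
        i = j + 1          # assumes len(c) == 1

s = "This is a string."
pieces = list(tokenizer_find(s, ' '))
print(pieces)
```

On these inputs it yields the same pieces as s.split(' '), including empty strings between adjacent separators. The difference is that the pieces are produced one at a time rather than as a full list.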