Rah*_*jan 5 html python beautifulsoup
Extract numbers together with the words before and after them:
import re
s = 'Consumer spending in the US rose to about 62% of GDP in 1960, where it stayed until about 1981, and has since risen to 71% in 2013'
q = re.findall(r'^([^\d]+)\s(\d+)\s*,\s*([^\d]+)\s(\d+)', s)
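It gives a list of all the words and numbers in the given string. What I want now is a way to get each number along with the words next to it.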
Based on your description, I guess you need something like this:
>>> import re
>>> strs = 'Consumer spending in the US rose to about 62% of GDP in 1960, where it stayed until about 1981, and has since risen to 71% in 2013'
>>> re.findall(r'\w+\s\d+.*?\s\w+',strs)
['about 62% of', 'in 1960, where', 'about 1981, and', 'to 71% in']
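Note that the pattern above consumes the word after each number, so once 'to 71% in' has matched, the final 'in 2013' is no longer available and is not reported. If that trailing number matters, one possible variation is to look at the following word with a lookahead instead of consuming it. The sketch below assumes that "word" means a run of ASCII letters and that a missing following word should be reported as None; the helper name number_contexts is made up for illustration.

```python
import re

# Word before, the number, then a lookahead that peeks at the next word
# without consuming it, so that word can also start the next match.
PATTERN = re.compile(r'([A-Za-z]+)\s+(\d+)(?=(?:\W*\s+([A-Za-z]+))?)')

def number_contexts(text):
    """Return (word_before, number, word_after) tuples; word_after may be None."""
    return [m.groups() for m in PATTERN.finditer(text)]

strs = ('Consumer spending in the US rose to about 62% of GDP in 1960, '
        'where it stayed until about 1981, and has since risen to 71% in 2013')

for before, number, after in number_contexts(strs):
    print(before, number, after)
```

Because the following word sits inside the lookahead, it is captured but not consumed, and the optional group lets a number at the very end of the string still match.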