Posts by cra*_*rax

How do I convert a string to a set in Python?

I have a string:

sample = "http://www.stackoverflow.com"


I want to convert this string into a set containing the whole string:

final = {"http://www.stackoverflow.com"}


I tried the following code:

final = set(sample)


But I got the wrong result:

{u'.', u'/', u':', u'a', u'b', u'c', u'e', u'h', u'i', u'k', u'l', u'n', u'p', u's', u't', u'w'}


I also tried:

final = ast.literal_eval(sample)


I got:

SyntaxError: invalid syntax


Is there another solution?
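A minimal sketch of one possible approach, assuming the goal is a one-element set holding the entire URL. `set()` iterates over its argument, and iterating over a string yields its characters, which is why the first attempt produced a set of single characters. `ast.literal_eval` fails because a bare URL is not a valid Python literal. The names `chars` and `also_final` are introduced here for illustration only, and the snippet is written for Python 3.

```python
sample = "http://www.stackoverflow.com"

# set(sample) iterates over the string, so every character becomes an element.
chars = set(sample)

# A set literal treats the whole string as a single element.
final = {sample}

# Equivalent: wrap the string in a list before passing it to set().
also_final = set([sample])

print(final)
```

Either form keeps the string intact; the set literal is the shorter of the two.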

python

cra*_*rax

lucky-day

5
Score

3
Answers

10k
Views

How do I extract the text between <h1> and </h1> in Python?

I'm stuck trying to extract the text between <h1> and </h1>.

Please help me.

My code is:

import bs4
import re
import urllib2

url2='http://www.flipkart.com/mobiles/pr?sid=tyy,4io&otracker=ch_vn_mobile_filter_Top%20Brands_All#jumpTo=0|20'
htmlf = urllib2.urlopen(url2)
soup = bs4.BeautifulSoup(htmlf)
#res=soup.findAll('div',attrs={'class':'product-unit'})
for res in soup.findAll('a',attrs={'class':'fk-display-block'}):
    suburl='http://www.flipkart.com/'+res.get('href')
    subhtml = urllib2.urlopen(suburl)
    subhtml = subhtml.read()
    subhtml = re.sub(r'\s\s+','',subhtml)
    subsoup=bs4.BeautifulSoup(subhtml)
    res2=subsoup.find('h1',attrs={'itemprop':'name'})
    if res2:
        print res2


Output:

<h1 itemprop="name">Moto G</h1>
<h1 itemprop="name">Moto E</h1>
<h1 itemprop="name">Moto E</h1>


But I want this:

Moto G
Moto E
Moto E

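A hedged sketch of how the tag text might be obtained, assuming BeautifulSoup 4 and Python 3. The `html` string below is a stand-in invented for illustration, not the real page content, and `names` is a hypothetical helper list. The idea is to call `get_text()` on the tag rather than printing the tag object itself.

```python
import bs4

# Stand-in HTML for illustration; the real pages come from the URLs in the question.
html = '<h1 itemprop="name">Moto G</h1><h1 itemprop="name">Moto E</h1>'

soup = bs4.BeautifulSoup(html, "html.parser")

names = []
for tag in soup.find_all("h1", attrs={"itemprop": "name"}):
    # get_text() returns the text inside the tag, without the markup.
    names.append(tag.get_text(strip=True))

for name in names:
    print(name)
```

In the loop from the question, this would correspond to printing `res2.get_text()` instead of `res2`.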

html python tags extract beautifulsoup

cra*_*rax

2014-08-26

3
Score

1
Answer

3892
Views

Tag statistics

python ×2

beautifulsoup ×1

extract ×1

html ×1

tags ×1
