Muh*_*ran 77 python beautifulsoup
I installed BeautifulSoup using easy_install and am trying to run the following script:
from BeautifulSoup import BeautifulSoup
import re
doc = ['<html><head><title>Page title</title></head>',
'<body><p id="firstpara" align="center">This is paragraph <b>one</b>.',
'<p id="secondpara" align="blah">This is paragraph <b>two</b>.',
'</html>']
soup = BeautifulSoup(''.join(doc))
print soup.prettify()
But it fails with the error below, and I'm not sure why:
Traceback (most recent call last):
File "C:\Python27\reading and writing xml file from web1.py", line 49, in <module>
from BeautifulSoup import BeautifulSoup
ImportError: No module named BeautifulSoup
Could you please help? Thanks.
小智 212
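Try this: from bs4 import BeautifulSoup
This is probably because of Beautiful Soup 4, which was in beta at the time and is imported from the bs4 package rather than from BeautifulSoup. I just read this on the homepage.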
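To see which of the two module names is actually importable on a given machine, a small check like the following may help. It is only a sketch for Python 3: the helper name installed_soup_module is made up for this example, and it only looks the modules up without importing them.

```python
import importlib.util


def installed_soup_module():
    """Return the first importable Beautiful Soup module name, or None.

    "bs4" is the module name used by Beautiful Soup 4, while
    "BeautifulSoup" is the module name used by Beautiful Soup 3.
    """
    for name in ("bs4", "BeautifulSoup"):
        # find_spec only searches for the module; it does not import it.
        if importlib.util.find_spec(name) is not None:
            return name
    return None


found = installed_soup_module()
if found is None:
    print("Beautiful Soup is not installed (try: pip install beautifulsoup4)")
else:
    print("Importable module:", found)
```

If this reports bs4, then `from bs4 import BeautifulSoup` is the import to use; an ImportError for BeautifulSoup in that case just means the old version 3 module is not present.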
Cau*_*ons 20
On Ubuntu 14.04 I installed it with apt-get and it works fine:
sudo apt-get install python-beautifulsoup
Then just do:
from BeautifulSoup import BeautifulSoup
小智 8
Try this; it worked for me. To get the data for any other tag, just replace "a" with the tag you want.
from bs4 import BeautifulSoup as bs
import urllib
url="http://currentaffairs.gktoday.in/month/current-affairs-january-2015"
soup = bs(urllib.urlopen(url))
for link in soup.findAll('a'):
    print link.string
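The same idea can be tried without a network connection. The sketch below assumes Python 3 with the beautifulsoup4 package installed; the sample markup and the helper name tag_texts are invented for illustration, and find_all is the Beautiful Soup 4 spelling of findAll.

```python
from bs4 import BeautifulSoup

# Hypothetical sample markup, used here instead of downloading a page.
SAMPLE_HTML = (
    '<html><body>'
    '<a href="/one">First</a> <a href="/two">Second</a>'
    '<p>This is paragraph <b>one</b>.</p>'
    '</body></html>'
)


def tag_texts(markup, tag_name):
    """Return the text of every <tag_name> element found in the markup."""
    soup = BeautifulSoup(markup, "html.parser")
    return [tag.get_text() for tag in soup.find_all(tag_name)]


print(tag_texts(SAMPLE_HTML, "a"))
print(tag_texts(SAMPLE_HTML, "b"))
```

Swapping "a" for another tag name is the only change needed to collect a different kind of element.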
小智 6
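You can import from bs4 instead of BeautifulSoup. Note that bs4 is not part of the standard library: it is the module provided by the beautifulsoup4 package, so it has to be installed first (for example with pip install beautifulsoup4).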
from bs4 import BeautifulSoup
import re
doc = ['<html><head><title>Page title</title></head>',
'<body><p id="firstpara" align="center">This is paragraph <b>one</b>.',
'<p id="secondpara" align="blah">This is paragraph <b>two</b>.',
'</html>']
soup = BeautifulSoup(''.join(doc))
print soup.prettify()
If you need to make HTTP requests, you can use either the urllib module or the requests module. I personally recommend requests over urllib. Install it with:
$ pip install requests
Here is how to use the requests module:
import requests as rq
res = rq.get('http://www.example.com')
print(res.content)
print(res.status_code)
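To connect this back to the original question, the downloaded HTML can be handed straight to Beautiful Soup. This is a minimal sketch assuming Python 3 with both requests and beautifulsoup4 installed; the function names title_from_html and fetch_title and the 10-second timeout are choices made for this example.

```python
import requests
from bs4 import BeautifulSoup


def title_from_html(markup):
    """Return the text of the <title> element, or None if there is none."""
    soup = BeautifulSoup(markup, "html.parser")
    return soup.title.string if soup.title else None


def fetch_title(url, timeout=10):
    """Download a page with requests and return its title text."""
    response = requests.get(url, timeout=timeout)
    response.raise_for_status()  # raises requests.HTTPError on 4xx/5xx
    return title_from_html(response.text)


if __name__ == "__main__":
    try:
        print(fetch_title("http://www.example.com"))
    except requests.RequestException as error:
        print("Request failed:", error)
```

Keeping the parsing step in its own function means it can be checked with a plain string, without any network access.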
Views: 92372