Posts by Noo*_*der

Unable to scrape with BeautifulSoup

I am trying to scrape images and news URLs from this website. The tags I defined are:

root_tag=["div", {"class":"ngp_col ngp_col-bottom-gutter-2 ngp_col-md-6 ngp_col-lg-4"}]
image_tag=["div",{"class":"low-rez-image"},"url"]
news_tag=["a",{"":""},"href"]

Here url is the site's URL, and the code I use to scrape the site is:

import requests
from bs4 import BeautifulSoup

ua1 = 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'
ua2 = 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_9_5) AppleWebKit 537.36 (KHTML, like Gecko) Chrome'
headers = {'User-Agent': ua2,
           'Accept': 'text/html,application/xhtml+xml,application/xml;'
                     'q=0.9,image/webp,*/*;q=0.8'}
session = requests.Session()
response = session.get(url, headers=headers)
webContent = response.content
bs = BeautifulSoup(webContent, 'lxml')
# Collect every container div that matches the root tag spec
all_tab_data = bs.find_all(root_tag[0], root_tag[1])

result=[]
for div in all_tab_data:
    try:
        news_url=None
        news_url = div.find(news_tag[0], news_tag[1]).get(news_tag[2])
        
    except Exception as e:
        news_url= None
    
    try: …
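The extraction loop above could be sketched more compactly with an explicit None check instead of a broad try/except, so that a misspelled variable name surfaces as an error rather than being silently swallowed. This is only a sketch under assumptions: SAMPLE_HTML, the "card" class, the empty attribute filter in news_tag, and the helper names extract_attr and scrape are hypothetical and stand in for the real page and code, which the post does not show.

```python
from bs4 import BeautifulSoup

# Hypothetical markup standing in for the real page (not shown in the post).
SAMPLE_HTML = """
<div class="card"><a href="/news/1">One</a><div class="low-rez-image" url="/img/1.jpg"></div></div>
<div class="card"><span>No link here</span></div>
"""

# Tag specs in the same [name, attrs, attribute] shape as the post;
# the "card" class and the empty attrs dict are assumptions for this sample.
root_tag = ["div", {"class": "card"}]
image_tag = ["div", {"class": "low-rez-image"}, "url"]
news_tag = ["a", {}, "href"]


def extract_attr(node, spec):
    """Return the attribute named by spec[2] from the first match, or None."""
    name, attrs, attr_name = spec
    found = node.find(name, attrs)
    return found.get(attr_name) if found is not None else None


def scrape(html):
    """Build one dict per root container found in the given HTML string."""
    soup = BeautifulSoup(html, "html.parser")
    rows = []
    for div in soup.find_all(root_tag[0], root_tag[1]):
        rows.append({
            "news_url": extract_attr(div, news_tag),
            "image_url": extract_attr(div, image_tag),
        })
    return rows


result = scrape(SAMPLE_HTML)
print(result)
```

If the real page builds its content with JavaScript, the HTML returned by requests may not contain these containers at all, in which case find_all returns an empty list regardless of how the tags are defined.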

python beautifulsoup web-scraping

2
Score
1
Answers
76
Views

Error: 'list' object is not callable in map() function

def powerof(num):
    return num**2

number = [1,2,3,4,5,6,7,8]
s = list(map( powerof , number))
print(s)

Error: 'list' object is not callable
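The snippet as posted runs cleanly on its own, so the error most likely comes from code that is not shown. One common cause, assumed here rather than confirmed by the post, is an earlier assignment that rebinds the name list to a list object. The sketch below reproduces that situation inside a function (reproduce_error and squares are hypothetical helper names) and contrasts it with a call where the builtin is intact.

```python
def powerof(num):
    return num ** 2


number = [1, 2, 3, 4, 5, 6, 7, 8]


def reproduce_error(values):
    """Shadow the builtin `list` locally and return the resulting error text."""
    list = [0]  # assumed culprit: a variable named `list` hides the builtin
    try:
        return list(map(powerof, values))
    except TypeError as exc:
        return str(exc)


def squares(values):
    """Same call with the builtin `list` left untouched."""
    return list(map(powerof, values))


print(reproduce_error(number))
print(squares(number))
```

In an interactive session or notebook, removing the shadowing name with del list (or restarting the interpreter) makes the builtin visible again. Choosing a different variable name avoids the problem altogether.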

python list typeerror map-function

1
Score
1
Answers
10k
Views